Get your own copy of wikidata #584
Replies: 4 comments 2 replies
-
Here is a formatted version of the CSV file with the current state of the script:
-
@WolfgangFahl I'm really interested in this work, thank you. How would you tackle incremental updates after the initial load?
-
@bonelli Thx for getting in touch. QLever is officially on the Wikidata short list; see the discussion in https://www.wikidata.org/wiki/Category:WDQS_backend_update and the details in https://www.wikidata.org/wiki/Wikidata:SPARQL_query_service/WDQS_backend_update/WDQS_backend_alternatives . The roadmap for QLever is something I'd be interested in myself; Hannah Bast might be able to shed some light on that question. As far as I understand, the incremental update feature is part of that roadmap.
-
@bonelli There is a page on the QLever Wiki describing how we will implement SPARQL 1.1 Updates: https://github.com/ad-freiburg/qlever/wiki/QLever-support-for-SPARQL-1.1-Update . There are also other pages on the Wiki describing the current status of QLever, and I am continuously adding more. If you have a specific request and it's urgent for you, please let us know. Maybe we can implement it quickly, or help you implement it yourself and create a pull request. Several small features that are currently still missing are relatively easy to implement.

I have a question concerning your update remark. QLever can load/index data very fast: a complete index for the complete Wikidata can be built in 14 hours on a standard PC (AMD Ryzen 9 5900X, 128 GB RAM, 2 TB HDD, not even an SSD), and even faster if you have better hardware. That is, you can easily rebuild the index every day. This is much faster than Wikidata provides new data dumps on https://dumps.wikimedia.org/wikidatawiki, which is every 10 days or so, and in our experience that is good enough for almost all users. If that's not good enough for you, we would like to understand why.

We will eventually implement SPARQL 1.1 Update, as described on the Wiki, probably some time this year, but it will not happen in the next few weeks.
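Once SPARQL 1.1 Update support lands, clients would talk to it through the standard SPARQL 1.1 Protocol, where an update is POSTed as an `update` form parameter. A minimal sketch of what that could look like (the endpoint URL is hypothetical, the prefixes are illustrative, and QLever's update support is still on the roadmap, so this shows the protocol rather than a working call):

```python
# Sketch: building a SPARQL 1.1 INSERT DATA update and sending it via the
# standard SPARQL 1.1 Protocol. The endpoint URL is a hypothetical placeholder;
# QLever does not support updates yet, as discussed above.
import requests

def build_insert_data(triples):
    """Build an INSERT DATA update string from (subject, predicate, object) terms."""
    body = " .\n    ".join(f"{s} {p} {o}" for s, p, o in triples)
    return f"INSERT DATA {{\n    {body} .\n}}"

update = build_insert_data([
    ("<http://example.org/s>", "<http://example.org/p>", '"an example value"'),
])
print(update)

# Per the SPARQL 1.1 Protocol, the update goes in a POST form field named
# 'update' (endpoint URL here is purely hypothetical):
# requests.post("http://localhost:7001/update", data={"update": update})
```

Until then, as noted above, periodically rebuilding the full index from the latest dump is the practical alternative.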
-
https://wiki.bitplan.com/index.php/WikiData_Import_2022-01-29 now has a report of the successful indexing.
Thx for making this happen. I'll now try to run an endpoint on this dataset and see which queries work.
Here is a first draft of a script to analyze the log file with a spreadsheet. I had to use Excel this time since LibreOffice wouldn't work as expected on my Mac, so the formulas might not be compatible with LibreOffice.
This script could be extended to do ETA calculations, and it still needs some love for the million-triples-per-hour (mill/h) calculation per phase and in total.
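The missing rate and ETA arithmetic could also be done outside a spreadsheet. A minimal sketch in Python, assuming a hypothetical log format of `<ISO timestamp> <phase> <cumulative triples>` per line (the real QLever index log format may differ, and the total triple count is a rough assumption):

```python
# Sketch: computing million-triples-per-hour (mill/h) and an ETA from indexing
# log lines. The log format and the sample numbers below are hypothetical
# stand-ins for the real QLever index log.
from datetime import datetime

LOG_LINES = [
    "2022-01-29T00:00:00 parse 0",
    "2022-01-29T02:00:00 parse 3000000000",
    "2022-01-29T04:00:00 parse 6000000000",
]
TOTAL_TRIPLES = 17_000_000_000  # rough order of magnitude for Wikidata; an assumption

def parse_line(line):
    """Split one log line into (timestamp, phase, cumulative triple count)."""
    ts, phase, count = line.split()
    return datetime.fromisoformat(ts), phase, int(count)

def rate_and_eta(lines, total):
    """Return (mill/h over the logged span, hours remaining at that rate)."""
    (t0, _, c0), (t1, _, c1) = parse_line(lines[0]), parse_line(lines[-1])
    hours = (t1 - t0).total_seconds() / 3600
    mill_per_h = (c1 - c0) / hours / 1e6
    eta_hours = (total - c1) / (mill_per_h * 1e6)
    return mill_per_h, eta_hours

rate, eta = rate_and_eta(LOG_LINES, TOTAL_TRIPLES)
print(f"{rate:.0f} mill/h, ETA {eta:.1f} h")  # → 1500 mill/h, ETA 7.3 h
```

Grouping the lines by the phase field before calling `rate_and_eta` would give the per-phase figures the spreadsheet version is still missing.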