[PROJECT IDEA] Flattened document-link #26
Comments
Hi @stevieflow, there are 1,966,749 document-links published across all activities, of which 392,533 are unique URLs, so this is a massive query. This started from me wanting to find out if this was possible via dquery, so hope it's ok to share! Getting a list of flattened document-links is expensive but doable. Here are the first 100 results including iati-activity elements. We found that scaling this query doesn't work: downloads time out as there's too much data to comb through. Here are the first 10,000 results for a much simpler document-link URL query. There are probably much more efficient ways to write the query code, but this was a first pass :)
So we looked at optimising the query and this seems to work. Here are the first 100,000 activities. Again, you don't have to run the query, just click on 'Download XSON' to download the data in various formats.
Rationale
There are thousands of document-links available via IATI data. These all have a range of metadata (both via the document-link directly, but also the parent iati-activity), which might be useful for people to access, for a variety of reasons.
Proposal
A service (perhaps built off Classic + iati-flattener: iati-data-access/iati-flattener#1?) to output a list of document-link items. This would pull in relevant data from the specific document-link element, and also from parent elements such as reporting-org (name; ref; type); recipient-country; sector(s); activity-status.
Users could query this list and get it in spreadsheet, JSON and XML formats.
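To make the proposal a bit more concrete, here is a rough sketch of what one flattened row per document-link could look like. It is not how Classic, iati-flattener or dquery work; it only uses the Python standard library on a made-up sample file. The function names, output column names and the sample identifiers are all assumptions for illustration. The element and attribute names follow the IATI 2.x activity schema.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Made-up sample data (hypothetical identifiers and URLs).
SAMPLE_XML = """<iati-activities version="2.03">
  <iati-activity>
    <iati-identifier>XM-EXAMPLE-1</iati-identifier>
    <reporting-org ref="XM-EXAMPLE" type="21">
      <narrative>Example Org</narrative>
    </reporting-org>
    <activity-status code="2"/>
    <recipient-country code="KE"/>
    <sector code="11110"/>
    <sector code="12220"/>
    <document-link url="https://example.org/report.pdf" format="application/pdf">
      <title><narrative>Annual report</narrative></title>
      <category code="A01"/>
    </document-link>
    <document-link url="https://example.org/budget.xlsx" format="application/vnd.ms-excel">
      <title><narrative>Budget</narrative></title>
      <category code="A05"/>
    </document-link>
  </iati-activity>
</iati-activities>"""


def _text(element, path):
    """Return the stripped text at `path`, or '' if the element is missing."""
    found = element.find(path)
    return (found.text or "").strip() if found is not None else ""


def _codes(element, tag):
    """Join the @code attribute of repeated child elements with ';'."""
    return ";".join(e.get("code", "") for e in element.findall(tag))


def flatten_document_links(xml_text):
    """Return one dict per activity-level document-link, with parent metadata."""
    root = ET.fromstring(xml_text)
    rows = []
    for activity in root.iter("iati-activity"):
        org = activity.find("reporting-org")
        status = activity.find("activity-status")
        parent = {
            "iati_identifier": _text(activity, "iati-identifier"),
            "reporting_org_ref": org.get("ref", "") if org is not None else "",
            "reporting_org_type": org.get("type", "") if org is not None else "",
            "reporting_org_name": _text(activity, "reporting-org/narrative"),
            "activity_status": status.get("code", "") if status is not None else "",
            "recipient_countries": _codes(activity, "recipient-country"),
            "sector_codes": _codes(activity, "sector"),
        }
        # findall() only matches direct children, so document-links nested
        # deeper in the activity (e.g. under result) are not included here.
        for link in activity.findall("document-link"):
            rows.append({
                **parent,
                "url": link.get("url", ""),
                "format": link.get("format", ""),
                "title": _text(link, "title/narrative"),
                "category_codes": _codes(link, "category"),
            })
    return rows


def to_csv(rows):
    """Serialise the flattened rows as CSV text (spreadsheet-friendly)."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    flat = flatten_document_links(SAMPLE_XML)
    print(to_csv(flat))
    print(json.dumps(flat, indent=2))
```

Repeated elements (sectors, countries, categories) are joined with ';' so each document-link stays on a single row, which is a design choice for spreadsheet use rather than anything the standard requires. De-duplicating by URL would be a separate step on top of these rows.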