Progress when running a query. #260
Comments
Dev Note:
More worried about the count operation ... hopefully it won't be just as expensive as the actual retrieval in extreme cases. @akmiller01
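As one possible way around that worry (not something proposed in this thread): Postgres can hand back the planner's row estimate without executing the query, which stays cheap even when a real `COUNT(*)` would not. A minimal sketch, assuming a Postgres backend reached via psycopg2; the connection string and table name are hypothetical:

```python
import re
import psycopg2  # assumption: DDW runs on Postgres; the driver is not confirmed in this thread

conn = psycopg2.connect("dbname=ddw")  # hypothetical connection details
cur = conn.cursor()

def approximate_count(sql):
    """Read the planner's row estimate instead of running a full COUNT(*).

    EXPLAIN never executes the query, so this returns quickly even when a
    real count would scan the whole table. The estimate can be stale until
    autovacuum/ANALYZE refreshes the table statistics.
    """
    cur.execute("EXPLAIN " + sql)
    top_line = cur.fetchone()[0]  # e.g. "Seq Scan on activity  (cost=... rows=60 width=...)"
    return int(re.search(r"rows=(\d+)", top_line).group(1))
```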
https://ddw.devinit.org/queries/build/702/ This selects 1 activity.
Created a copy of it ... it took 3 secs to save, 45 secs to load the first 10 rows, and 1 min 24 secs to export ... For a query with only 60 rows, that's really terrible performance ... I don't think anything on the frontend can help here, not by much ... This one took about 3 minutes to save. @wakibi @akmiller01 ideas?
Possible yes to both questions @bill-anderson; the debate will be more about the trade-offs between performance and the extra time spent waiting for the two to run prior to running the actual query. @akmiller01 your thoughts?
I'm not actually sure about the feasibility of a progress bar. The biggest component of the time is the SQL query running, and I don't think it's possible to assess its progress. While doing some tests yesterday, I actually found it's faster to preview a query that returns lots of results than a query that returns fewer than 10 rows. I believe this is because if you say "SELECT ... LIMIT 10", SQL will quickly access the first 10 rows and print them. But if you say "SELECT ... WHERE xxx LIMIT 10" and the query only ever returns 9 rows, it will actively scan through the entire table to the end (it doesn't stop itself early the way the unfiltered query does).
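A minimal sketch of the contrast being described, assuming a Postgres backend, psycopg2, and a hypothetical `activity` table with an unindexed `iati_identifier` column (none of these names are confirmed by the thread):

```python
import time
import psycopg2

conn = psycopg2.connect("dbname=ddw")  # hypothetical connection details
cur = conn.cursor()

def timed(sql):
    """Run a query and report wall-clock time, to reproduce the preview test."""
    start = time.monotonic()
    cur.execute(sql)
    rows = cur.fetchall()
    print(f"{len(rows):>4} rows in {time.monotonic() - start:6.2f}s :: {sql}")

# Unfiltered: Postgres can return the first 10 rows it touches and stop early.
timed("SELECT * FROM activity LIMIT 10")

# Filtered on an unindexed column, matching fewer rows than the LIMIT: Postgres
# must scan the whole table before it knows no more matches exist, so this can
# be slower than the unfiltered preview despite returning fewer rows.
timed("SELECT * FROM activity WHERE iati_identifier = 'XM-EXAMPLE-1' LIMIT 10")
```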
@akmiller01 it's possible to get an approximation of how long a query will take.
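One way this could be done in Postgres (an assumption; the thread doesn't say which mechanism was meant) is to read the planner's cost estimate from `EXPLAIN`, which returns almost instantly because nothing is executed. A sketch:

```python
import re
import psycopg2

conn = psycopg2.connect("dbname=ddw")  # hypothetical connection details
cur = conn.cursor()

def estimated_cost(sql):
    """Return the planner's total-cost estimate without executing the query.

    The units are abstract planner costs, not seconds; they only become a time
    approximation once calibrated against a few queries with known runtimes.
    """
    cur.execute("EXPLAIN " + sql)
    top_line = cur.fetchone()[0]  # e.g. "Seq Scan on activity  (cost=0.00..35811.00 rows=9 ...)"
    return float(re.search(r"cost=\d+\.\d+\.\.(\d+\.\d+)", top_line).group(1))
```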
If it's possible to calculate an approximation of how long a query will take quickly, it might be worth it. But if calculating the approximation itself takes a considerable amount of time, it would be counter-productive. I'll run the iati_identifier indexer tonight, and we can see if that improves load times where the only filter is an iati_identifier. If we can reduce load times overall, we won't need to bother with a progress bar.
It would be good to have a time estimate for how long it would take to do this piece of work before we start on it.
I think we need to understand how indexing might help before doing any work on this.
@akmiller01 @bill-anderson should I proceed with this now that we have done some optimisations and indexing on the DB?
Hi @wakibi, between all the development tasks, indexing got deployed but never implemented. The heavy-lifting index on the IATI identifier (…) And then maybe we can collect some stress-testing stats?
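For reference, an index like the one being discussed could be created roughly like this; a sketch assuming Postgres, with hypothetical table, column, and index names:

```python
import psycopg2

conn = psycopg2.connect("dbname=ddw")  # hypothetical connection details
conn.autocommit = True  # CREATE INDEX CONCURRENTLY cannot run inside a transaction block
cur = conn.cursor()

# CONCURRENTLY builds the index without taking a write lock on the table, at
# the cost of a slower build, which matters if this runs against the live DB.
cur.execute(
    "CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_activity_iati_identifier "
    "ON activity (iati_identifier)"
)
```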
Is someone working on this at the moment? If not, can we move it back into …? @wakibi do you know?
I am working on this; I had forgotten about the second part, showing the total record count. Factoring that in now.
@wakibi which PR does this one link to?
@akmiller01 would you have time to review this so we can deploy?
The BE (backend) of this has been done, but there is no FE (frontend) yet. @edwinmp to look at the FE.
@wakibi I don't think using an API here is going to do the trick, as its run duration is not much different from the duration of the actual query. It also seems to only work for simple queries and returns the error below for more complex ones (likely because it returns no results for the analyse query).

For when it works, I suggest this runtime be calculated when the dataset is saved (asynchronously) and stored in a property of its own. That way, it's always available before the query is run and can be passed to a progress indicator.
I saw the error when I was testing the sub-query feature; it's due to queries that have aggregate functions. I've got a fix for it (but it may have to come via the sub-query feature). About saving: it sounds like a good idea, but the timing would be wrong for when the server is loaded. I will perhaps run it 3 times, take the average, and store that. I think it's the only workaround.
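A minimal sketch of that workaround, assuming psycopg2 and a hypothetical `dataset.estimated_runtime` field; everything named here is illustrative, not the project's actual API:

```python
import statistics
import time
import psycopg2

conn = psycopg2.connect("dbname=ddw")  # hypothetical connection details

def average_runtime(sql, runs=3):
    """Time the query a few times and average, to smooth out server-load spikes."""
    samples = []
    for _ in range(runs):
        with conn.cursor() as cur:
            start = time.monotonic()
            cur.execute(sql)
            cur.fetchall()
            samples.append(time.monotonic() - start)
    return statistics.mean(samples)

# On dataset save, ideally from a background task so saving stays fast:
# dataset.estimated_runtime = average_runtime(dataset.sql)   # hypothetical fields
# dataset.save(update_fields=["estimated_runtime"])
```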