Running on Windows error: downloading with rowLimit 5000+ returns "Error in if (s[length(s)] == "") s <- s[-length(s)]" for particular dates #43
Comments
Thanks, I think this may be an issue where |
That's interesting. I tried the same code as above with This one day range is because the process tries to find missing data in database and download it. It tries to download data by each single day as I cannot be sure missing data will always be in a single period instead of "random" dates. Nevertheless, I tried to download this data from a period to check if it solves the problem: from 2017-03-17 to 2017-03-19
However, it does not return similar number of rows. Quick check with
Today is the last day to check it (2017-03-17 is the oldest date in new Search Console) but as I see in Search Console there is more than 999 rows of data (queries + clicks & impressions) so it seems to be a mistake. I also tried with different |
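The "find missing days, then download each one" step described above can be sketched in base R. This is only an illustration: `find_missing_dates` and the `stored` vector are hypothetical names, not part of searchConsoleR, and the stored dates here are made up.

```r
# Hypothetical helper: list the days in a range that are absent from stored data.
find_missing_dates <- function(stored_dates, from, to) {
  expected <- seq(as.Date(from), as.Date(to), by = "day")
  # setdiff() would drop the Date class, so filter with %in% instead
  expected[!expected %in% as.Date(stored_dates)]
}

# Made-up example: the database already holds two of the three days
stored <- c("2017-03-17", "2017-03-19")
missing <- find_missing_dates(stored, "2017-03-17", "2017-03-19")
print(format(missing))
```

Each element of `missing` could then be passed as both the start and end date of a single-day request.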
Hi @MarkEdmondson1234 - think I'm having the same issue or a similar one as @Leszek-Sieminski-PM had. Hopefully you can find the issue on my side. Code
Console Output
Traceback (same as @Leszek-Sieminski-PM)
Session Info: latest searchConsoleR and googleAuthR from GitHub, latest R version
|
Sorry for closing issue, misclick. @kirchnerto I later discovered that my database misses 46 days of data from last 16 months because of this issue. This problem does not appear if you use Python instead (for example: https://moz.com/blog/how-to-get-search-console-data-api-python) It seems to me that the problem is the googleAuthR helper function. |
@Leszek-Sieminski-PM Thanks for the tip! |
I'll take a look; if I can reproduce it myself, it's a lot easier. I think it's to do with some days not having data, so it needs to fail more gracefully. Does that sound possible? |
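One way "failing more gracefully" could look from the caller's side is a `tryCatch` wrapper that records the days that error instead of stopping the loop. This is a sketch under assumptions: `safe_fetch` and `fake_fetch` are hypothetical names, and `fake_fetch` merely stands in for a real download call. It does not fix the underlying problem; it only makes the failing dates visible.

```r
# Hypothetical wrapper: call `fetch_fun` for one day and return NULL on error
# instead of aborting the whole loop.
safe_fetch <- function(fetch_fun, day) {
  tryCatch(
    fetch_fun(day),
    error = function(e) {
      message("Skipping ", day, ": ", conditionMessage(e))
      NULL
    }
  )
}

# Stand-in for a real download function (an assumption, for illustration only)
fake_fetch <- function(day) {
  if (day == "2017-03-18") stop("simulated failure")
  data.frame(date = day, clicks = 1L)
}

days <- c("2017-03-17", "2017-03-18")
results <- lapply(days, function(d) safe_fetch(fake_fetch, d))

# Days whose download failed, so they can be retried or reported later
failed <- days[vapply(results, is.null, logical(1))]
print(failed)
```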
As I understand this issue, better error handling would be nice, but the real problem is that the data is present and available both in the web interface and through the API (I downloaded it with both PHP and Python just to check), yet R somehow cannot download some dates. |
It looks like it downloads it, but the merge fails. |
The original issue looks like it predates the raise from 5000 to 25000 rows in the API response:
Fetching search analytics for url: 'XXX' dates: 2017-03-17 2017-03-17 dimensions: date device page query dimensionFilterExp: searchType: web aggregationType: auto
Batching data via method: byBatch
With rowLimit set to 6000 will need up to [2] API calls
2018-07-12 15:17:29> Batch API limited to [3] calls at once.
I suppose the issue should repeat though if you put in a rowLimit of 26000? |
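The call counts in that log are consistent with simple ceiling division. The sketch below assumes a fixed maximum number of rows per API call; `calls_needed` is a hypothetical helper, not the package's actual code.

```r
# Number of API calls needed for a given rowLimit, assuming each call
# returns at most `rows_per_call` rows (an assumption, not the package source).
calls_needed <- function(row_limit, rows_per_call) {
  ceiling(row_limit / rows_per_call)
}

calls_needed(6000, 5000)    # 2, matching "will need up to [2] API calls"
calls_needed(5000, 5000)    # 1, a single call, no batching needed
calls_needed(25000, 25000)  # 1 under the raised per-call limit
calls_needed(26000, 25000)  # 2, which is why 26000 should trigger batching again
```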
Hmm, so the error arises here when parsing the batched responses' metadata, not the data itself:
responses_meta <- lapply(responses, function(x){
  index <- c(1:2)
  unlist(split_vector(x, index, remove_splits = FALSE))
})
The API call is sent through Google's batching service, which lets you send many calls at once for a faster response, e.g. it should now fetch 75000 rows per batched request. The batch response is a split of all the separate API calls; however, in this case no header information is being passed back, perhaps because those responses have no data at all. |
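If a parsed response really does come back with no lines at all, the condition in the reported error cannot be evaluated. The sketch below reproduces that in isolation; that `s` is empty in the failing case is an assumption, and `strip_trailing_empty` is a hypothetical guarded variant, not code from googleAuthR or searchConsoleR.

```r
s <- character(0)        # assumed shape of an empty response: no lines at all
last <- s[length(s)]     # s[0] is character(0), not ""
print(length(last == ""))  # the comparison yields logical(0)

# if() needs a single TRUE/FALSE, so a zero-length condition is an error
# (reported as "argument is of length zero" in an English locale)
err <- tryCatch(
  {
    if (s[length(s)] == "") s <- s[-length(s)]
    NULL
  },
  error = function(e) e
)
print(inherits(err, "error"))

# Guarded variant: check the length before looking at the last element
strip_trailing_empty <- function(s) {
  if (length(s) > 0 && identical(s[length(s)], "")) s <- s[-length(s)]
  s
}

print(strip_trailing_empty(character(0)))
print(strip_trailing_empty(c("header", "")))
```

The guard only avoids the crash; whether an empty batch part should be treated as "no rows" or as a failed request is a separate decision.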
If you can install the latest version of You can open that file with |
Hi @MarkEdmondson1234 - thanks for the fast response! Console Output
I attached the .rds-file to this comment. Yeah - it's possible because the output is empty somehow. Hope this helps for debugging. |
OK, well that's weird, the file works when I do it. Hmm, I hope it's not a Windows thing. |
@MarkEdmondson1234 What do you mean by saying One thing I noticed: the request only crashes when a lot of dimensions are set, which I guess leads to a lot of processing. With just the dimensions "date" and "query", or "date" and "page", it works fine, but it crashes when I want all 3 dimensions. |
@MarkEdmondson1234 Anything new on this problem? |
When I loaded the |
@MarkEdmondson1234 I'm running on Windows 10 with intel i5 and 8GB of RAM - should be enough I guess ;) |
@MarkEdmondson1234 I'm running on Windows 10, intel i5, 8GB RAM (for development), but I started the issue after discovering missing data that was downloaded on server (Debian, 32 GB RAM). So it probably isn't related to OS, not sure about the RAM. |
That should be plenty. Sorry I have no clue at the moment as it’s working on my test suite and locally. |
Hello @MarkEdmondson1234, I have been experiencing the same issue as described above. I am also running R on Windows (inside RStudio), and I believe I can confirm that this is a Windows specific problem. I have tried to lower the value of However, since you mentioned that this could be a Windows specific problem, I tried running the exact same scripts using the rocker/verse image in Docker, and there you go: I never got any error and am now able to export all the data I need! I hope this helps. Many thanks for your work. |
Thanks @flopont, that's very helpful. I will look to update using the latest googleAuthR tools, which may help solve this. |
Hi! This is my first issue so sorry for any mistakes or lacking info. I'll be glad to provide further info.
What goes wrong
First of all, I'm afraid this error might not be fully reproducible, and I'm sorry for that. I have a set of dates and want to use them to download Search Console data (in a loop). Real examples:
Everything seems fine for all dates when I download with rowLimit <= 5000 and walk_data = c("byBatch").
Increasing rowLimit above 5000 on "2017-03-17" works perfectly fine.
Unfortunately, increasing rowLimit on "2017-03-18" produces an error:
Error in if (s[length(s)] == "") s <- s[-length(s)]
It's strange because I manually checked the data in Search Console and the dates producing this error look normal: there is data for each one of them. I suppose this might be somehow connected to this particular website, but I cannot provide its address or tokens.
Code
Actual output
authentication
no problem ("2017-03-17" and rowLimit above 5000)
still no problem (changed date to "2017-03-18" and decreased rowLimit to 5000)
problem ("2017-03-18" and rowLimit > 5000)
Traceback
Session Info
In the beginning I used current versions of googleAuthR and searchConsoleR from CRAN. Changing to github version didn't solve the problem.