Querying one day of data at a time only gives 5000 rows #48
Comments
Another important note: I believe this also affects batching with "byDate", since a similar 5000-row limit is reached per day even though the package states that 25000 rows are being fetched.
I can't reproduce this; it gets 25000 rows per batch for me when I use:
my_example <- "http://www.example.co.uk"
sa2 <- search_analytics(my_example, startDate = Sys.Date() - 10,
                        dimensions = c("date", "device", "country", "query", "page"),
                        walk_data = "byBatch", rowLimit = 50000)
# 50000 rows
nrow(sa2)
sa3 <- search_analytics(my_example, startDate = Sys.Date() - 5, endDate = Sys.Date() - 3,
                        dimensions = c("date", "device", "country", "query", "page"),
                        walk_data = "byDate")
# 75000 rows
nrow(sa3)
I get your outputs when I include all of the dimensions you do, but try running your query again with just the "date" and "query" dimensions.
Yes, I see now:
sa2 <- search_analytics(my_example, startDate = Sys.Date() - 5,
                        dimensions = c("date", "query"), walk_data = "byDate")
Fetching search analytics for url: https://www.world-first.co.uk/ dates: 2018-12-14 2018-12-16 dimensions: date query dimensionFilterExp: searchType: web aggregationType: auto
Batching data via method: byDate
Will fetch up to 25000 rows per day
2018-12-19 15:19:14> Request #: 2018-12-14
2018-12-19 15:19:17> Request #: 2018-12-15
2018-12-19 15:19:19> Request #: 2018-12-16
# 15000 rows
nrow(sa2)
Hmm, well, there is nothing in the code that does this, so I guess it's the API itself limiting the results when you query only those dimensions. If that's true, a Python call will return something similar; if verified, it should perhaps be lodged as a bug with the Search Console API team.
Yeah. I just ran a test with Python and got the same result. Weird. I don't recall this being an issue before.
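One way to confirm the per-day cap discussed above is to count the rows returned for each date and flag any day that lands exactly on the suspected limit. The following is a minimal sketch in base R, assuming the result is a data.frame with a date column (as it is when "date" is among the dimensions). The helper name rows_per_day and the default cap of 5000 are illustrative and not part of searchConsoleR, and the data frame is synthetic rather than real API output.

```r
# Hypothetical helper (not part of searchConsoleR): count rows per date
# and flag days whose row count equals a suspected cap.
rows_per_day <- function(df, cap = 5000) {
  counts <- table(as.character(df$date))
  out <- data.frame(
    date = names(counts),
    rows = as.integer(counts),
    stringsAsFactors = FALSE
  )
  out$at_cap <- out$rows == cap
  out
}

# Synthetic stand-in for a search_analytics() result with "date" and
# "query" dimensions; the row counts here are made up for illustration.
fake <- data.frame(
  date = rep(c("2018-12-14", "2018-12-15", "2018-12-16"),
             times = c(5000, 5000, 1200)),
  query = "placeholder",
  stringsAsFactors = FALSE
)

per_day <- rows_per_day(fake)
print(per_day)
```

If every day in a "byDate" run shows at_cap as TRUE, that points to a limit being hit per request rather than to the site simply having fewer queries on those days.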
What goes wrong
When running search_analytics for a single day, the result appears to cap out at 5,000 rows regardless of rowLimit.
I know an issue regarding 5000 rows was created a few years ago, but this might be a different problem since Google recently upped the max rowLimit to 25,000.
Steps to reproduce the problem
searchConsoleR version 0.3.0.9000
googleAuthR version 0.7.0.9000
uri <- "https://www.mydomain.com/"
start <- Sys.Date() - 4
end <- Sys.Date() - 4
dims <- c('query')
listwebs <- list_websites()
data <- search_analytics(siteURL = uri,
                         startDate = start,
                         endDate = end,
                         dimensions = dims,
                         rowLimit = 25000)
Expected output
data.frame with more than 5,000 obs.
Actual output
data.frame with exactly 5,000 obs.
I have tried with multiple domains, and it outputs 5,000 rows every time.
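The multi-domain check described above could be scripted roughly as follows. This is only a sketch: count_rows_by_site and stub_fetch are hypothetical names, and the fetch function is passed in as an argument so the logic can be tried without API access. The commented-out wrapper reuses the arguments from the reproduction steps.

```r
# Hypothetical helper: run the same one-day query for several sites and
# record how many rows each returns. `fetch` is any function that takes a
# site URL and returns a data.frame.
count_rows_by_site <- function(sites, fetch) {
  rows <- vapply(sites, function(s) nrow(fetch(s)), integer(1))
  data.frame(site = sites, rows = unname(rows), stringsAsFactors = FALSE)
}

# With searchConsoleR loaded and authenticated, `fetch` could wrap the call
# from the reproduction steps (not run here):
# fetch_one_day <- function(site) {
#   search_analytics(siteURL = site,
#                    startDate = Sys.Date() - 4,
#                    endDate = Sys.Date() - 4,
#                    dimensions = "query",
#                    rowLimit = 25000)
# }

# Stub standing in for the API so the sketch runs offline; it mimics the
# reported behaviour of always returning exactly 5000 rows.
stub_fetch <- function(site) {
  data.frame(query = rep("placeholder", 5000), stringsAsFactors = FALSE)
}

site_counts <- count_rows_by_site(
  c("https://www.example.com/", "https://www.example.org/"),
  stub_fetch
)
print(site_counts)
```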
Verbose output:
Fetching search analytics for url: https://www.mydomain.com/ dates: 2018-12-14 2018-12-14 dimensions: query dimensionFilterExp: searchType: web aggregationType: auto
2018-12-18 16:15:05> Token exists.
2018-12-18 16:15:05> Request: https://www.googleapis.com/webmasters/v3/sites/https%3A%2F%2Fwww.mydomain.com%2F/searchAnalytics/query
2018-12-18 16:15:05> Body JSON parsed to: {"startDate":"2018-12-14","endDate":"2018-12-14","dimensions":["query"],"searchType":"web","dimensionFilterGroups":[{"groupType":"and","filters":[]}],"aggregationType":"auto","rowLimit":25000}
Session Info
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS 10.14.2
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] searchConsoleR_0.3.0.9000
loaded via a namespace (and not attached):
[1] rstudioapi_0.8 magrittr_1.5 R6_2.3.0 httr_1.4.0
[5] tools_3.5.1 pkgbuild_1.0.2 cli_1.0.1 googleAuthR_0.7.0.9000
[9] withr_2.1.2 remotes_2.0.2 openssl_1.1 yaml_2.2.0
[13] assertthat_0.2.0 digest_0.6.18 rprojroot_1.3-2 crayon_1.3.4
[17] processx_3.2.1 callr_3.1.0 ps_1.2.1 curl_3.2
[21] memoise_1.1.0 compiler_3.5.1 backports_1.1.3 prettyunits_1.0.2
[25] jsonlite_1.6