Can I implement bulkLoad for BQ? #288
Comments
We could have a DatabaseConnectorDbiConnection that wraps bigrquery connections similar to how DatabaseConnector wraps other DBI drivers. I think this would make sense. |
Thanks @ablack3, https://github.com/FINNGEN/DatabaseConnector It worked very well, but the code is very dirty. I could work on doing this the right way if you think that PR would be accepted? |
That hack was very simple: I just used bigrquery to make the DBI connection, following how DatabaseConnector uses SQLite. The dirty part is where I had to pass some extra parameters for bigrquery. I just added these parameters to all the necessary functions; this should be done differently. |
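For readers following along, a minimal sketch of opening a DBI connection to BigQuery through bigrquery might look like the following. The helper name `connect_bigquery_dbi` and its arguments are hypothetical and are not taken from the fork; it assumes the `DBI` and `bigrquery` packages are installed and that authentication (for example via `bigrquery::bq_auth()`) has already been set up.

```r
# Hypothetical helper (not from the fork): open a DBI connection to BigQuery
# using the bigrquery driver. Assumes credentials are already configured.
connect_bigquery_dbi <- function(project, dataset, billing = project) {
  DBI::dbConnect(
    bigrquery::bigquery(),  # DBI driver object provided by bigrquery
    project = project,      # project that holds the dataset
    dataset = dataset,      # default dataset for unqualified table names
    billing = billing       # project billed for queries (assumed same as project)
  )
}
```

The `billing` argument is one example of an extra bigrquery-specific parameter that a wrapper would need to carry through.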
@ablack3: If bigrquery has superior performance to the JDBC driver, maybe we should swap them out in DatabaseConnector? (after we've completed DatabaseConnector 7.0.0, when this should be simpler) |
@javier-gracia-tabuenca-tuni I think we could follow the pattern used for odbc that uses Otherwise your changes look good to me.
Hi @javier-gracia-tabuenca-tuni, after talking with @schuemie we were thinking that after v7 you should be able to use DatabaseConnector with bigrquery by using Ideally DatabaseConnector only uses one driver for each database. Currently that is jdbc for BigQuery but it could be bigrquery in the future if all the tools work well with it. I would say we could give bigrquery a try and compare to the jdbc driver. If bigrquery seems to work better then we can switch to it in a future release. Feel free to make a branch with this change and we can test it out after v7. |
OK, sounds good. One more question related to BQ: there is this forced delay for BigQuery only. I couldn't find the reason for it. |
BigQuery has a rate limit, especially for inserts. If you exceed it, you will get an error. To my knowledge this limit still applies. |
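To illustrate the idea of pacing inserts to stay under a rate limit, here is a minimal sketch in plain R. It is not the DatabaseConnector implementation; `insert_in_batches`, `insert_fn`, `batch_size`, and `delay_seconds` are hypothetical names chosen for this example, and the default values are arbitrary.

```r
# Hypothetical helper: insert a data frame in batches, pausing between batches
# so that a per-request rate limit is less likely to be exceeded.
# `insert_fn` is a placeholder for whatever function writes one batch.
insert_in_batches <- function(data, insert_fn, batch_size = 1000, delay_seconds = 1) {
  stopifnot(is.data.frame(data), is.function(insert_fn), batch_size >= 1)
  n <- nrow(data)
  if (n == 0) return(invisible(0L))
  starts <- seq(1, n, by = batch_size)
  for (i in seq_along(starts)) {
    rows <- starts[i]:min(starts[i] + batch_size - 1, n)
    insert_fn(data[rows, , drop = FALSE])
    # Pause between batches, but not after the last one.
    if (i < length(starts)) Sys.sleep(delay_seconds)
  }
  # Return the number of batches written, invisibly.
  invisible(length(starts))
}
```

A fixed pause like this trades throughput for safety, which is one reason a bulk-load path that sends the data in a single job can be much faster.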
@javier-gracia-tabuenca-tuni |
In the end, what I did was connect to BQ using the Line 198 in 46eb774
Then I only had to make a few changes in the https://github.com/javier-gracia-tabuenca-tuni/DatabaseConnector/tree/BQ-DBI I haven't written unit tests as I don't have access to your testing environment, but we are using it in production in our app (https://github.com/finngen/CohortOperations2) with no problems. Please feel free to use it or reach out to me for more. |
Forgot something important, e.g.
Loading a table into BigQuery with DatabaseConnector::insertTable is quite slow.
I usually have to use the bigrquery package instead.
Quick and dirty demo:
If you want, I can make insertTable with bulkLoad=TRUE also work for BQ
using bigrquery. Or do you prefer another method, such as the command line? That may be more complex, but would not need an extra package.
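As a rough sketch of what a bigrquery-based upload could look like (this is not the original demo, which is not preserved here), the function below sends a data frame to a BigQuery table as a load job. The name `upload_to_bigquery` and its arguments are hypothetical; it assumes the `bigrquery` package is installed and authenticated, and that appending to an existing table is the desired behaviour.

```r
# Hypothetical helper: upload a data frame to a BigQuery table with bigrquery.
# Assumes authentication is already configured (e.g. bigrquery::bq_auth()).
upload_to_bigquery <- function(data, project, dataset, table_name) {
  stopifnot(is.data.frame(data))
  # Reference to the target table; this does not create the table by itself.
  target <- bigrquery::bq_table(project, dataset, table_name)
  bigrquery::bq_table_upload(
    target,
    values = data,
    create_disposition = "CREATE_IF_NEEDED",  # create the table if it is missing
    write_disposition = "WRITE_APPEND"        # append rows rather than overwrite
  )
  invisible(target)
}
```

A call such as `upload_to_bigquery(cars, "my-project", "my_dataset", "cars")` would use placeholder project and dataset names that need to be replaced with real ones.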