-
-
Notifications
You must be signed in to change notification settings - Fork 878
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added options immedate & replace to eng_SQL #2128
base: master
Are you sure you want to change the base?
Conversation
add initial attempt at SQL chunk option immediate
Improved handling of options$immediate and additional options$replace - these edits work in a knitr::knit_engines$set( ) style test. Immediate = T passes param immediate = T to dbExecute() / dbGetQuery(). Immedate is not implemented for query's with options$max.print set. I can't be sure if options$immediate also works icw options$params - I can't get params to run at all on my server. Passing values from local environment using ?value works however, also icw the immediate option. Replace = T adds code to drop an existing temporary table before running the into # statement. I thought this might be handy but leave it to others to decide on whether this is truly needed (I also don't really know if this implementation works on other servers than the one I use). I've also included an error message to catch attempted replacement of non-temporary tables but error handling doesn't work yet as expected - when attempting to replace a table with a name not starting with #, the generic server error message appears: ([SQL Server]CREATE TABLE permission denied in database 'DWH'. ) - perhaps someone else will now how to fix, though perhaps we'll not want to implement the replace option at all. Example code: ```{r qvalue} StadiumCode <- '0A' ``` ```{sql, immediate = T, replace = T} SELECT * into #tum FROM statistiek.vw_dimStadium where StadiumCode = ?StadiumCode ``` ```{sql fetch} SELECT * FROM #tum ```
Fixed error handling on temporary table check. What a difference a space makes...
reptable = gsub('(^.into[[:space:]]+)(#.+)([[:space:]]+from.$)', '\\2', sql, ignore.case = T) |
Anne-Wil seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
Just checking (I'm really inexperienced >.<): I'm seeing this super inviting 'Update branch' button (with alternate update with rebase) - do I push that or do I just wait now? :) |
You can push that to update but that is not necessary. We'll do it ourself, or just merge the branch. Your change target in code part that has not been modified in some time, and tests are passing ok. So you can leave that way, and just wait for us to review. I added that in my TODO list. Thanks ! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really know much about SQL, so I only reviewed the coding style. BTW, the changes with pure white spaces in this PR need to be discarded (if you don't know how, we can handle this before we merge).
Thanks!
R/engine.R
Outdated
immediate= F | ||
if(isTRUE(options$immediate)) { | ||
immediate= T | ||
replace= F | ||
if(isTRUE(options$replace)) { | ||
replace= T |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
immediate= F | |
if(isTRUE(options$immediate)) { | |
immediate= T | |
replace= F | |
if(isTRUE(options$replace)) { | |
replace= T | |
immediate = isTRUE(options$immediate) | |
replace = isTRUE(options$replace) | |
if (immediate && replace) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks.
I made a few comments to share a few thoughts.
Currently, with this change :
- We are passing
immediate
no matter what - We are setting the default value to
FALSE
whereas it seems to be NULL in DBI which has a different meaning.
I don't really know which backend that would impact or not, but I think it would be safer to prevent this somehow using more conditional logic. I can help tweak the PR if you do not know how to do it.
Thanks
R/engine.R
Outdated
if(isTRUE(options$immediate)) { | ||
immediate= T | ||
replace= F | ||
if(isTRUE(options$replace)) { | ||
replace= T |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would use sql.
prefix on those new options, to go with existing sql.print
and sql.show_interpolated
.
@yihui I don't know if we can end up with a rule on under which condition which should use engine.opts
, bare chunk option name or prefixed option name. I think we should try being consistent. sql.show_interpolated
is not really documented so we could also support show_interpolated
to be consistent.
Probably than the chunk engine is enough to scope the variable (advanced usage of chunk options hooks would require checking the engine, and then checking the option in case of same option name used by two different engine with different meaning - may never happen though).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, let's use the sql.
prefix for the new options here.
I don't know if we can end up with a rule on under which condition which should use
engine.opts
, bare chunk option name or prefixed option name. I think we should try being consistent.
I agree. It has been a mess, which I don't like, but I feel it's tough to decide which is better:
```{sql, engine.opts = list(immediate = TRUE, replace = TRUE)}
```{sql, sql.immediate = TRUE, sql.replace = TRUE}
chunk engine is enough to scope the variable (advanced usage of chunk options hooks would require checking the engine, and then checking the option in case of same option name used by two different engine with different meaning - may never happen though).
That's also what I think.
data = tryCatch({ | ||
if (is_sql_update_query(query)) { | ||
DBI::dbExecute(conn, query) | ||
DBI::dbExecute(conn, query, immediate = immediate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to take precaution here.
DBI::dbExecute()
is a generic and immediate
argument, is indeed among the specifications. However, immediate
is not part of the generic argument to improve compatibility with the different backends.
https://dbi.r-dbi.org/reference/dbexecute#additional-arguments-1
The following arguments are not part of the dbExecute() generic (to improve compatibility across backends) but are part of the DBI specification:
- params (default: NULL)
- immediate (default: NULL)
They must be provided as named arguments. See the "Specification" sections for details on their usage
This is valid for other calls too.
data = tryCatch({ | ||
if (is_sql_update_query(query)) { | ||
DBI::dbExecute(conn, query) | ||
DBI::dbExecute(conn, query, immediate = immediate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we need to take precaution here.
DBI::dbExecute()
is a generic and immediate
argument, is indeed among the specifications. However, immediate
is not part of the generic argument to improve compatibility with the different backends.
https://dbi.r-dbi.org/reference/dbexecute#additional-arguments-1
The following arguments are not part of the dbExecute() generic (to improve compatibility across backends) but are part of the DBI specification:
- params (default: NULL)
- immediate (default: NULL)
They must be provided as named arguments. See the "Specification" sections for details on their usage
This is valid for other calls too.
R/engine.R
Outdated
@@ -608,13 +609,26 @@ eng_sql = function(options) { | |||
max.print = -1 | |||
sql = one_string(options$code) | |||
params = options$params | |||
|
|||
immediate= F |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DBI seems to default to NULL for dbExecute
and dbGetQuery
, which means
https://dbi.r-dbi.org/reference/dbexecute#specification-for-the-immediate-argument-1
The default NULL means that the backend should choose whatever API makes the most sense for the database
So setting FALSE, will change behavior for all the DB that backends that will use TRUE by default.
Maybe we should stick to that, and pass immediate = TRUE
or FALSE
when the chunk option is explicitly set ?
R/engine.R
Outdated
@@ -608,13 +609,26 @@ eng_sql = function(options) { | |||
max.print = -1 | |||
sql = one_string(options$code) | |||
params = options$params | |||
|
|||
immediate= F |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DBI seems to default to NULL for dbExecute
and dbGetQuery
, which means
https://dbi.r-dbi.org/reference/dbexecute#specification-for-the-immediate-argument-1
The default NULL means that the backend should choose whatever API makes the most sense for the database
So setting FALSE, will change behavior for all the DB that backends that will use TRUE by default.
Maybe we should stick to that, and pass immediate = TRUE
or FALSE
when the chunk option is explicitly set ?
- added prefixes sql. to immediate and replace options - set NULL as default for sql.immediate - handling of user input FALSE as well as TRUE for sql.immediate - added stop when attempting to replace (sql.replace == T) when not executing immediately (sql.immedate == F) - removed all-whitespaced lines as well as I could To be discussed: implement non-passing of parameter immediate when options$sql.immediate is not set.
Sorry for the radio silence - things where happening. Thanks for all the input! I've implemented as much as I could - see commit 358d552. A big one that is open for debate still is whether/how to make it so that paramater immediate is NOT passed to dbGetQuery when the user did not set it. I suppose that's what we want, given that it's cleanest if we also don't pass the NULL value if sql.immediate is not actively invoked. For option params the code has a split based on length(params) == 0 and separately defined calls to DBI::dbGetQuery() with and without passing of params. We could expand that into four calls (with/without immediate and/or params), but perhaps something along the lines of a do.call with a list of arguments could be preferable? I did try every trick I could think of along the lines of DBI::dbGetQuery(conn, sql, if(exists('immedate'){immedate = immediate}) but no joy :/ |
Fix regex for lifting temporary table name to replace when sql.replace =T + a correction to the getQuery call with params where `sql` was used instead of `query`.
Alas, I notice that in practice the replace option doesn't always work flawlessly due to the regex to lift the tempary table name from the query also picking up new lines ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure I understand why we need the exist()
for retrieving the option. I commented about that.
We could expand that into four calls (with/without immediate and/or params), but perhaps something along the lines of a do.call with a list of arguments could be preferable? I did try every trick I could think of along the lines of DBI::dbGetQuery(conn, sql, if(exists('immedate'){immedate = immediate}) but no joy :/
After looking closer to DBI, I think the most important thing is not changing the default. So that
dbExecute(con, immediate = options$sql.immediate)
correctly pass NULL value. I don't think that methods that don't have immediate
will break because of the ...
- it will just be unused argument in this. So we may be fine passing it anyway.
Otherwise, the do.call
alternative seems like a good one, with a list of argument containing immediate
or not depending on the option. Did you try that already ?
Also, did you come up with test document for this PR ? Just curious to avoid making a new one.
Thanks
R/engine.R
Outdated
if(isTRUE(options$immediate)) { | ||
immediate= T | ||
replace= F | ||
if(isTRUE(options$replace)) { | ||
replace= T |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would use sql.
prefix on those new options, to go with existing sql.print
and sql.show_interpolated
.
@yihui I don't know if we can end up with a rule on under which condition which should use engine.opts
, bare chunk option name or prefixed option name. I think we should try being consistent. sql.show_interpolated
is not really documented so we could also support show_interpolated
to be consistent.
Probably than the chunk engine is enough to scope the variable (advanced usage of chunk options hooks would require checking the engine, and then checking the option in case of same option name used by two different engine with different meaning - may never happen though).
immediate = NULL | ||
if(exists("sql.immediate", where = options) ) { immediate = options$sql.immediate } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need exist
here ?
immediate = NULL | |
if(exists("sql.immediate", where = options) ) { immediate = options$sql.immediate } | |
immediate = options$sql.immediate |
would be enough, wouldn't it ? If sql.immediate
is not set then it will be NULL
options = list()
options$sql.immediate
#> NULL
Did I missed something ?
@@ -608,13 +608,26 @@ eng_sql = function(options) { | |||
max.print = -1 | |||
sql = one_string(options$code) | |||
params = options$params | |||
immediate = NULL | |||
if(exists("sql.immediate", where = options) ) { immediate = options$sql.immediate } | |||
if(exists('sql.replace', where = options) ) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here. This is not a usual pattern. I am not sure I see why we can't retrieve the options and then check ?
immediate = options$sql.immediate
replace = options$sql.replace
if (isTRUE(replace) && isTRUE(immediate)) {
do_somethings
}
...
What did I miss ?
immediate = NULL | ||
if(exists("sql.immediate", where = options) ) { immediate = options$sql.immediate } | ||
if(exists('sql.replace', where = options) ) { | ||
if(!isTRUE(immediate)) knitr:::stop2("To replace a temprary table, option sql.immediate has to be set to TRUE).") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if(!isTRUE(immediate)) knitr:::stop2("To replace a temprary table, option sql.immediate has to be set to TRUE).") | |
if(!isTRUE(immediate)) knitr:::stop2("To replace a temporary table, option sql.immediate has to be set to TRUE).") |
As discussed here: fixes rstudio/rmarkdown#2241
I haven't added test files as these changes require access to an sql server with permission to create temporay tables. Haven't found that in an open database.
I'm excited about this! :)