Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue running getDbCovariateData on snowflake #236

Closed
mdlavallee92 opened this issue Feb 28, 2024 · 6 comments
Closed

Issue running getDbCovariateData on snowflake #236

mdlavallee92 opened this issue Feb 28, 2024 · 6 comments
Assignees
Milestone

Comments

@mdlavallee92
Copy link
Contributor

Trying to run FeatureExtraction on snowflake which requires a temp emulation schema. Currently the function only allows for oracleTempSchema, which should be updated to tempEmulationSchema so it is not db exclusive. This is what I used below

#run FE
cov <- FeatureExtraction::getDbCovariateData( 
  connection = con,
  oracleTempSchema = tempEmulationSchema,
  cdmDatabaseSchema = cdmDatabaseSchema,
  cohortTable = cohortTable,
  cohortDatabaseSchema = cohortDatabaseSchema,
  cohortId = targetCohortIds,
  covariateSettings = covSettings,
  aggregated = TRUE
)

for which I get this error:

image

Not sure if others have got this issue or if I am setting this up wrong.

@anthonysena
Copy link
Collaborator

@mdlavallee92 - I don't have access to Snowflake so this will be hard to troubleshoot.

That said, we can add a proper tempEmulationSchema parameter and deprecate the oracleTempSchema so that this is handled consistently in the package.

Note of how that might look from DatabaseConnector docs: https://ohdsi.github.io/DatabaseConnector/reference/renderTranslateExecuteSql.html

@anthonysena
Copy link
Collaborator

Linking to #159 since we may want to do this as a major, breaking change.

@ginberg
Copy link
Collaborator

ginberg commented Sep 27, 2024

@mdlavallee92 since version 3.7.0 you can pass the tempEmulationSchema in the getDbCovariateData function. Can you try if this works for you on Snowflake?

@katy-sadowski
Copy link

There is still an issue in 3.7.0. While tempEmulationSchema is passed in here, I'm not sure it's making its way all the way down into the java query builders.

> covariateData2 <- getDbCovariateData(
+   connectionDetails = connectionDetails,
+   cdmDatabaseSchema = executionSettings$cdmDatabaseSchema,
+   cohortDatabaseSchema = executionSettings$workDatabaseSchema,
+   cohortTable = executionSettings$cohortTable,
+   cohortIds = c(99994, 99995),
+   covariateSettings = covariateSettings,
+   aggregated = TRUE,
+   tempEmulationSchema = executionSettings$workDatabaseSchema
+ )
Connecting using Snowflake driver
Constructing features on server
  |================================================================================================================================================================| 100%
Executing SQL took 1.15 mins
Fetching data from server
Error in `.createErrorReport()`:
! Error executing SQL:
net.snowflake.client.jdbc.SnowflakeSQLException: SQL compilation error:
Object 'OXENIM4HCOV_1' does not exist or not authorized.
An error report has been created at  /home/sadowskk/o428/errorReportSql.txt
Run `rlang::last_trace()` to see where the error occurred.

And here is the error report - it got quite far along, so I think that the table was successfully created in the tempEmulationSchema, but then a later query is looking for it in the public schema.

DBMS:
snowflake

Error:
net.snowflake.client.jdbc.SnowflakeSQLException: SQL compilation error:
Object 'OXENIM4HCOV_1' does not exist or not authorized.

SQL:
SELECT cohort_definition_id, covariate_id, count_value, min_value, max_value, average_value, standard_deviation, median_value, p10_value, p25_value, p75_value, p90_value
FROM (
SELECT cohort_definition_id, covariate_id, count_value, min_value, max_value, average_value, standard_deviation, median_value, p10_value, p25_value, p75_value, p90_value FROM oxenim4hcov_1 UNION ALL
SELECT cohort_definition_id, covariate_id, count_value, min_value, max_value, average_value, standard_deviation, median_value, p10_value, p25_value, p75_value, p90_value FROM oxenim4hcov_2
) all_covariates;

@mdlavallee92
Copy link
Contributor Author

Hey @ginberg and @katy-sadowski I believe the bug is on a line in the getDbDefaultCovariateData function when trying to extract continuous features. See here.

When running FE v3.7.1 on snowflake it works with the tempEmulationSchema setting if my covSettings are binary. Believe Katy has a continuous setting for covariates leading to the error.

@ginberg
Copy link
Collaborator

ginberg commented Oct 11, 2024

hi @mdlavallee92 you are right we should use tempEmulationSchema here, thanks for the catch!
@katy-sadowski I have created a branch with the updated code, could you try it out?

@ginberg ginberg self-assigned this Oct 15, 2024
@ginberg ginberg added this to the v3.7.2 milestone Oct 15, 2024
@ginberg ginberg closed this as completed Oct 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants