OHDSI WebAPI JDBC Driver Connectivity Issues With DATABRICKS #2364

Closed
cyongg opened this issue Apr 23, 2024 · 9 comments

Comments

@cyongg commented Apr 23, 2024

Goal: Connect OHDSI Atlas to an OMOP CDM hosted in Databricks, with WebAPI running on Tomcat.

Caused by: org.flywaydb.core.internal.dbsupport.FlywaySqlException:
Unable to obtain Jdbc connection from DataSource

SQL State : 08S01
Error Code : 500593
Message : [Databricks]JDBCDriver Communication link failure. Failed to connect to server. Reason: HTTP Response code: 401, Error message: Unknown.

Steps to reproduce the problem:

Step 1: This is the settings.xml:

[two screenshots of settings.xml showing the datasource configuration, not reproduced]

Step 2: We have included the Databricks dependency in the pom.xml:

[screenshot of the pom.xml Databricks dependency, not reproduced]

Step 3: We have copied DatabricksJDBC42.jar into the classpath.
(C:\Program Files\Apache Software Foundation\Tomcat 8.5\webapps\WebAPI\WEB-INF\lib)

JDBC URL (from the Databricks cluster):

jdbc:databricks://{workspace id}:{workspace port}/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/5269192184046240/0404-024850-qo6kj1mb;AuthMech=3;UseNativeQuery=1;ConnCatalog={default catalog};UID={UID};PWD={token}
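
(For context: an HTTP 401 response means the Databricks endpoint rejected the credentials supplied in the URL, typically an invalid or expired personal access token in PWD.)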

@chrisknoll (Collaborator)

Databricks is not a supported DB for hosting the WebAPI database. Only postgres is supported.

@cyongg (Author) commented Apr 23, 2024

> Databricks is not a supported DB for hosting the WebAPI database. Only postgres is supported.

I'm not hosting the WebAPI database on Databricks; I'm hosting my OMOP CDM on Databricks. Our WebAPI database is on Postgres, so what should I do to make WebAPI work?

@anthonysena (Collaborator) commented Apr 23, 2024

To enable Databricks, you can compile the .war with the Databricks driver included, per #2262 (comment)

@chrisknoll (Collaborator)

You modified the postgres-sql profile to change the datasource URL to Databricks, which tells WebAPI where to find the WebAPI database. You should leave this alone and follow the setup instructions to specify the URL of your own Postgres instance in your environment.

To point to a Databricks CDM, follow the instructions for setting up a source, using the dialect of 'databricks' and the JDBC URL of your Databricks server. As an admin user, you can use the Configuration nav to add and remove sources.
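
(For reference: the datasource URL should remain a Postgres URL along the lines of jdbc:postgresql://localhost:5432/OHDSI; only rows in the source table should reference Databricks.)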

@chanjunkai11

> You modified the postgres-sql profile to change the datasource URL to Databricks, which tells WebAPI where to find the WebAPI database. You should leave this alone and follow the setup instructions to specify the URL of your own Postgres instance in your environment.
>
> To point to a Databricks CDM, follow the instructions for setting up a source, using the dialect of 'databricks' and the JDBC URL of your Databricks server. As an admin user, you can use the Configuration nav to add and remove sources.

I thought datasource.url was supposed to be the JDBC URL of Databricks; as the datasource, isn't it supposed to refer to where the CDM is located?

@chrisknoll (Collaborator)

No. CDMs are referenced in the source table (please refer to the documentation indicated above). One WebAPI instance can reference multiple CDMs, which you can add and remove over time, so it doesn't make sense to store them in a configuration file. It's application data.
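
(A minimal sketch of inspecting that application data, using the webapi schema names that appear later in this thread:)

-- List the sources WebAPI currently knows about.
SELECT source_id, source_name, source_key, source_dialect
FROM webapi.source;

-- List the daimons (CDM/vocabulary/results/temp schema roles) attached to each source.
SELECT source_daimon_id, source_id, daimon_type, table_qualifier, priority
FROM webapi.source_daimon;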

@chanjunkai11

Does that mean I'm only required to use the SQL below for the source and source_daimon tables?

INSERT INTO webapi.source (source_id, source_name, source_key, source_connection, source_dialect)
SELECT nextval('webapi.source_sequence'), 'databricks', 'databricks', 'jdbc:spark://<workspace>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/5269192184046240/0404-024850-qo6kj1mb;UseNativeQuery=1;AuthMech=3;UID=token;PWD=<token>', 'spark';

-- daimon_type 0 = CDM schema
INSERT INTO webapi.source_daimon (source_daimon_id, source_id, daimon_type, table_qualifier, priority)
SELECT nextval('webapi.source_daimon_sequence'), source_id, 0, 'silver', 0
FROM webapi.source
WHERE source_key = 'databricks';

-- daimon_type 1 = Vocabulary schema
INSERT INTO webapi.source_daimon (source_daimon_id, source_id, daimon_type, table_qualifier, priority)
SELECT nextval('webapi.source_daimon_sequence'), source_id, 1, 'silver', 1
FROM webapi.source
WHERE source_key = 'databricks';

-- daimon_type 2 = Results schema
INSERT INTO webapi.source_daimon (source_daimon_id, source_id, daimon_type, table_qualifier, priority)
SELECT nextval('webapi.source_daimon_sequence'), source_id, 2, 'resultss', 1
FROM webapi.source
WHERE source_key = 'databricks';

-- daimon_type 5 = Temp schema
INSERT INTO webapi.source_daimon (source_daimon_id, source_id, daimon_type, table_qualifier, priority)
SELECT nextval('webapi.source_daimon_sequence'), source_id, 5, 'temp', 0
FROM webapi.source
WHERE source_key = 'databricks';
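
(Side note, hedged: after inserting or changing rows in source/source_daimon, WebAPI's source cache typically needs to be refreshed, e.g. by restarting Tomcat or calling the /WebAPI/source/refresh endpoint, assuming the WebAPI version in use exposes it.)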

@chanjunkai11

> Does that mean I'm only required to use the SQL below for the source and source_daimon tables? […SQL quoted from the previous comment…]

[screenshot: Atlas Configuration page showing the new Databricks source without a vocabulary version]

Hmm, but why does the configuration here look like this? It can't seem to identify my vocab version.

In Databricks, my CDM catalog name is omop540 and the schema name is silver, so how do I connect to it?
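
(A hedged sketch for the catalog question: the Databricks driver's ConnCatalog property, which already appears in the JDBC URL earlier in this thread, sets the default catalog for the connection. One option, assuming the driver honors it on this cluster, is to append it to the stored connection string:)

-- Hypothetical adjustment: resolve unqualified schema references against the omop540 catalog.
UPDATE webapi.source
SET source_connection = source_connection || ';ConnCatalog=omop540'
WHERE source_key = 'databricks';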

@anthonysena (Collaborator) commented Apr 23, 2024
@chanjunkai11 - your setup appears correct. Is the vocabulary table in your CDM populated? That is where the vocabulary version displayed in Atlas is obtained.

Since this issue is a bit old, I'm hoping you were able to figure this out, but please re-open if you are still facing difficulties.
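
(A minimal sketch of the check suggested above, assuming the omop540 catalog and silver schema named earlier; WebAPI takes the displayed version from the vocabulary row whose vocabulary_id is 'None':)

-- If this returns no row, Atlas has no vocabulary version to show.
SELECT vocabulary_version
FROM omop540.silver.vocabulary
WHERE vocabulary_id = 'None';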
