Add support for table "device_exposure" #172

Open. Wants to merge 6 commits into base: master.

AndrewThien commented Sep 26, 2024

Dear maintainers of the Carrot CDM repo,

I am Thien, a software engineer from the University of Nottingham who has been working on Carrot Mapper since April 2024.

For context, we received a feature request to extend Carrot Mapper so that it can add concepts in the "Device" domain. We have fulfilled that request in Carrot Mapper by this PR, but we thought it would not be enough unless Carrot CDM also supports conversion for the OMOP table "device_exposure" and the mapping JSON related to it.

As for this PR itself, it adds support for the "device_exposure" table, following the existing workflow for defining and applying an object/OMOP table. Please ignore the automatic formatting edits.

Please have a look and I am happy to answer any questions.

Thanks!

PhilAppleby (Collaborator) commented Oct 7, 2024

Hello Thien,

Sorry for the late reply, I was out of the country until the end of last week.

In the transformation work we do at the University of Dundee, we no longer execute the "carrot run map" code path, which uses the in-memory CDM, as it cannot handle data larger than a few thousand items. We have data sets running to millions of records now.
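To illustrate the scaling point above, here is a minimal sketch of a streaming transform that handles one row at a time instead of holding the whole data set in memory. It is not the Carrot code: `transform_stream`, `to_device_row`, and the input column names are hypothetical, and only the Python standard library is used.

```python
import csv
import io


def transform_stream(source, sink, transform_row):
    """Read CSV rows from `source` one at a time and write transformed rows to `sink`.

    Only the current row is held in memory, so memory use does not grow
    with the number of input records.
    """
    reader = csv.DictReader(source)
    writer = None
    written = 0
    for row in reader:
        out = transform_row(row)
        if out is None:
            continue  # the row produced no output record
        if writer is None:
            # Create the writer lazily so the header matches the first output row.
            writer = csv.DictWriter(
                sink, fieldnames=list(out.keys()), delimiter="\t", lineterminator="\n"
            )
            writer.writeheader()
        writer.writerow(out)
        written += 1
    return written


def to_device_row(row):
    """Hypothetical rule: emit a record only when a device value is present."""
    if not row["device"]:
        return None
    return {"person_id": row["id"], "device_source_value": row["device"]}


# Small in-memory example; real input would be a file opened for reading.
source = io.StringIO("id,device\n1,pacemaker\n2,\n3,stent\n")
sink = io.StringIO()
rows_written = transform_stream(source, sink, to_device_row)
```

The same function works unchanged on file handles, which is what makes the approach suitable for inputs with millions of records.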

The newer code (currently on the carrotlite branch, pending separation as a new package) loads OMOP DDL dynamically and also requires certain fields to be identified in a .json file. It can also be executed from the master branch using "carrot run mapstream".

The handling of OMOP tables is now a configuration item, so handling device_exposure should not require code changes. If you have a test rules file and a corresponding scan report, I will be able to check this.
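As a rough illustration of why a new table can be a configuration matter rather than a code change, the sketch below reads table and column names from DDL text at run time. It is an assumption about the general technique, not the carrotlite implementation: `load_table_columns`, the abbreviated DDL, and the `FIELD_CONFIG` layout are all hypothetical.

```python
import json
import re

# Abbreviated, illustrative DDL. A real OMOP DDL file defines many more
# tables and columns than are shown here.
SAMPLE_DDL = """
CREATE TABLE person (
    person_id integer NOT NULL,
    gender_concept_id integer NOT NULL,
    year_of_birth integer NOT NULL
);

CREATE TABLE device_exposure (
    device_exposure_id integer NOT NULL,
    person_id integer NOT NULL,
    device_concept_id integer NOT NULL,
    device_exposure_start_date date NOT NULL,
    device_type_concept_id integer NOT NULL,
    device_source_value varchar(50) NULL
);
"""

# Hypothetical per-table settings of the kind a .json file could carry.
FIELD_CONFIG = json.loads(
    '{"device_exposure": {"person_id_field": "person_id",'
    ' "date_field": "device_exposure_start_date"}}'
)

CREATE_TABLE = re.compile(r"CREATE TABLE\s+(\w+)\s*\((.*?)\);", re.S | re.I)


def load_table_columns(ddl_text):
    """Return {table_name: [column_name, ...]} parsed from CREATE TABLE statements."""
    tables = {}
    for name, body in CREATE_TABLE.findall(ddl_text):
        columns = []
        for line in body.splitlines():
            line = line.strip().rstrip(",")
            if line:
                columns.append(line.split()[0])  # first token is the column name
        tables[name.lower()] = columns
    return tables


tables = load_table_columns(SAMPLE_DDL)
# Check that every configured field actually exists in the parsed table.
missing = [
    field
    for table, settings in FIELD_CONFIG.items()
    for field in settings.values()
    if field not in tables.get(table, [])
]
```

With this shape, supporting another table means adding its DDL and a config entry; the parsing code itself stays the same.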

Thanks,
Phil

AndyRae (Member) commented Oct 7, 2024

Hi @PhilAppleby - this is exciting. I'll drop you an email, as we'd be keen to support the carrot-lite work; I think it fits perfectly with our SDE-adjacent use cases over the coming 6 months.

PhilAppleby (Collaborator) commented

Hi @AndyRae,

A little more information: this first came about because I needed to build a Windows exe and was frustrated at being unable to get PyInstaller to produce a running exe.

But this means that we are now able to produce both a Windows exe and a lighter Python-executed version for data transformation, with a greatly reduced list of library dependencies.

You should be aware that this only does the Transform part of ETL: it expects extracted data and produces output for bulk database load. This is all we have used in our transformation of data sets for Alleviate, including some which are very large (millions of records).

I can supply more, including a discussion of why the original is so slow for large datasets. I am aware that we will need to update the Carrot docs for this.

Phil

3 participants