Clickhouse mode of study view #11224
base: master
@@ -181,7 +181,7 @@
 </if>
 </if>
 </where>
-Group by clinical_event.EVENT_TYPE, patient.STABLE_ID
+Group by clinical_event.EVENT_TYPE, patient.INTERNAL_ID
@haynescd this is one fix I did. I think we can keep it. The stable ID is not unique across studies.
agreed
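To illustrate why the grouping key matters, here is a minimal Java sketch. The `PatientEvent` class, the study names, and the ID values are invented for illustration and are not taken from the cBioPortal schema. The only assumption carried over from the thread is that a stable ID can repeat across studies while an internal ID cannot.

```java
import java.util.List;

class GroupingKeyDemo {
    // Hypothetical row shape: one clinical event for one patient.
    static final class PatientEvent {
        final String eventType;
        final String study;
        final String stableId;   // assumed unique only within a study
        final int internalId;    // assumed unique across all studies

        PatientEvent(String eventType, String study, String stableId, int internalId) {
            this.eventType = eventType;
            this.study = study;
            this.stableId = stableId;
            this.internalId = internalId;
        }
    }

    // Number of groups produced by grouping on (eventType, stableId).
    static long groupsByStableId(List<PatientEvent> events) {
        return events.stream()
            .map(e -> e.eventType + "|" + e.stableId)
            .distinct()
            .count();
    }

    // Number of groups produced by grouping on (eventType, internalId).
    static long groupsByInternalId(List<PatientEvent> events) {
        return events.stream()
            .map(e -> e.eventType + "|" + e.internalId)
            .distinct()
            .count();
    }

    // Two different patients from two studies that happen to share a stable ID.
    static List<PatientEvent> sample() {
        return List.of(
            new PatientEvent("TREATMENT", "study_a", "P-0001", 101),
            new PatientEvent("TREATMENT", "study_b", "P-0001", 202));
    }

    public static void main(String[] args) {
        System.out.println("groups by stable id:   " + groupsByStableId(sample()));
        System.out.println("groups by internal id: " + groupsByInternalId(sample()));
    }
}
```

With this made-up data, grouping by the stable ID collapses the two patients into one group, while grouping by the internal ID keeps them apart.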
@@ -223,6 +263,197 @@ public Pair<List<CopyNumberCountByGene>, Long> getPatientCnaGeneCounts(List<Mole
 );
 }

+@Override
@haynescd what's all this stuff?
This is the code for the new ClickHouse implementation.
@@ -0,0 +1,32 @@
+DROP TABLE IF EXISTS sample_list_columnstore;
@haynescd we can kill this file, right?
Agreed
@@ -121,6 +121,7 @@
 <include refid="selectGenePanelData"/>
 <include refid="fromGenePanelData"/>
 WHERE
+SAMPLE_ID IS NOT NULL AND
@haynescd this is one we might want to undo because it changes profile counts. It is what allows the system to recover from the incomplete sample_profile table issue.
@@ -32,7 +32,7 @@
 window.netlify = localStorage.netlify;

 if (window.localdev || window.localdist) {
-window.frontendConfig.frontendUrl = "//localhost:3000/"
+window.frontendConfig.frontendUrl = "https://localhost:3000/"
This might possibly break the localdb tests.
Yea, probably
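For context on why the scheme change could matter, here is a small Java sketch of how a scheme-relative URL such as `//localhost:3000/` resolves against the page that references it, compared with a URL that hard-codes `https`. The page URL `http://localhost:8080/index.html` is a made-up example, and `java.net.URI` is used only to demonstrate the general RFC 3986 resolution rule. It does not reproduce how the localdb tests are actually served.

```java
import java.net.URI;

class ProtocolRelativeDemo {
    // Resolve a frontend URL reference against the URL of the page that uses it.
    static String resolve(String pageUrl, String frontendUrl) {
        return URI.create(pageUrl).resolve(frontendUrl).toString();
    }

    public static void main(String[] args) {
        // Hypothetical page served over plain http.
        String page = "http://localhost:8080/index.html";

        // Scheme-relative reference: inherits the scheme of the page.
        System.out.println(resolve(page, "//localhost:3000/"));

        // Absolute reference: always uses https, whatever the page scheme is.
        System.out.println(resolve(page, "https://localhost:3000/"));
    }
}
```

Under this rule, a page loaded over http would previously have fetched the frontend over http and would now request it over https, which only works if the local frontend server speaks TLS.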
* Add Columnar SQL file to init Clickhouse DB
* Refactored Mapper xml to extract StudyViewFilterMapper

* ✅ Add Unit test for StudyViewMapper Clickhouse
* ✅ Update db props to include mysql and clickhouse datasources to fix tests
* Address comments
* Rename package to clickhouse
* Update to static final
* Use bean name instead of qualifier

* Create new wide table sql file and rename package
* Remove genomic_event view
* Add AlterationFilter to mutated_genes endpoint
* Add AlterationFilter to mutated-genes endpoint
* Fix unit test
* Fix sonar issues
* Add test for mutation types and status
* remove unused imports

* add missing poc clinical data binning function

* Add sample_mv materialized view and use it in mappers

* Add Support for TotalProfiledCase Counts for Mutated-genes endpoint.
* Create sql files to create new tables
* Add unit test for totalProfiledCount
* Add matching gene panel ids
* Add TotalProfiledCountsWithoutPanelData
* Add profileCount for genes without gene panel data
* Add Comments for SQL
* Update matching Gene Panel Ids
* Clean up code
* Fix test
* Add query to get correct Gene Panels
* Fix unit test
* Add comments

* working poc
* refactor logic into service, so clean
* refactor for parameters builder, simplify min max logic, streamline service call
* remove unused services and imports
* remove more unused imports

* Implement molecular profile count endpoint using Clickhouse
* Cleanup

* ✨ Add CNA Gene Endpoint
* 🐛 Fix StudyViewFilterMapper.xml to allow ability to filter on gene and alteration
* Fix merge conflict
* Address comments
* Fix unit tests
* Fix sonar issues

* ✨ Add StructuralVariant-genes endpoint
* Fix sonar issues
* Update MatchingGenePanel request to return list
* Create and use sample_derive
* Update where sample_derived is stored to fix unit test
Co-authored-by: Bryan Lai <[email protected]>
* use clinical_data_derived instead of sample_clinical_attribute_numeric_mv and patient_clinical_attribute_numeric_mv
* use clinical_attribute_meta instead of sample_clinical_attribute_numeric_mv and patient_clinical_attribute_numeric_mv
* remove unused clinical data count methods and SQL
* fix numericalClinicalDataCountFilter
* Move CategoricalClinicalAttributeFilter to repository
* remove unused columns
* Add override to methods
Co-authored-by: haynescd <[email protected]>

…0857)
* Add patient_id column to genomic_event_derived
* Update sql to convert list of patients to list of samples

* refactor to use clickhouse
* filter out empty attr values
* edit comment
* fix sonarcloud issues
* use parallel stream, shaves off 5s
* use newer mapping annotation
4d4b302 to 2b45019
@@ -14,7 +14,9 @@
 import java.util.stream.Collectors;

 @Component
-@ConditionalOnProperty(name = "persistence.cache_type", havingValue = "redis")
+@ConditionalOnExpression(
+    "#{environment['persistence.cache_type'] == 'redis' or environment['persistence.cache_type_clickhouse'] == 'redis'}"
I think we should remove this and either enable or disable caching for the whole system going forward.
agreed. this is so that we can assess performance without cheating with cache. we need caching on for legacy because otherwise, the product is totally unusable!
and at the same time, when we deploy for demonstration purposes, we want initial load of studyview to compete with legacy, i.e. cache ON.
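As a reading aid for the condition discussed in this thread, here is a plain-Java sketch of the boolean that the new `@ConditionalOnExpression` string appears to evaluate: the Redis-backed bean is enabled when either property equals `redis`. The `redisBeansEnabled` helper and the use of a `Map` in place of Spring's `Environment` are simplifications made up for this sketch, and the value `none` is only a placeholder for "some other setting".

```java
import java.util.Map;

class CacheConditionDemo {
    // Mirrors the expression in the diff: true when the legacy cache type
    // or the ClickHouse cache type is set to "redis". A missing property
    // is treated as not matching.
    static boolean redisBeansEnabled(Map<String, String> props) {
        return "redis".equals(props.get("persistence.cache_type"))
            || "redis".equals(props.get("persistence.cache_type_clickhouse"));
    }

    public static void main(String[] args) {
        // Legacy cache on, ClickHouse cache off ("none" is a placeholder value).
        System.out.println(redisBeansEnabled(Map.of(
            "persistence.cache_type", "redis",
            "persistence.cache_type_clickhouse", "none")));

        // Neither property set.
        System.out.println(redisBeansEnabled(Map.of()));
    }
}
```

Seen this way, the two properties let the legacy and ClickHouse paths be cached independently, which matches the stated goal of measuring ClickHouse performance with the cache off while keeping it on for legacy.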
@@ -19,7 +20,9 @@
 import java.util.stream.Collectors;

 @Service
-@ConditionalOnProperty(name = "persistence.cache_type", havingValue = {"redis"})
+@ConditionalOnExpression(
+    "#{environment['persistence.cache_type'] == 'redis' or environment['persistence.cache_type_clickhouse'] == 'redis'}"
Same as above... Should we still separate these?
* Hide select properties of ClinicalDataFilter from frontend
* Update swagger decorators on clickhouse controller

…#11155)
* Add patient level filtering for aggregation
* Patient level filtering works for non-NA
* Categorical patient level filtering & clean up
* Use new generic assay table schema

…y clickhouse_enabled is set (#11256)
* Update cBioPortal to dynamically load ch Components only when property clickhouse_enabled is set
* Update env var for circleCi

* Merge genomic data bins working
* Workaround for clickhouse bug in numerical data parsing
Co-authored-by: alisman <[email protected]>

* fix CNA query for genomic data filter
* rename one of the cna_query statements to cna_count_query to avoid table name clash
Quality Gate failed. See analysis details on SonarQube Cloud.
Fix # (see https://help.github.com/en/articles/closing-issues-using-keywords)
Describe changes proposed in this pull request:
Checks
Any screenshots or GIFs?
If this is a new visual feature please add a before/after screenshot or gif
here with e.g. Giphy CAPTURE or Peek
Notify reviewers
Read our Pull request merging policy. It can help to figure out who worked on the file before you. Please use git blame <filename> to determine that and notify them either through Slack or by assigning them as a reviewer on the PR.