Feature/ml pipeline #29
Merged
nmenezes0 force-pushed the feature/ml-pipeline branch 2 times, most recently from 5bdbbfb to 2c8f09f on March 20, 2024 17:38
nmenezes0 force-pushed the feature/ml-pipeline branch 2 times, most recently from f9626bb to b37e273 on March 25, 2024 13:24
nmenezes0 force-pushed the feature/ml-pipeline branch from 139e02b to 58f641b on March 29, 2024 20:12
nmenezes0 force-pushed the feature/ml-pipeline branch from 58f641b to 4f4b443 on April 2, 2024 09:13
gecBurton reviewed Apr 2, 2024
gecBurton reviewed Apr 2, 2024
nmenezes0 force-pushed the feature/ml-pipeline branch 2 times, most recently from 977ef03 to 84124e7 on April 4, 2024 11:14
rachaelcodes reviewed Apr 5, 2024
rachaelcodes reviewed Apr 5, 2024
rachaelcodes reviewed Apr 5, 2024
rachaelcodes approved these changes Apr 5, 2024
I've a couple of clarifying questions, but otherwise LGTM
nmenezes0 force-pushed the feature/ml-pipeline branch from fdd0ad5 to 1f15c6c on April 5, 2024 13:08
Context
Add the ML pipeline that generates the themes/topics and saves them to the DB. For each question in a consultation, we need to classify the free-text responses into topics (also called "themes") and save this information to the DB.
Out of scope for this PR:
Changes proposed in this pull request
ML pipeline overview for a given question (with free text that needs to be classified):
The pipeline follows the first part of https://github.com/i-dot-ai/ova-consultation/blob/main/run_full_analysis_pipeline.py (not the LLM summary part). The BERTopic docs are a good reference too: https://maartengr.github.io/BERTopic/.
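For reviewers unfamiliar with BERTopic, here is a minimal sketch of the per-question flow, assuming BERTopic's default configuration. `fit_topics` and `group_by_theme` are hypothetical names used for illustration, and the theme record shape is an assumption, not the DB schema or the code in this PR.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def fit_topics(responses: Sequence[str]) -> Tuple[List[int], Dict[int, List[str]]]:
    """Fit BERTopic on one question's free-text responses (hypothetical helper).

    Returns the topic id assigned to each response and the keywords per topic.
    """
    # Imported lazily so group_by_theme below works without the ML dependency.
    from bertopic import BERTopic

    topic_model = BERTopic()  # assumption: default embedding and clustering settings
    topic_ids, _probabilities = topic_model.fit_transform(list(responses))
    # get_topic(topic_id) yields (word, score) pairs; topic id -1 holds outliers.
    keywords = {
        topic_id: [word for word, _score in (topic_model.get_topic(topic_id) or [])]
        for topic_id in set(topic_ids)
    }
    return list(topic_ids), keywords


def group_by_theme(
    responses: Sequence[str],
    topic_ids: Sequence[int],
    keywords: Dict[int, List[str]],
) -> List[dict]:
    """Build one theme record per topic id (hypothetical record shape)."""
    grouped = defaultdict(list)
    for response, topic_id in zip(responses, topic_ids):
        grouped[topic_id].append(response)
    # Each record is what a caller would then persist as a theme for the question.
    return [
        {"topic_id": topic_id, "keywords": keywords.get(topic_id, []), "responses": texts}
        for topic_id, texts in sorted(grouped.items())
    ]
```

Keeping the grouping step separate from the model fit means the "themes are saved" logic can be tested with fixed topic ids, without running BERTopic itself.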
Guidance to review
Check that the pipeline runs and that results are saved to the DB as appropriate. This is covered by the tests, but check that you are happy that the tests run the entire pipeline and verify that themes are saved to the DB.
Check the ML pipeline gets topics as per Jonah's code: https://github.com/i-dot-ai/ova-consultation/blob/main/run_full_analysis_pipeline.py
Link to JIRA ticket
https://technologyprogramme.atlassian.net/browse/CON-47
Things to check
.env.example and .env.test files in the repo