miniproject: viral epidemics and disease
Lakshmi Devi Priya edited this page Jul 5, 2020 · 34 revisions
Priya
Dheeraj
- Use the articles in the communal corpus `epidemic50noCov`.
- Scrutinize the 50 articles to identify the true positives and false positives, that is, whether each article is about a viral epidemic or not.
- Use `ami search` to find whether the articles mention any comorbidity in a viral epidemic.
- Section the articles using `ami:section` to extract the relevant information on comorbidity.
- Annotate with dictionaries to create ami DataTables.
- Refine and rerun the query to get a corpus of 950 articles.
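The manual scrutiny step amounts to estimating the precision of the corpus query: the share of retrieved articles that really are about a viral epidemic. The following is a minimal sketch, assuming the manual judgements are kept as a mapping from an article identifier to a boolean. The function name, the identifiers, and the labels are hypothetical placeholders, not results from `epidemic50noCov`.

```python
def precision(labels):
    """Return the fraction of retrieved articles judged relevant (true positives)."""
    if not labels:
        raise ValueError("no labelled articles")
    true_positives = sum(1 for relevant in labels.values() if relevant)
    return true_positives / len(labels)


# Hypothetical manual labels: True = about a viral epidemic, False = false positive.
manual_labels = {
    "article_01": True,
    "article_02": False,
    "article_03": True,
    "article_04": True,
}

print(f"precision = {precision(manual_labels):.2f}")
```

The same labels could later serve as training or evaluation data for the classification step.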
- Use a relevant ML technique to classify whether the articles are about a viral epidemic and to identify the diseases/disorders that co-occur.
- A spreadsheet and a graph will be developed showing the comorbidities reported during a viral epidemic and their counts.
- Develop the ML model for data classification and assess its accuracy.
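The spreadsheet of comorbidity counts could be produced roughly as follows. This is only a sketch under stated assumptions: the per-article term lists are assumed to have already been extracted (for example from the dictionary annotations), and every identifier and term below is hypothetical illustration data, not a project result.

```python
import csv
import io
from collections import Counter


def count_comorbidities(article_terms):
    """Count in how many articles each comorbidity term appears."""
    counts = Counter()
    for terms in article_terms.values():
        counts.update(set(terms))  # count each term at most once per article
    return counts


def to_csv(counts):
    """Render the counts as CSV text, most frequent term first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["comorbidity", "article_count"])
    for term, n in sorted(counts.items(), key=lambda item: (-item[1], item[0])):
        writer.writerow([term, n])
    return buffer.getvalue()


# Hypothetical annotations, not real results.
article_terms = {
    "article_01": ["diabetes", "hypertension"],
    "article_02": ["diabetes", "diabetes"],
    "article_03": ["asthma"],
}

counts = count_comorbidities(article_terms)
print(to_csv(counts))
```

Counting each term once per article keeps a single article that repeats a term many times from dominating the table; the resulting CSV can be opened in any spreadsheet tool and plotted as a bar chart.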
- Initially the communal corpus `epidemic50noCov` will be used.
- Later a corpus of 950 articles will be created.
- `getpapers` to create the corpus of 950 articles.
- `AMI` for creating and using dictionaries, and for sectioning.
- `SPARQL` for creating dictionaries.
- `KNIME` for workflow and analytics.
- The 50 articles in the communal corpus `epidemic50noCov` were manually classified as true or false positives, and a spreadsheet was developed.
- `ami search` was run on the corpus of 50 articles, and the HTML DataTables for the `disease` dictionary were created.
- The corpus was sectioned using `ami section`, with reference to https://github.com/petermr/openVirus/wiki/ami:section.
- `getpapers` was used to create a corpus of `950` articles on human viral epidemics (except COVID-19) with the command `getpapers -q "viral epidemics AND human NOT COVID NOT corona virus NOT SARS-Cov-2" -o disease_mp -f ve/log.txt -k 950 -x -p`. `949` XML files and `902` PDF files were created.
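For rerunning or refining the query, the `getpapers` call quoted above can be assembled programmatically so that the query string is quoted safely. The sketch below uses only the flags that appear in that command; the helper name `build_getpapers_command` is hypothetical, and the flag descriptions in the comments reflect the usual getpapers options and should be checked against the installed version.

```python
import shlex


def build_getpapers_command(query, output_dir, log_file, limit):
    """Assemble the getpapers argument list using only the flags quoted above."""
    return [
        "getpapers",
        "-q", query,        # search query
        "-o", output_dir,   # output directory for the corpus
        "-f", log_file,     # log file
        "-k", str(limit),   # maximum number of hits
        "-x",               # download full-text XML
        "-p",               # download full-text PDF
    ]


command = build_getpapers_command(
    query="viral epidemics AND human NOT COVID NOT corona virus NOT SARS-Cov-2",
    output_dir="disease_mp",
    log_file="ve/log.txt",
    limit=950,
)

# Print a shell-safe version of the command for inspection.
print(shlex.join(command))

# To actually run it (requires getpapers to be installed):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell-quoting problems with the spaces inside the query.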