sysrev-seed-collection

This repository is a collection of systematic review topics with seed studies. It is organised into two sections:

collection_data: contains the organised data, including basic data, search data, and snowballing data.

experiments: contains the results and data usage for three experiments: seed-driven query formulation, seed-driven document ranking, and seed-driven snowballing document ranking.

For folder collection_data:

overall_collection.jsonl: each line describes one topic, and every topic has the following attributes:

Attributes	Description
id	Corresponds to ID in paper
link_to_review	Corresponds to Link to Review in paper
title	Corresponds to Title in paper
search_name	Corresponds to Description in paper
Date_from	Corresponds to Date restriction in paper
Date_to	Corresponds to Date restriction in paper
query	Corresponds to PubMed query in paper
seed_studies	Corresponds to Seed studies in paper
included_studies	Corresponds to Included studies in paper
edited_search	The edited search queries (in a standard format)
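
As a rough illustration of how this file might be read, the sketch below loads a JSON Lines file into a dictionary keyed by topic id. The function name `load_topics` is hypothetical and not part of this repository; the only assumptions taken from the description above are that each line is one JSON object and that it has an `id` attribute.

```python
import json
from pathlib import Path


def load_topics(path):
    """Read a JSON Lines file and return a dict keyed by topic id.

    Hypothetical helper: assumes one JSON object per line, each with
    an "id" attribute as listed in the attribute table.
    """
    topics = {}
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:  # tolerate blank lines
                continue
            topic = json.loads(line)
            # Normalise ids to strings so lookups work whether the file
            # stores them as numbers or as text.
            topics[str(topic["id"])] = topic
    return topics
```

Calling `load_topics("collection_data/overall_collection.jsonl")` from the repository root should then give access to attributes such as `title` or `seed_studies` for each topic, assuming the file sits at that path.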

search: a folder containing candidate_documents.res, a TREC-format file of the candidate documents retrieved by the systematic review Boolean queries. This file corresponds to Retrieved studies in paper.
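
A minimal sketch of reading such a file follows. It assumes the usual TREC run layout of whitespace-separated columns (topic id, `Q0`, document id, rank, score, run tag); the exact columns in candidate_documents.res are not confirmed here, and `load_trec_run` is a hypothetical name.

```python
from collections import defaultdict


def load_trec_run(path):
    """Map each topic id to its list of document ids, in file order.

    Assumes the common TREC run layout:
        topic_id Q0 doc_id rank score run_tag
    Only the first three columns are used.
    """
    run = defaultdict(list)
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            fields = line.split()
            if len(fields) < 3:  # skip blank or malformed lines
                continue
            topic_id, _q0, doc_id = fields[:3]
            run[topic_id].append(doc_id)
    return dict(run)
```

Keeping the documents as an ordered list per topic preserves whatever ordering the file has, which matters if the rank column is meaningful.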

snowballing: a folder containing seed_snowballing_document.tsv and screened_snowballing_documeent.tsv, which hold the snowballing candidate documents. Each line corresponds to one topic: the topic and its document list are separated by \t, and the document ids are separated by '|'. These two files correspond to Snowballed studies in paper.
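
The line format described above can be parsed along these lines. This is a sketch under the stated format only (topic, a tab, then '|'-separated document ids); `load_snowballing` is a hypothetical helper, not a function shipped with the repository.

```python
def load_snowballing(path):
    """Map each topic id to its list of snowballed document ids.

    Expects one topic per line: the topic id, a tab character, then
    the document ids joined by '|'.
    """
    docs_by_topic = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\r\n")
            if not line:  # skip blank lines
                continue
            topic_id, _tab, doc_field = line.partition("\t")
            # Drop empty strings so a topic with no documents maps to [].
            docs_by_topic[topic_id] = [d for d in doc_field.split("|") if d]
    return docs_by_topic
```

The same function should work for both TSV files, since they are described as sharing one format.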

For folder experiments:

This folder contains the three experiments from the paper:

query_formulation: Automatic query formulation experiment.
sdr_document_ranking: Seed-driven document ranking.
sdr_snowballing: Seed-driven snowballing document ranking.

Instructions on how to run these experiments are inside each experiment folder.

To run the sample data extraction, use:

python3 sample_data_processing.py

This is a sample data extraction script: given a topic id as input, it outputs all the information for that topic.

For more details, please refer to the paper "From Little Things Big Things Grow: A Collection with Seed Studies for Medical Systematic Review Literature Search".
