The RadRevise dataset will be available on PhysioNet through an open credential process.
The following commands use GPT-4 to generate instructions and modified reports based on specified instruction types and clinical topics. Note that the results will differ from RadRevise, both because GPT-generated responses vary between runs and because RadRevise has undergone an additional human review and annotation process.
cd generation
python generate.py
The code can be used directly to evaluate any text-generation model hosted on Hugging Face.
- Download the RadRevise dataset.
- Navigate to the evaluation directory.
- Run the following command to evaluate a single model:
python eval_model $MODEL_ID [$DATA_PATH] [$BATCH_SIZE] [$OUTPUT_FILE]
  - $MODEL_ID: the Hugging Face model id
  - $DATA_PATH: path to the RadRevise dataset (default: ../data/RadRevise_v0.csv)
  - $BATCH_SIZE: the inference batch size (default: 32)
  - $OUTPUT_FILE: the name of the evaluation output (default: output/result.csv)
- Alternatively, modify and execute the run.sh script to evaluate one or more models.
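
For illustration, evaluating several models amounts to repeating the single-model command with a different model id and output file each time. The Python sketch below is not the repository's run.sh. It only assembles the positional arguments in the order given in the usage line, with the script name copied verbatim from that line. The helper names (`output_file_for`, `build_command`, `evaluate_all`), the per-model file naming scheme, and the model ids are hypothetical. It defaults to a dry run that prints the commands instead of executing them.

```python
import subprocess
from pathlib import Path

# Defaults copied from the parameter list in this README.
DEFAULT_DATA_PATH = "../data/RadRevise_v0.csv"
DEFAULT_BATCH_SIZE = 32


def output_file_for(model_id: str, output_dir: str = "output") -> str:
    """Derive a per-model result path (hypothetical naming scheme)."""
    # Model ids usually contain "/", which would create nested directories.
    safe_name = model_id.replace("/", "__")
    return f"{output_dir}/result_{safe_name}.csv"


def build_command(model_id, data_path=DEFAULT_DATA_PATH,
                  batch_size=DEFAULT_BATCH_SIZE, output_file=None):
    """Assemble the arguments in the positional order of the usage line."""
    if output_file is None:
        output_file = output_file_for(model_id)
    # The arguments are positional, so the earlier optional ones are always
    # passed in order to reach $OUTPUT_FILE.
    return ["python", "eval_model", model_id, data_path,
            str(batch_size), output_file]


def evaluate_all(model_ids, dry_run=True):
    """Print (dry run) or execute one evaluation command per model."""
    commands = [build_command(m) for m in model_ids]
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # Paths are relative, so run this from the evaluation directory.
            Path(cmd[-1]).parent.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Placeholder ids; replace them with real Hugging Face model ids.
    evaluate_all(["example-org/model-a", "example-org/model-b"], dry_run=True)
```

Writing each model's results to its own file keeps one run from overwriting another, which would happen if every run used the default output/result.csv.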
This repository is made publicly available under the MIT License.