This code package implements ProtoPMed-EEG from the paper "Improving Clinician Performance in Classification of EEG Patterns on the Ictal-Interictal-Injury Continuum using Interpretable Machine Learning" by Alina Jade Barnett* (Duke University), Zhicheng Guo* (Duke University), Jin Jing* (Harvard University), Wendong Ge (Harvard University), Brandon Westover (Harvard University), and Cynthia Rudin (Duke University) (* denotes equal contribution).
This code package was SOLELY developed by the authors at Duke University and Harvard University.
EEG pattern classification performance of the users with and without AI. All users performed significantly better (p<0.05) when provided with AI assistance.
Three explanation modes offered by our model.
Our model performs significantly better than the baseline in terms of AUROC scores, AUPRC scores, neighborhood analysis by majority vote and neighborhood analysis by annotator vote distributions.
Prerequisites: PyTorch version 1.10.2
Recommended hardware: 1 to 2 NVIDIA Tesla P100 GPUs, or 2 NVIDIA Tesla K80 GPUs, or 1 to 2 NVIDIA RTX 2080 GPUs
Download the model folder (including the model weights, prototype data, and prototype JSON file) here.
Note: the .sh files referenced here have been set up to work with a Slurm batch submission system. If you do not have a Slurm batch submission system, you can run them by typing "source filename.sh" in the command line. You will need to request data access.
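As a minimal sketch of the note above, the following Python helper picks between Slurm submission and the "source filename.sh" fallback. The function name `launch_command` and the `has_slurm` parameter are hypothetical and not part of this package; the sketch assumes that Slurm is present whenever the `sbatch` executable is on the PATH.

```python
import shlex
import shutil


def launch_command(script, has_slurm=None):
    """Build the command used to run one of the .sh files.

    Hypothetical helper, not part of this package. If has_slurm is None,
    Slurm is assumed to be available when `sbatch` is found on the PATH.
    """
    if has_slurm is None:
        has_slurm = shutil.which("sbatch") is not None
    if has_slurm:
        # Submit the script as a Slurm batch job.
        return ["sbatch", script]
    # Fallback from the note above: source the script in a shell.
    return ["bash", "-c", "source " + shlex.quote(script)]


if __name__ == "__main__":
    # Only prints the command; pass it to subprocess.run(...) to execute it.
    print(launch_command("run.sh"))
```

The helper only builds the command, so it can be inspected before anything is submitted or executed.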
Instructions for preparing the data:
- The data processing code is in dataHelper.py.
- Use the function multiprocess_preprocess_signals three times to create the directories that will become the training, test and push directories.
Instructions for running the model on a test set:
- Run run_generate_csv.sh. This runs local_analysis_v2.py and generate_csv.py.
Instructions for finding the nearest prototypes to a test sample:
- See run_generate_csv.sh.
Instructions for training the model:
- In settings.py, provide the appropriate strings for data_path, train_dir, test_dir, train_push_dir:
  - data_path is where the datasets reside
  - train_dir is the directory containing the training set
  - test_dir is the directory containing the test set
  - train_push_dir is the directory of EEG samples that are eligible to become prototypes
- Run run.sh
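The four settings described above might be filled in as in the following sketch. All paths are placeholders. The sketch assumes that the three directories sit under data_path, which the package may or may not require, and `missing_dirs` is a hypothetical helper for checking the paths before training; it is not part of this package.

```python
import os

# Placeholder locations: replace these with the paths on your system.
data_path = "/path/to/eeg_data/"                      # where the datasets reside
train_dir = os.path.join(data_path, "train/")         # training set
test_dir = os.path.join(data_path, "test/")           # test set
train_push_dir = os.path.join(data_path, "push/")     # samples eligible to become prototypes


def missing_dirs(named_paths):
    """Return the names of settings whose directory does not exist (hypothetical helper)."""
    return [name for name, path in named_paths.items() if not os.path.isdir(path)]


if __name__ == "__main__":
    settings = {
        "data_path": data_path,
        "train_dir": train_dir,
        "test_dir": test_dir,
        "train_push_dir": train_push_dir,
    }
    # Lists any setting that still points at a directory that does not exist.
    print("Missing directories:", missing_dirs(settings))
```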
Instructions for finding the nearest samples to each prototype:
- Run run_global_analysis_annoy.sh
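To illustrate what finding the nearest samples to each prototype means, here is a toy sketch that uses exact brute-force search on made-up 2-D vectors. It is not the code run by run_global_analysis_annoy.sh: the script name suggests an approximate nearest-neighbor index, whereas this sketch swaps in an exhaustive search for clarity. The Euclidean distance, the function name `nearest_samples`, and the example vectors are all assumptions.

```python
import math


def nearest_samples(prototype, samples, k=3):
    """Return the indices of the k samples closest to the prototype.

    Uses Euclidean distance, which is an assumption made for illustration.
    """
    order = sorted(range(len(samples)), key=lambda i: math.dist(prototype, samples[i]))
    return order[:k]


# Made-up feature vectors standing in for sample and prototype embeddings.
samples = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [5.0, 5.0]]
prototypes = [[0.0, 0.1], [4.0, 4.0]]

# Map each prototype index to the indices of its two nearest samples.
neighbors = {p: nearest_samples(proto, samples, k=2) for p, proto in enumerate(prototypes)}
print(neighbors)  # {0: [0, 2], 1: [3, 1]}
```

Exhaustive search costs one distance computation per prototype-sample pair, which is why an approximate index becomes attractive for large datasets.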