@thomasyu888 @gkowalski In the current design of the infrastructure, the Orchestrator gets a page (as in "paginated response") of clinical notes from a Data Node (e.g. 50 clinical notes), sends them to the NLP Tool being evaluated, receives the results, and repeats with the next page of clinical notes. In addition to letting us control the flow of information to the NLP Tool, which limits its memory needs, this approach lets us evaluate and ideally report the following metrics (a rough sketch of this loop follows the list):
Completion rate: number of notes processed / number of notes in the dataset
Time required to process a clinical note (average, std)
the timer starts after the request has been sent (clinical notes sent to the NLP Tool)
the timer stops when all the responses have been received from the NLP Tool for the clinical notes sent
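For illustration, here is a minimal sketch of that loop and how the two metrics could be computed. `fetch_page` and `annotate` are hypothetical placeholders for the Data Node and NLP Tool API clients, not the actual interfaces:

```python
import statistics
import time


def evaluate(fetch_page, annotate, total_notes, page_size=50):
    """Pull notes page by page, send each page to the NLP Tool, track metrics."""
    per_note_seconds = []
    processed = 0
    offset = 0

    while offset < total_notes:
        notes = fetch_page(offset=offset, limit=page_size)
        if not notes:
            break

        start = time.monotonic()            # timer starts when the request is sent
        annotations = annotate(notes)       # blocks until all responses are back
        elapsed = time.monotonic() - start  # timer stops when responses received

        # Individual note timings are not visible because the whole page is sent
        # at once, so each note in the page is credited the page average.
        per_note_seconds.extend([elapsed / len(notes)] * len(notes))
        processed += len(notes)
        offset += page_size

        print(f"completion rate: {processed / total_notes:.1%}")

    return {
        "completion_rate": processed / total_notes,
        "mean_seconds_per_note": statistics.mean(per_note_seconds),
        "std_seconds_per_note": statistics.pstdev(per_note_seconds),
    }
```

The completion rate computed after each page could be pushed to a log/ELK rather than printed, which would support the reporting ideas below.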
The motivation for reporting the completion rate to the user is that it allows them to better predict when the results will be available. The user can also use it to track whether the tool is taking too much time to complete. For the staff maintaining a Data Hosting Site, it would be nice to have a report in ELK that shows the Tools being evaluated and their completion rate.
The motivation for reporting information about the processing time is that a hospital browsing an NLP Sandbox Leaderboard for a tool to use in production may identify that a Tool would take too much time to process their volume of clinical notes. One option could be to extrapolate and show the time required to process 1 million notes. It's important to note that any timing information is very much dependent on the spec of the infrastructure used (number and frequency of CPU cores, etc.), so we should be able to provide information about the spec used when reporting timing information. Note that this spec may vary from one Data Hosting Site to another, in which case we would probably want to report the time for each dataset / Data Hosting Site used to evaluate an NLP Tool.
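As a rough illustration of that extrapolation, assuming we report the mean per-note time measured above (the field names and the spec format here are made up):

```python
def extrapolated_report(mean_seconds_per_note, spec):
    """Estimate the time to process 1 million notes and attach the infra spec."""
    one_million = 1_000_000
    return {
        "estimated_hours_per_million_notes": mean_seconds_per_note * one_million / 3600,
        "infrastructure_spec": spec,  # e.g. CPU model, core count, frequency, RAM
    }


report = extrapolated_report(
    mean_seconds_per_note=0.12,
    spec={"cpu_cores": 4, "cpu_freq_ghz": 2.5, "ram_gb": 16},
)
```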
One important distinction to make is that currently the orchestrator doesn't do any of those things. The workflow that you see in this repository would be in charge of doing those things and the orchestrator is responsible for connecting participant submissions with this workflow.
One of my biggest concerns is that there isn't an "elegant" way with CWL to
Get 50 notes
Process 50 notes
Annotate with metrics for those 50
Repeat step 1 until finished
Currently the workflow would be (roughly sketched after the list):
Get a million notes but split them into chunks of 50
Process the chunks of 50 in parallel and annotate metrics.
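Not CWL, but here is the shape of that approach sketched in Python for illustration; the real workflow would express the scatter in CWL:

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(notes, size=50):
    """Split the full list of notes into pages of `size` notes."""
    for i in range(0, len(notes), size):
        yield notes[i:i + size]


def run_all(notes, annotate):
    chunks = list(chunked(notes))          # step 1: split everything up front
    with ThreadPoolExecutor() as pool:     # step 2: process chunks in parallel
        results = list(pool.map(annotate, chunks))
    # metrics (e.g. time per chunk) would be annotated alongside each result
    return results
```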
I think the above metrics are obtainable; it will just take an example submission to figure out what is and isn't possible.
Step 3 would be out of the loop: we process all the clinical notes and then we evaluate the performance. The loop over steps 1-2 could be implemented in the NLP Sandbox Client as one command.
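A hypothetical sketch of what such a command could look like; the command name and flags are invented here, not the client's actual CLI:

```python
import argparse


def main():
    parser = argparse.ArgumentParser(prog="nlp-sandbox-client annotate-dataset")
    parser.add_argument("--data-node-url", required=True)
    parser.add_argument("--tool-url", required=True)
    parser.add_argument("--page-size", type=int, default=50)
    args = parser.parse_args()

    # Steps 1-2: page through the Data Node and send each page to the NLP Tool
    # (see the evaluation loop sketched earlier). Step 3 (scoring the collected
    # annotations) would run as a separate command once everything is processed.
    ...


if __name__ == "__main__":
    main()
```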