Start collecting start and end timestamps of transcript segments #6

EricTendian · 2023-06-03T04:57:36Z

Whisper provides a start and end time for each transcript segment so that subtitles can be timed accurately. This data can also be used, together with corrected transcripts, to fine-tune a Whisper model.

To make this possible, we first need to start collecting start and end timestamps for all transcript segments and ensure the raw transcript data stores these segments. This involves changing how we use the result dicts we get back from Whisper and updating the transcript data structure in various places.
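
As a rough sketch of the first step, the snippet below pulls `start`, `end`, and `text` out of the `segments` list in a Whisper result dict. The names `TranscriptSegment`, `extract_segments`, and `example_result` are hypothetical and only for illustration; the project's actual transcript data structure may need a different shape. The example dict is a hand-written stand-in for what `model.transcribe()` returns, so the snippet runs without loading a model.

```python
from typing import Any, TypedDict


class TranscriptSegment(TypedDict):
    """Hypothetical shape for a stored segment; not the project's real schema."""

    start: float  # seconds from the beginning of the audio
    end: float    # seconds from the beginning of the audio
    text: str


def extract_segments(result: dict[str, Any]) -> list[TranscriptSegment]:
    """Keep only the timing and text of each segment in a Whisper result dict."""
    segments: list[TranscriptSegment] = []
    for seg in result.get("segments", []):
        segments.append(
            {
                "start": float(seg["start"]),
                "end": float(seg["end"]),
                "text": seg["text"].strip(),
            }
        )
    return segments


# Hand-written stand-in for the dict returned by Whisper's transcribe();
# real results carry additional per-segment keys that are ignored here.
example_result = {
    "text": " Engine 5 responding. Copy that.",
    "language": "en",
    "segments": [
        {"id": 0, "start": 0.0, "end": 2.5, "text": " Engine 5 responding."},
        {"id": 1, "start": 2.5, "end": 4.0, "text": " Copy that."},
    ],
}

if __name__ == "__main__":
    for segment in extract_segments(example_result):
        print(segment)
```

Storing the segments as plain dicts keeps them JSON-serializable, which should make it easier to carry them through the existing raw transcript data, though that is an assumption about how the data is stored.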

EricTendian added the enhancement label Jun 3, 2023
