Transcription structure #74
Comments
Agreed. I'll go ahead and confirm that we've landed on those 4 speaker roles. |
Great. I can make the spreadsheet pretty quickly, I think. |
@ayyubibrahimi does this make sense to you? https://docs.google.com/spreadsheets/d/1bJRjfZtKvzgdLgjDg4QN8OPVQjmjIUQj2PY8fK6ia8o/edit?usp=sharing |
Makes sense in the long term. I think that we need to ensure we can reliably map the text that Caitlin transcribes to the data that has timestamps before finalizing a schema. |
Ya, I'm not clear how you are thinking about doing that. |
Simple in theory. Planning to begin experimenting soon. Brief overview: Example of a chunk of data that contains timestamps:
Example of how I think the transcribed text should be formatted:
Because we're currently chunking data on a roughly 5 second interval, the number of tokens within a chunk should be relatively consistent. If we chunk the transcribed text similarly, we should be able to perform a simple string similarity search to match the transcribed text with the timestamps. |
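For illustration, a minimal sketch of that kind of string similarity search. The thread doesn't name a similarity method, so this assumes Python's standard-library `difflib.SequenceMatcher` as a stand-in. The helper names (`normalize`, `best_match`), the `start`/`end`/`text` keys, and the sample sentences are all hypothetical, not the project's actual data format.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences do not affect the score."""
    return " ".join(text.lower().split())


def best_match(transcribed: str, timestamped_chunks: list[dict]) -> tuple[int, float]:
    """Return (index, score) of the timestamped chunk most similar to the transcribed text."""
    target = normalize(transcribed)
    scores = [
        SequenceMatcher(None, target, normalize(chunk["text"])).ratio()
        for chunk in timestamped_chunks
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical timestamped chunks (roughly 5 seconds each); not real project data.
chunks = [
    {"start": 0.0, "end": 5.0, "text": "good evening and welcome to the city council meeting"},
    {"start": 5.0, "end": 10.0, "text": "the first item on the agenda is the budget"},
    {"start": 10.0, "end": 15.0, "text": "we will now hear public comment on the item"},
]

# Hypothetical hand-transcribed chunk with slightly different casing and punctuation.
index, score = best_match("The first item on the agenda is the budget.", chunks)
matched_start = chunks[index]["start"]
matched_end = chunks[index]["end"]
```

Here the transcribed chunk picks up the `start` and `end` of whichever timestamped chunk scores highest. A real pipeline would probably also want a minimum score threshold so that poor matches get flagged rather than silently accepted.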
I'm not convinced. Wouldn't Caitlin's transcription need to be almost perfectly lined up with the YouTube one for this to work? I imagine the two will start to drift pretty fast. And if Caitlin needs to track 5 sec increments, why not just have her track her own timestamps? Maybe every 5 sec is easier than every sound bite. |
I don't think drifting is an issue if the string similarity match has pointers to the preceding and following chunks. Alternatively, she can transcribe in increments of 60 seconds, for the sake of efficiency, and for the purposes of matching the strings, we can chunk the timestamp data on a 60 second interval. These chunks can always be preprocessed further before they're read into the model. |
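As a rough sketch of the 60 second alternative: small timestamped chunks can be grouped into larger windows before matching. The function name `merge_chunks`, the `bucket` field, and the `start`/`end`/`text` keys are assumptions for illustration, and the sample data is synthetic.

```python
def merge_chunks(chunks: list[dict], window: float = 60.0) -> list[dict]:
    """Group consecutive timestamped chunks into windows of `window` seconds.

    A chunk is assigned to a window by its start time, so a chunk that
    straddles a boundary stays whole in the earlier window.
    """
    merged: list[dict] = []
    for chunk in chunks:
        bucket = int(chunk["start"] // window)
        if merged and merged[-1]["bucket"] == bucket:
            # Same window as the previous chunk: extend its end time and text.
            merged[-1]["end"] = chunk["end"]
            merged[-1]["text"] += " " + chunk["text"]
        else:
            merged.append(
                {
                    "bucket": bucket,
                    "start": chunk["start"],
                    "end": chunk["end"],
                    "text": chunk["text"],
                }
            )
    return merged


# Synthetic example: 24 five-second chunks covering two minutes.
small_chunks = [
    {"start": 5.0 * i, "end": 5.0 * (i + 1), "text": f"words{i}"} for i in range(24)
]
minute_chunks = merge_chunks(small_chunks, window=60.0)
```

Each merged entry keeps the earliest `start` and latest `end` of its window, so the result can still be split back down or preprocessed further before it is read into the model.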
Proposal for how to structure transcriptions:
To transcribe a city council video from YouTube, the transcriber should:
@ayyubibrahimi what tool do you think transcription should be done through? Google Sheets would be the easiest.