Convert educational videos into equivalent written materials.
Based on the challenge from Andrej Karpathy:
Fun LLM challenge that I'm thinking about: take my 2h13m tokenizer video and translate the video into the format of a book chapter (or a blog post) on tokenization.
Something like:
- Whisper the video
- Chop up into segments of aligned images and text
- Prompt engineer an LLM to translate piece by piece
- Export as a page, with links citing parts of original video
More generally, a workflow like this could be applied to any input video and auto-generate "companion guides" for various tutorials in a more readable, skimmable, searchable format. Feels tractable but non-trivial.
This Wordware prompt takes in the JSON output from running Whisper (I used this one on Replicate) and processes it into sections of a written lesson.
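To make the "chop up into segments" and "links citing parts of original video" steps more concrete, here is a minimal sketch of how the transcript could be grouped before it is handed to a prompt. It assumes the Whisper JSON has a `segments` list whose items carry `start`, `end`, and `text` fields (the layout the open-source Whisper package emits; a hosted model may differ). The function names `group_segments` and `timestamp_link` and the `max_seconds` parameter are hypothetical and are not part of this repo.

```python
from typing import Dict, List


def group_segments(segments: List[Dict], max_seconds: float = 300.0) -> List[Dict]:
    """Merge consecutive Whisper segments into chunks spanning at most max_seconds.

    Assumes each segment is a dict with "start", "end" (seconds) and "text".
    """
    chunks: List[Dict] = []
    current = None
    for seg in segments:
        text = seg["text"].strip()
        # Extend the current chunk while it stays within the time budget.
        if current is not None and seg["end"] - current["start"] <= max_seconds:
            current["end"] = seg["end"]
            current["text"] += " " + text
        else:
            # Otherwise close the current chunk (if any) and start a new one.
            if current is not None:
                chunks.append(current)
            current = {"start": seg["start"], "end": seg["end"], "text": text}
    if current is not None:
        chunks.append(current)
    return chunks


def timestamp_link(video_id: str, start: float) -> str:
    """Build a YouTube URL that starts playback at `start` seconds."""
    return f"https://www.youtube.com/watch?v={video_id}&t={int(start)}s"
```

Each chunk keeps its `start` time, so a generated section can link back to the matching moment in the video. Splitting on a fixed time budget is only one option; splitting on topic changes would likely read better but needs more than timestamps.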
The simple script in scripts/convert.py turns the output into a set of Markdown files that are then served via GitHub Pages.
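For readers who want a feel for this step without opening the script, the following is a rough sketch of writing lesson sections out as numbered Markdown files. It is not the contents of scripts/convert.py: the `write_sections` name, the `title` and `body` keys, and the `01.md`, `02.md` file naming are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Dict, List


def write_sections(sections: List[Dict], out_dir: str) -> List[Path]:
    """Write each section to its own numbered Markdown file.

    Assumes each section is a dict with hypothetical "title" and "body" keys.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    for i, section in enumerate(sections, start=1):
        # Zero-padded names keep the files in lesson order when sorted.
        path = out / f"{i:02d}.md"
        content = f"# {section['title']}\n\n{section['body'].strip()}\n"
        path.write_text(content, encoding="utf-8")
        paths.append(path)
    return paths
```

Pointing `out_dir` at the folder that GitHub Pages publishes from would be enough for the files to be served as static pages.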
Here is the output from running Whisper on the audio track of Andrej's Tokenizer video.
Alternatively, you can run youtube-dl on the video, e.g.:
youtube-dl --extract-audio --audio-format mp3 "https://www.youtube.com/watch?v=<video_id>"
Then run the extracted audio through a transcription model like this one.