Scribe - Beautiful YouTube Transcripts with Gemini 1.5 Flash 8B LLM

This repo shows how to fetch the raw YouTube transcripts and use the Gemini Flash 8B API to format them. Made by @ldenoue

How to use

Try the demo at https://ldenoue.github.io/readabletranscripts and type any search term

To try it locally, run python3 -m http.server in the repository folder and open http://localhost:8000

Set your Gemini API key (create a free Gemini API key). Note: the key is used to fetch Gemini's answers from your browser. It is never uploaded anywhere. See readabletranscripts/code.js, line 159 in f5ea580:

    const genAI = new GoogleGenerativeAI(API_KEY);

Click on the video, e.g. https://ldenoue.github.io/readabletranscripts/?id=8yzmCt0QwOQ

You will see a summary of the video and the transcript below.
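As a small illustration of the URL format above, the video is identified by the id query parameter. The sketch below reads it with the standard URL API; the helper name getVideoId is hypothetical and is not taken from code.js.

```javascript
// Hypothetical helper (not from code.js): read the YouTube video id
// from a page URL of the form .../readabletranscripts/?id=VIDEO_ID.
function getVideoId(pageUrl) {
  // searchParams.get returns null when the parameter is absent.
  return new URL(pageUrl).searchParams.get('id');
}
```

In the browser, this could be called with window.location.href.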

How it is done

Getting vocabulary words from the title and video description

We first ask Gemini to extract important words by giving it the video title and its description. See readabletranscripts/code.js, line 837 in 8466ec2:

    async function createVocabulary(videoId, description = '', languageCode = 'en') {

This context is essential to improve the accuracy of the transcripts. Titles and descriptions often contain human-edited text that includes proper names, acronyms, etc.
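To make the idea concrete, here is a minimal sketch of how such a vocabulary request could be prepared and its answer cleaned up. The prompt wording, the comma-separated answer format, and the helper names buildVocabularyPrompt and parseVocabulary are all assumptions made for illustration; the actual prompt lives in createVocabulary in code.js.

```javascript
// Hypothetical helper: the prompt wording is illustrative only,
// it is not the prompt used by createVocabulary in code.js.
function buildVocabularyPrompt(title, description = '', languageCode = 'en') {
  const context = [`Title: ${title.trim()}`];
  if (description.trim()) context.push(`Description: ${description.trim()}`);
  return [
    `Here are the title and description of a video (language: ${languageCode}).`,
    'List the proper names, acronyms and technical terms they contain, separated by commas.',
    '',
    ...context,
  ].join('\n');
}

// Assumes the model answers with a comma-separated list.
// Returns the words trimmed, without empty entries or case-insensitive duplicates.
function parseVocabulary(answer) {
  const seen = new Set();
  const words = [];
  for (const raw of answer.split(',')) {
    const word = raw.trim();
    const key = word.toLowerCase();
    if (word && !seen.has(key)) {
      seen.add(key);
      words.push(word);
    }
  }
  return words;
}
```

The resulting word list can then be passed along with every later request so that names and acronyms are spelled consistently.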

Breaking the transcript into chunks

We break up the raw YouTube transcript into chunks of 512 words. We feed each chunk to Gemini with a prompt (see readabletranscripts/code.js, line 410 in 5c51f52):

    async function punctuateText(c, vocab = '', lang = 'en', p = null) {

Notice that we send the requests to Gemini in parallel.
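The chunking and the parallel requests can be sketched as follows. This is a simplified illustration, not the code from code.js: chunkWords and punctuateAll are hypothetical names, and punctuate stands for any async function that formats one chunk (for example a wrapper around punctuateText).

```javascript
// Split a transcript into chunks of at most `size` words (512 by default).
function chunkWords(text, size = 512) {
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  for (let i = 0; i < words.length; i += size) {
    chunks.push(words.slice(i, i + size).join(' '));
  }
  return chunks;
}

// Start one request per chunk without waiting for the previous one.
// Promise.all resolves with the results in the same order as `chunks`,
// and rejects as soon as any single request fails.
async function punctuateAll(chunks, punctuate) {
  return Promise.all(chunks.map((chunk) => punctuate(chunk)));
}
```

Because Promise.all keeps the input order, the formatted chunks can be stitched back together without tracking indices. If a failed request should not abort the whole batch, Promise.allSettled is a possible alternative.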

Merging sentences at the boundary between 2 chunks

Once we have the formatted chunks, we need to merge them. For each pair of consecutive chunks, chunk1 and chunk2, we ask Gemini to merge the last sentence of chunk1 with the first sentence of chunk2 (see the prompt in readabletranscripts/code.js, line 433 in 5c51f52):

    async function mergeSentences(a, b, vocab, languageCode = 'en') {

We then join the chunks and the merged seams between them to obtain the final transcript.
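The stitching step might look like the sketch below. It is written under assumptions: sentences are split naively on ., ! or ? followed by whitespace, the seams are processed one after another for simplicity, and mergeSeam is a caller-supplied async function standing in for the Gemini call (for example a wrapper around mergeSentences). The names splitSentences and stitchChunks are hypothetical.

```javascript
// Naive sentence splitter: break after ., ! or ? when followed by whitespace.
function splitSentences(text) {
  return text.split(/(?<=[.!?])\s+/).filter(Boolean);
}

// Join formatted chunks, replacing the two sentences around each chunk
// boundary with the single merged sentence returned by `mergeSeam`.
async function stitchChunks(chunks, mergeSeam) {
  const parts = chunks.map(splitSentences).filter((s) => s.length > 0);
  if (parts.length === 0) return '';
  const result = [...parts[0]];
  for (const next of parts.slice(1)) {
    const last = result.pop();       // last sentence so far (end of chunk1)
    const [first, ...rest] = next;   // first sentence of chunk2, then the others
    result.push(await mergeSeam(last, first), ...rest);
  }
  return result.join(' ');
}
```

Popping the last sentence before each merge means a chunk that contains a single sentence is still handled: the merged seam simply becomes the sentence that is merged with the next chunk.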

Linking the raw YouTube timestamps with the final transcript

To highlight the words as the video plays, we need to align the words of the raw YouTube transcript with those of the final, punctuated transcript.

We rely on diff.js for that.

Now words get highlighted as the video plays, and users can also jump into the video by clicking any word in the transcript.
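To illustrate the alignment idea without depending on a diff library, the sketch below uses a plain longest-common-subsequence table as a stand-in for diff.js. It is not the project's implementation. The input shape (raw words as { text, start } objects, final words as strings) and the name alignTimestamps are assumptions. Final words with no raw counterpart inherit the timestamp of the previous matched word.

```javascript
// Compare words case-insensitively and ignore punctuation.
const normalize = (w) => w.toLowerCase().replace(/[^\p{L}\p{N}]/gu, '');

// rawWords: [{ text, start }] from the raw transcript (assumed shape).
// finalWords: array of strings from the punctuated transcript.
// Returns [{ text, start }] for every final word.
function alignTimestamps(rawWords, finalWords) {
  const a = rawWords.map((w) => normalize(w.text));
  const b = finalWords.map(normalize);
  const n = a.length;
  const m = b.length;
  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..].
  const lcs = Array.from({ length: n + 1 }, () => new Array(m + 1).fill(0));
  for (let i = n - 1; i >= 0; i--) {
    for (let j = m - 1; j >= 0; j--) {
      lcs[i][j] = a[i] === b[j]
        ? lcs[i + 1][j + 1] + 1
        : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  const result = [];
  let i = 0;
  let j = 0;
  let lastStart = n > 0 ? rawWords[0].start : 0;
  while (j < m) {
    if (i < n && a[i] === b[j]) {
      lastStart = rawWords[i].start;          // matched: copy the timestamp
      result.push({ text: finalWords[j], start: lastStart });
      i++;
      j++;
    } else if (i < n && lcs[i + 1][j] >= lcs[i][j + 1]) {
      i++;                                    // raw word dropped by the model
    } else {
      result.push({ text: finalWords[j], start: lastStart });
      j++;                                    // word added or rewritten by the model
    }
  }
  return result;
}
```

The table costs time and memory proportional to the product of the two word counts, which is one reason a dedicated diff library is the better choice for long transcripts.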

Help me improve this tool

Thanks for your contribution.
