A Chrome extension for learners of the Chinese language. Provides tone color coding, a pop-up dictionary, word definitions, example sentences, useful links, and AI analysis on any sentence on any web page. Also integrated with Anki to support one-click flashcard creation for new words, sentences, or pronunciations.
demo.mov
- Color codes characters by tone, with tokenization via the web platform's `Intl` library. Available on-demand on any site.
- Provides a popup dictionary for easy lookups of unknown words, using CEDICT and the popover API.
- To provide more information about a word or sentence, provides a side panel with:
  - More details on definitions
  - Example sentences from Tatoeba
  - Useful links to Forvo, Youglish, HanziGraph, and more.
- Optionally allows for sentence and word analysis with OpenAI API integration (bring-your-own-key).
- AnkiConnect integration: just one click to add a flash card for definitions, example sentences, or audio cards to improve listening comprehension.
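
As a rough illustration of the tone coloring described above, the sketch below segments text with `Intl.Segmenter` and maps CEDICT-style numbered pinyin (for example `ni3 hao3`) to colors. The palette and the helper names (`TONE_COLORS`, `toneOf`, `colorize`, `segmentWords`) are assumptions made for this example and are not taken from the extension's source.

```typescript
// Hypothetical tone palette; the extension's real colors may differ.
const TONE_COLORS: Record<number, string> = {
  1: "red",
  2: "orange",
  3: "green",
  4: "blue",
  5: "gray", // neutral tone
};

// Extract the tone digit from one CEDICT-style syllable such as "hao3".
// Syllables without a trailing digit are treated as neutral (5).
function toneOf(syllable: string): number {
  const match = /([1-5])$/.exec(syllable);
  return match ? Number(match[1]) : 5;
}

// Pair each character of a word with the color for its syllable's tone.
function colorize(word: string, pinyin: string): Array<{ char: string; color: string }> {
  const syllables = pinyin.trim().split(/\s+/);
  return Array.from(word).map((char, i) => ({
    char,
    color: TONE_COLORS[toneOf(syllables[i] ?? "")] ?? "gray",
  }));
}

// Split text into words with Intl.Segmenter. The cast to `any` only guards
// against TypeScript configurations whose lib lacks the Segmenter typings.
function segmentWords(text: string): string[] {
  const segmenter = new (Intl as any).Segmenter("zh", { granularity: "word" });
  return Array.from(segmenter.segment(text), (s: any) => s.segment);
}
```

In a flow like this, each word returned by `segmentWords` would be looked up in the dictionary data to obtain its pinyin, and the result of `colorize` would then drive the styling of each character.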
AnkiConnect integration is supported. Setup is described on the AnkiConnect site.
Only the default settings are supported for now (no API key requirement, their standard URL, etc.).
The user's list of decks is read, and words, sentences, and pronunciations can be added to any of them.
All cards created by the extension are tagged for easy management. To query them, simply run `tag:ChineseLearningExtension` in Anki's card browser.
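
For readers curious what talking to AnkiConnect can look like, here is a minimal sketch under the default settings (no API key, the standard local URL). It uses AnkiConnect's `deckNames`, `addNote`, and `findNotes` actions. The `Basic` note type with `Front` and `Back` fields and the helper names are assumptions for illustration; the extension's actual card format may differ.

```typescript
// AnkiConnect's default local endpoint and API version.
const ANKI_CONNECT_URL = "http://127.0.0.1:8765";
const ANKI_CONNECT_VERSION = 6;
const EXTENSION_TAG = "ChineseLearningExtension";

interface AnkiRequest {
  action: string;
  version: number;
  params?: Record<string, unknown>;
}

// Request the user's list of decks.
function deckNamesRequest(): AnkiRequest {
  return { action: "deckNames", version: ANKI_CONNECT_VERSION };
}

// Request a new tagged note. "Basic" with Front/Back fields is an assumed note type.
function addNoteRequest(deckName: string, front: string, back: string): AnkiRequest {
  return {
    action: "addNote",
    version: ANKI_CONNECT_VERSION,
    params: {
      note: {
        deckName,
        modelName: "Basic",
        fields: { Front: front, Back: back },
        tags: [EXTENSION_TAG],
      },
    },
  };
}

// Request the IDs of all notes carrying the extension's tag.
function findTaggedNotesRequest(): AnkiRequest {
  return {
    action: "findNotes",
    version: ANKI_CONNECT_VERSION,
    params: { query: `tag:${EXTENSION_TAG}` },
  };
}

// Send a request and unwrap AnkiConnect's { result, error } envelope.
async function invoke<T>(request: AnkiRequest): Promise<T> {
  const response = await fetch(ANKI_CONNECT_URL, {
    method: "POST",
    body: JSON.stringify(request),
  });
  const { result, error } = (await response.json()) as { result: T; error: string | null };
  if (error) throw new Error(error);
  return result;
}
```

With Anki running, `await invoke<string[]>(deckNamesRequest())` would return the deck names, and `await invoke<number>(addNoteRequest("Chinese", "你好", "hello"))` would create one tagged card; the deck name here is only a placeholder.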
Better card formatting, and more sophistication with other media types and AI output, is a potential future work item.
The extension can optionally integrate with OpenAI's API to get word analysis, sentence analysis, and pronunciation.
Although I prefer human-generated content whenever possible, AI provides more flexibility, and its grammar and meaning analysis is often helpful.
Structured outputs are used to ensure nice formatting and coverage of certain topics, though the prompts will continue to be refined. `gpt-4o` is used for text output, and `tts-1` with the `nova` voice for TTS.
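
To make the structured-output setup more concrete, the sketch below builds request bodies for OpenAI's chat completions and speech endpoints using the models named above. The schema fields (`translation`, `grammarNotes`, `words`), the system prompt, and the function names are hypothetical; the extension's real prompts and schema are not shown here.

```typescript
// Body for POST https://api.openai.com/v1/chat/completions.
// The schema below is an illustrative guess, not the extension's actual schema.
function buildAnalysisRequest(sentence: string) {
  return {
    model: "gpt-4o",
    messages: [
      { role: "system", content: "Explain this Chinese sentence for a language learner." },
      { role: "user", content: sentence },
    ],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "sentence_analysis",
        strict: true,
        schema: {
          type: "object",
          properties: {
            translation: { type: "string" },
            grammarNotes: { type: "string" },
            words: {
              type: "array",
              items: {
                type: "object",
                properties: {
                  word: { type: "string" },
                  pinyin: { type: "string" },
                  meaning: { type: "string" },
                },
                required: ["word", "pinyin", "meaning"],
                additionalProperties: false,
              },
            },
          },
          required: ["translation", "grammarNotes", "words"],
          additionalProperties: false,
        },
      },
    },
  };
}

// Body for POST https://api.openai.com/v1/audio/speech.
function buildSpeechRequest(text: string) {
  return { model: "tts-1", voice: "nova", input: text };
}

// Send the analysis request with the user's own key and parse the structured reply.
async function analyzeSentence(apiKey: string, sentence: string): Promise<unknown> {
  const response = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify(buildAnalysisRequest(sentence)),
  });
  const data = (await response.json()) as any;
  // With structured outputs, the message content is a JSON string matching the schema.
  return JSON.parse(data.choices[0].message.content);
}
```

Keeping the body builders separate from the network call makes it easier to adjust the schema as the prompts are refined.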
A prior version of the code had Forvo API integration, again with a bring-your-own-key model, such that human pronunciation could be optionally retrieved and saved in flashcards. Unfortunately, Forvo's API no longer returns CORS headers, so the extension can no longer call it directly.
A future work item could be to query a local proxy instead of their API directly. For now, AI voices have replaced it.
Can be used to aid learning on sites like baike.baidu.com:
AI can provide word-by-word sentence breakdowns, discuss grammar, give usage clues, and more.
extension-ai-demo.mov
Taken from a recent Reddit post; tone colors help:
One click to get audio into Anki (same with definitions and example sentences):
For those who prefer light mode:
Sentence and definition data was pulled from:
- CEDICT, which releases data under CC BY-SA 4.0. Because of the ShareAlike requirement, the definition files should be considered released under that license as well.
- Tatoeba, which releases data under CC BY 2.0 FR.
- Loading spinners: https://loading.io/css/