sravanareddy/deciphervoynich


Data files and scripts that we used for the analysis of the Voynich manuscript in Reddy and Knight (2011).

Data

All data files are in the data/ directory.

Voynich text

English

First 28551 words from the WSJ portion of the Penn Treebank, with each line an article. The devoweled version removes the letters a, e, i, o, and u, but keeps y.
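The devoweling step described above can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the paper: the function name `devowel` is hypothetical, and two behaviors are assumptions that the description does not specify, namely that uppercase vowels are removed as well and that tokens consisting only of vowels (such as "a") are dropped rather than left as empty strings.

```python
VOWELS = set("aeiouAEIOU")  # y is deliberately not included


def devowel(line: str) -> str:
    """Strip a, e, i, o, u from every whitespace-separated token in a line.

    Assumptions (not confirmed by the README): case-insensitive removal,
    and tokens that become empty are dropped.
    """
    stripped = (
        "".join(ch for ch in token if ch not in VOWELS)
        for token in line.split()
    )
    return " ".join(token for token in stripped if token)


if __name__ == "__main__":
    print(devowel("the yellow bay"))
```

Because the function works line by line, applying it to each line of the English file would preserve the one-article-per-line layout.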

Arabic

First 19327 words from the Quran in Buckwalter transcription, without vowels. Each line is a verse.

Chinese

First 18791 words from the Sinica treebank that's included with NLTK (sinica.wds). Pinyin conversion (pinyin.wds) was done by looking up characters in the CJK library. Please let us know if you find errors in the conversion.
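Each sample above is described as the first N words of a corpus with its line structure (articles or verses) kept. A rough sketch of such a truncation follows. The helper name `take_first_words` is hypothetical, tokens are assumed to be whitespace-separated, and cutting the final line at the word limit is an assumption; the original scripts may have handled the boundary differently.

```python
from typing import Iterable, List


def take_first_words(lines: Iterable[str], limit: int) -> List[str]:
    """Collect lines until `limit` whitespace-separated tokens are kept.

    Assumption: the last kept line is truncated at the limit rather than
    dropped or kept whole. Blank lines are skipped.
    """
    kept: List[str] = []
    remaining = limit
    for line in lines:
        if remaining <= 0:
            break
        tokens = line.split()
        if not tokens:
            continue
        chunk = tokens[:remaining]  # at most `remaining` tokens from this line
        kept.append(" ".join(chunk))
        remaining -= len(chunk)
    return kept


if __name__ == "__main__":
    sample = ["a b c", "d e", "f"]
    print(take_first_words(sample, 4))
```

With a limit of 28551, 19327, or 18791, the same helper would yield samples of the sizes listed for English, Arabic, and Chinese respectively, provided the input is tokenized the same way as the original data.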

About

Analyzing the Voynich manuscript text
