Voynich Manuscript — Deciphered

Computational structural analysis of the Voynich manuscript (Beinecke MS 408) revealing systematic grammar, case morphology, and consistent clause structure across all five manuscript sections.

Doctor's Personal Alphabet

The profile points to a Dravidian-speaking Siddha medical practitioner who designed a personal script to encode their pharmacopeia. The positional rules are too systematic for a naturally evolved writing system — this script was engineered, not inherited. Early folios show rougher, less consistent glyph forms; later sections are fluid and assured. The scribe was learning their own writing system as they wrote.

No second copy exists. No parallel text. No Rosetta Stone. This was a private notebook — never meant for anyone else to read.

Reasonable Confusion

The Voynich manuscript's statistics simultaneously mimic several different systems, and each metric points in a different direction:

High Index of Coincidence (0.075) made it look like a simple substitution cipher of Latin or Italian — but the bigram coverage was far too low (7%) for any European alphabet.
Low bigram coverage suggested a large syllabary (~140 symbols) — but the IC was 5x too high for a syllabary, which would spread frequency across all glyphs.
Narrow word-length distribution (CV = 0.386) matched Latin syllable statistics perfectly — but the positional dominance (0.839) was higher than any natural language alphabet.
Extreme positional constraints (some characters 99%+ word-initial or word-final) looked like a constructed system — but the paradigm fill rate (15%) and suffix Zipf exponent (0.893) were exactly in natural language range.
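
The first two metrics in the list above can be computed directly from a tokenized transcription. The sketch below is a minimal illustration, not the code in translate_voynich.py: the function names are hypothetical, and the definition of bigram coverage (distinct within-word glyph pairs observed, divided by the square of the alphabet size) is an assumption, since this README does not define the term.

```python
from collections import Counter


def index_of_coincidence(symbols):
    """Chance that two symbols drawn without replacement are the same."""
    counts = Counter(symbols)
    n = sum(counts.values())
    if n < 2:
        raise ValueError("need at least two symbols")
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def bigram_coverage(words):
    """Observed distinct within-word bigrams divided by alphabet_size ** 2.

    Assumption: this is what "bigram coverage" means; the README does not
    define the term.
    """
    alphabet = {ch for w in words for ch in w}
    if not alphabet:
        raise ValueError("no glyphs in input")
    seen = {(a, b) for w in words for a, b in zip(w, w[1:])}
    return len(seen) / len(alphabet) ** 2


# Toy input, invented for illustration; not taken from any transcription.
toy_words = ["ab", "ba", "abab"]
print(index_of_coincidence("".join(toy_words)))
print(bigram_coverage(toy_words))
```

Both functions treat every character of a word as one glyph, so multi-character transcription codes would need to be tokenized first.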

The resolution: it is an abugida, in which a small set of consonant bases (~14) combines with vowel modifiers to produce ~50 surface glyphs. This yields a small effective alphabet (high IC) with many surface forms (low bigram coverage) and strong positional rules (onset, nucleus, and coda occupy distinct character slots).
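
The positional rules can be inspected by tabulating, for each glyph, how often it occurs word-initially, medially, and word-finally. This is a minimal sketch with a hypothetical function name. It does not reproduce the positional dominance score quoted earlier, whose exact definition is not given in this README.

```python
from collections import Counter, defaultdict


def positional_profile(words):
    """Per glyph, the share of occurrences that are initial, medial, or final.

    A one-glyph word is counted as "initial".
    """
    counts = defaultdict(Counter)
    for w in words:
        last = len(w) - 1
        for i, ch in enumerate(w):
            if i == 0:
                slot = "initial"
            elif i == last:
                slot = "final"
            else:
                slot = "medial"
            counts[ch][slot] += 1
    profile = {}
    for ch, c in counts.items():
        total = sum(c.values())
        profile[ch] = {s: c[s] / total for s in ("initial", "medial", "final")}
    return profile


# Toy input, invented for illustration only.
for glyph, shares in sorted(positional_profile(["abc", "ab"]).items()):
    print(glyph, shares)
```

A glyph whose share in one slot is close to 1.0 is positionally locked in the sense described above.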

The language itself is agglutinative with SOV word order, which produces the regular word lengths and clause-final verb patterns. The two noun classifiers (h- organic, k- material) act as scribal semantic markers, not grammatical gender — 64% of nouns carry neither.

Telugu positional abugida encoding is the closest statistical match across the six metrics compared (composite distance 0.066).

Telugu nearly matches the Voynich on all four key metrics. English and Latin syllabary encodings fail on IC.

Dravidian languages match 94% of the Voynich's typological features — more than any other language family tested.

No single language + simple syllabary reproduces all four metrics. The IC gap (bottom-right panel of the figure) is what rules out a pure syllabary and points to an abugida with a small base alphabet.
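
This README does not state how the composite distance is computed. One plausible way to fold several metrics into a single number is the mean absolute relative difference, sketched below. The formula and function name are assumptions, and the candidate profile is invented for illustration. Only the Voynich values come from the text above.

```python
def composite_distance(target, candidate):
    """Mean absolute relative difference over the metrics both profiles share.

    Assumption: this is NOT necessarily the formula behind the 0.066 figure.
    Target values must be non-zero.
    """
    keys = sorted(set(target) & set(candidate))
    if not keys:
        raise ValueError("no shared metrics")
    return sum(abs(candidate[k] - target[k]) / abs(target[k]) for k in keys) / len(keys)


# Voynich values as quoted in this README.
voynich = {
    "ic": 0.075,
    "bigram_coverage": 0.07,
    "word_length_cv": 0.386,
    "positional_dominance": 0.839,
}

# HYPOTHETICAL candidate encoding; these numbers are made up.
candidate = {
    "ic": 0.070,
    "bigram_coverage": 0.08,
    "word_length_cv": 0.40,
    "positional_dominance": 0.80,
}

print(composite_distance(voynich, candidate))
```

Relative differences keep a metric with a large absolute scale from dominating the total. A weighted or z-scored variant would be an equally defensible choice.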

What We Found (Verifiable)

The EVA transcription encodes a consistent agglutinative grammar:

6 case suffixes with distinct verb selectional preferences (-an: 38% before verb 1H; -am: 33% before verb 1cH; -ae: 34% before verb 1K)
SOV word order — 76.5% of clause-final words end in suffix -9
Two noun classifiers (h- and k- prefixes) — NOT grammatical gender: 64% of nouns are unmarked, 54 roots appear in both classes, verbs don't agree
Participle chaining for sequential procedures (-c89 = "having done")
Definite article 4o- (proclitic, 97% character binding)
Clause-final demonstratives (sam, san, sae) with case agreement
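
Two of the counts in this list, the clause-final suffix rate and the word that follows each case suffix, are simple tallies once the text is split into clauses. The sketch below assumes clauses are already available as lists of word strings. How clause boundaries are determined is not described here. The function names and toy words are hypothetical.

```python
from collections import Counter


def clause_final_suffix_rate(clauses, suffix):
    """Share of non-empty clauses whose last word ends with `suffix`."""
    finals = [clause[-1] for clause in clauses if clause]
    if not finals:
        raise ValueError("no non-empty clauses")
    return sum(word.endswith(suffix) for word in finals) / len(finals)


def next_word_counts(clauses, suffix):
    """Count which words directly follow a word ending in `suffix`."""
    counts = Counter()
    for clause in clauses:
        for word, following in zip(clause, clause[1:]):
            if word.endswith(suffix):
                counts[following] += 1
    return counts


# Toy clauses, invented for illustration; not real manuscript text.
toy = [["pan", "1H9"], ["tan", "1H9"], ["kae", "1K"]]
print(clause_final_suffix_rate(toy, "9"))
print(next_word_counts(toy, "an"))
```

Dividing each entry of `next_word_counts` by the total gives the kind of percentage reported for the case suffixes.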

This grammar is internally consistent at 80% parseability across all 5 sections (biological, herbal A, herbal B, astronomical, recipe/stars) and 29,000 words.

Case suffix distribution across manuscript sections — each case has distinct frequency profiles matching its grammatical function.

76.5% of clause-final words carry the finite verb suffix -9, confirming SOV order.

Noun root frequency varies systematically by section, consistent with a medical handbook covering different domains.

h/k noun classifiers are scribal semantic markers, not grammatical gender — 64% of nouns are unmarked.

To verify: run translate_voynich.py on the standard EVA transcription. The grammar rules are encoded in the script. The parseability percentage is reproducible.

What We Propose (Preliminary)

Building on the grammar, further analysis suggests:

Script type: positional syllabary/abugida with ~50 distinct glyphs (EVA collapses to ~30, destroying phonetic information)
Language family: closest statistical match is Dravidian (Telugu positional abugida encoding, composite distance 0.066 across 6 metrics)
Content: Siddha medical handbook — pharmacopeia, anatomy, medical astrology, and compounding procedures
23 preliminary glyph-to-syllable mappings from 9 plant name readings
Most common content word may read as "amma" (body/being), proposed as a Proto-Dravidian root

Plant name cross-references yield consistent syllable mappings across 9 plants.

Five mutual-exclusion character groups: characters within a group never appear adjacent, because they compete for the same structural slot. This is the signature of a featural script.
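
The mutual-exclusion claim is easy to test against any transcription: list every place where two members of a proposed group sit side by side. The sketch below uses a hypothetical function name and a made-up group. The actual five groups are documented in the repository's reports, not here.

```python
def adjacency_violations(words, group):
    """Return (word, pair) for each adjacent pair drawn entirely from `group`.

    A doubled glyph such as "aa" also counts as a violation.
    """
    members = set(group)
    hits = []
    for w in words:
        for a, b in zip(w, w[1:]):
            if a in members and b in members:
                hits.append((w, a + b))
    return hits


# Toy data, invented for illustration only.
print(adjacency_violations(["abc", "acb", "bd"], "ab"))
```

An empty result over the full corpus is what the mutual-exclusion claim predicts for each group.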

These proposals need independent verification, particularly by a Dravidian linguist working from original manuscript glyphs rather than EVA transcription.

Translation Coverage

The structural translation resolves 81% of words to English glosses. The remaining 19% are left as bracketed placeholders [...] and fall into three categories:

Uppercase EVA variants — visually distinct glyphs for which we lack phonetic values
Special characters — unusual glyphs outside the standard EVA alphabet
Rare roots — insufficient distributional data to constrain meaning

The 81% that is translated is backed by distributional evidence. The 19% that remains bracketed is honestly unknown.
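
The coverage figure is a ratio of resolved tokens to all tokens. The sketch below assumes the translation has already been split into one token per source word, with the literal string `[...]` standing in for each unresolved word. The actual layout of VOYNICH_TRANSLATION.txt may differ, and the function name is hypothetical.

```python
def gloss_coverage(tokens, placeholder="[...]"):
    """Fraction of tokens that are not the unresolved-word placeholder."""
    if not tokens:
        raise ValueError("no tokens")
    resolved = sum(1 for t in tokens if t != placeholder)
    return resolved / len(tokens)


# Toy token list, invented for illustration only.
print(gloss_coverage(["root", "[...]", "take", "mix", "[...]"]))
```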

Files

Core:

VOYNICH_TRANSLATION.txt — complete structural translation (4634 lines)
voynich_lexicon.txt — lexicon with syntactic frames (9217 entries)
voynich_syllabary.txt — 23 glyph-to-syllable mappings with evidence
voynich_semantic_map.txt — root meanings with confidence levels
voynich_clause_structure.txt — case system, verb forms, clause templates
translate_voynich.py — translation engine (reproducible)

Evidence:

METHODS.md — complete methodology, confidence levels, known limitations
voynich_glyph_inventory.txt — visual variant catalog from hi-res images
voynich_plant_identifications.txt — 20 plants with Dravidian names
voynich_unified_findings.txt — 80 consolidated findings
voynich_*_report.txt — individual analysis reports

Related Work

Signals & Noise

Zero-knowledge semantic topology extraction via dual grammar induction. Runs two complementary compression algorithms (Sequitur for structure, MR-RePair for frequency) on raw bytes, overlays the rulesets into a 2x2 matrix, and discovers relational structure from the residuals. The core principle — let structure emerge from the data without assumptions, then classify what the algorithms agree on vs. disagree on — is the same approach applied here to the Voynich manuscript's morphological patterns.

Character Energy Analysis

Biomechanical analysis of writing systems — modeling glyphs as physical stroke paths with two-axis motor cost, curvature, pen lifts, and transition angles. One useful idea from that work: characters in a writing system exist in a constrained energy landscape where positional patterns and transition costs encode structural information about the script. That perspective informed parts of the Voynich script analysis, particularly the mutual-exclusion character groups and positional dominance patterns.

Status

Exploratory. Not peer-reviewed. The grammar is internally consistent and reproducible. The phonetic decryption and language identification are preliminary and need independent verification.

