The Voynich manuscript

By current estimates, the book originally had 272 pages in 17 quires of 16 pages each. About 240 vellum pages remain today, and gaps in the page numbering (which seems to have been added later than the text) indicate that several pages were already missing by the time Voynich acquired it. A quill pen was used for the text and figure outlines, and colored paint was applied (somewhat crudely) to the figures, possibly at a later date. There is strong evidence that the pages of the book were at some point rearranged into a different order.

The text was clearly written from left to right, with a slightly ragged right margin. Longer sections are broken into paragraphs, sometimes with “bullets” in the left margin. There is no obvious punctuation. The ductus (the speed, care, and cursiveness with which the letters are written) flows smoothly, suggesting that the scribe understood what was being written; the manuscript does not give the impression that each character had to be calculated before being inked onto the page.


The text consists of over 170,000 discrete glyphs, usually separated from each other by narrow gaps. Most of the glyphs are written with one or two simple pen strokes. While there is some dispute as to whether certain glyphs are distinct or not, an alphabet with 20–30 glyphs would account for virtually all of the text; the exceptions are a few dozen rarer characters that occur only once or twice each.

Wider gaps divide the text into about 35,000 “words” of varying length. These seem to follow phonetic or orthographic laws of some sort; e.g. certain characters must appear in each word (like the vowels in English), some characters never follow others, some may be doubled but others may not.
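Regularities of this kind can be checked mechanically on a transcription. The following is a minimal sketch, assuming the text is available as a list of words in which each glyph is encoded as a single character; the function names and the toy input are hypothetical and are not taken from any published transcription. It counts adjacent glyph pairs, then reports which glyphs are ever doubled and which pairs never occur.

```python
from collections import Counter

def glyph_bigrams(words):
    """Count adjacent glyph pairs inside words (pairs never span a word gap)."""
    pairs = Counter()
    for word in words:
        pairs.update(zip(word, word[1:]))
    return pairs

def doubled_glyphs(words):
    """Glyphs that appear at least once immediately after themselves."""
    return {a for (a, b) in glyph_bigrams(words) if a == b}

def unattested_pairs(words):
    """Ordered glyph pairs over the observed alphabet that never occur."""
    alphabet = sorted({glyph for word in words for glyph in word})
    seen = glyph_bigrams(words)
    return [(a, b) for a in alphabet for b in alphabet if (a, b) not in seen]

# Toy input only; these are invented tokens, not manuscript words.
toy_words = ["aab", "ba"]
print(doubled_glyphs(toy_words))
print(unattested_pairs(toy_words))
```

An unattested pair in a large sample is only suggestive of a rule; in a small sample it may simply be chance.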


Statistical analysis of the text reveals patterns similar to those of natural languages. For instance, the word frequencies follow Zipf’s law, and the word entropy (about 10 bits per word) is similar to that of English or Latin texts. Some words occur only in certain sections, or in only a few pages; others occur throughout the manuscript. There are very few repetitions among the thousand or so “labels” attached to the illustrations. In the herbal section, the first word on each page occurs only on that page, and may be the name of the plant.
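Both measures mentioned above are simple to compute. The sketch below assumes the text is given as a list of word tokens; the function names and the toy data are illustrative and not taken from any particular study. Word entropy is the Shannon entropy of the word-frequency distribution, and Zipf’s law predicts that a plot of log frequency against log rank is roughly a straight line with slope near −1.

```python
import math
from collections import Counter

def word_entropy(words):
    """Shannon entropy of the word-frequency distribution, in bits per word."""
    counts = Counter(words)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def zipf_slope(words):
    """Least-squares slope of log(frequency) against log(rank).

    Requires at least two distinct words; otherwise the slope is undefined.
    """
    freqs = sorted(Counter(words).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(freq) for freq in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov / var_x

# Toy corpus with frequencies 12, 6, 4, 3 (exactly proportional to 1/rank).
toy_words = ["w1"] * 12 + ["w2"] * 6 + ["w3"] * 4 + ["w4"] * 3
print(word_entropy(toy_words))
print(zipf_slope(toy_words))
```

On real text the fitted slope depends on how words are delimited and on which ranks are included in the fit, so the results are best compared only between texts processed the same way.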

On the other hand, the Voynich manuscript’s “language” is quite unlike European languages in several respects. First, there are practically no words comprising more than ten glyphs, yet there are also few words of only one or two glyphs. The distribution of glyphs within a word is also rather peculiar: some occur only at the beginning of a word, some only at the end, and some always in the middle section. This arrangement is found in Semitic alphabets but not in the Latin, Greek, or Cyrillic alphabets (with the exception of the Greek letters beta and sigma).
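The positional preferences described here can be tabulated directly. The following is a minimal sketch, again assuming a transcription in which each glyph is a single character; the function name and sample tokens are hypothetical. For every glyph it counts how often the glyph appears in first, middle, or last position within a word.

```python
from collections import defaultdict

def positional_counts(words):
    """Count, per glyph, occurrences in initial, medial, and final position.

    A glyph forming a one-glyph word is counted as initial only.
    """
    counts = defaultdict(lambda: {"initial": 0, "medial": 0, "final": 0})
    for word in words:
        last = len(word) - 1
        for index, glyph in enumerate(word):
            if index == 0:
                counts[glyph]["initial"] += 1
            elif index == last:
                counts[glyph]["final"] += 1
            else:
                counts[glyph]["medial"] += 1
    return dict(counts)

# Invented tokens for illustration; not manuscript words.
for glyph, row in sorted(positional_counts(["abc", "abd", "acd"]).items()):
    print(glyph, row)
```

A glyph whose counts fall almost entirely in one column is a candidate for a position-bound character of the kind described above.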

The text seems to be more repetitive than typical European languages; there are instances where the same common word appears up to three times in a row. Words that differ only by one letter also repeat with unusual frequency.
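Repetition of this sort can be quantified with two simple counts. The sketch below assumes a list of word tokens in reading order and takes “differ only by one letter” to mean a single substitution between words of equal length; that reading, like the function names, is an assumption made for illustration, and insertions or deletions are not counted.

```python
def longest_run(words):
    """Length of the longest run of identical consecutive words."""
    best = 0
    current = 0
    previous = None
    for word in words:
        current = current + 1 if word == previous else 1
        best = max(best, current)
        previous = word
    return best

def differ_by_one(a, b):
    """True if a and b have equal length and differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1

def adjacent_near_repeats(words):
    """Number of consecutive word pairs that differ by one substitution."""
    return sum(differ_by_one(a, b) for a, b in zip(words, words[1:]))

# Invented tokens for illustration; not manuscript words.
toy_words = ["abc", "abd", "abd", "abd", "xbd"]
print(longest_run(toy_words))
print(adjacent_near_repeats(toy_words))
```

Comparing these counts against the same counts for a text in a known language, tokenized the same way, gives a rough sense of whether the repetition is unusual.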

Only a few words in the manuscript are written in what appears to be Latin script. On the last page, there are four lines of writing in (rather distorted) Latin letters, except for two words in the main script. The lettering resembles European alphabets of the 15th century, but the words do not seem to make sense in any language. Also, a series of diagrams in the “astronomical” section has the names of ten of the months (from March to December) written in Latin script, with spelling suggestive of the medieval languages of France or the Iberian Peninsula. However, it is not known whether these bits of Latin script were part of the original text or were added at a later time.

