Data Viz Wiz Ben Fry Reconstructs “Frankenstein” Using 55,000 Font Fragments

Fry’s Frankenfont project painstakingly recreates the text of Mary Shelley’s classic book with parts of incomplete fonts found on the Internet.

Finally there’s a font worthy of the deformed monster in Mary Shelley’s famous book of horror, with data-viz guru Ben Fry playing the role of Dr. Frankenstein. At Fathom, the design studio that Fry leads, he has created an edition of Frankenstein laid out using incomplete characters and glyphs from PDF documents obtained through Internet searches. Typographers rarely allow a full font to be embedded in a PDF, because anyone could then yank the font out of the file without paying for it. These “incomplete fonts” were reassembled into the text of Frankenstein based on how often they were used. The result? Frankenfont.


Here’s how they did it: For each of the 5,483 unique words in the book, Fry and his team ran a search using the Yahoo! Search API that was filtered to just PDF files. Then, they downloaded the top 10 to 15 hits for each word, which left them with 64,076 PDF files. Inside these PDFs were 347,565 fonts and from these, 55,382 unique glyph shapes were used to fill the 342,889 individual letters found in the Frankenstein text. “I’ve always found these misshapen fonts really fascinating,” Fry tells Co.Design.

In the book, the misshapen characters in the words mimic the devolution of the protagonist and his monster in the story. Characters from the most common fonts, like Arial, Helvetica, and Times New Roman, are used at the beginning of the book, but by pages 80 and 81, things have progressed (or regressed, depending on your taste in fonts) to a lot of Arial Bold and Times Italic. By page 200, commonly used script fonts and more obscure faces appear. As you get to the end of the book, the fonts have devolved significantly into non-Roman fonts, highly specialized typefaces, and even pictogram fonts.

Fry wrote some code to tally how often each font’s characters appear in the PDFs. First, he lined up all 342,000 letters in the book into one long list. “If the lowercase ‘e’ from Arial is three percent of all ‘e’ letters found in fonts in the PDF files, then the first three percent of all the lowercase ‘e’ letters in the list of 342,000 will be set to that same Arial ‘e,’” he explains. “It continues filling out each character like this based on usage, eventually getting down to the really odd things (that have really low percentages for how often they show up) toward the end.”
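The fill rule Fry describes can be sketched in a few lines of Python. This is not Fry’s code, only a rough illustration of the idea under some assumptions: the function name assign_glyphs, the usage table, and the font counts in the usage note are all made up for the example, and font names stand in for the individual glyph shapes the real project works with. For each character, the fonts are ranked from most to least used, and each font claims a share of that character’s occurrences, in reading order, proportional to its usage.

```python
from collections import defaultdict


def assign_glyphs(text, usage):
    """Pick a font for every character in `text` (hypothetical helper).

    `usage` maps a character to {font_name: count}, i.e. how often that
    character's glyph was seen in each font. Characters with no usage
    data are left as None.
    """
    # Record where each character occurs, in reading order.
    positions = defaultdict(list)
    for index, ch in enumerate(text):
        positions[ch].append(index)

    assigned = [None] * len(text)
    for ch, indexes in positions.items():
        counts = usage.get(ch)
        if not counts:
            continue
        # Most-used fonts first; ties are broken by name so the result
        # is deterministic.
        ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
        total = sum(count for _, count in ranked)
        n = len(indexes)
        start = 0
        seen = 0
        for font, count in ranked:
            seen += count
            # Cumulative boundary: this font covers occurrences up to
            # its running share of the total usage.
            end = (n * seen) // total
            for index in indexes[start:end]:
                assigned[index] = font
            start = end
    return assigned
```

With invented counts such as assign_glyphs("eeeeeeeeee", {"e": {"Times": 6, "Arial": 3, "Wingdings": 1}}), the first six letters are set in Times, the next three in Arial, and only the last one falls to Wingdings, which is the same drift from common to obscure that the book shows from front to back.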

Profits from the book will be given to Donors Choose to buy books for students. More information on the project here.