Can you imagine typing up an email, sending it off, and being told by the recipient that its contents were gobbledygook? Of course not. For most of us, whatever we type is instantly legible to our peers.
But that’s not true for everyone on earth. If you speak a language shared with only 20,000 other people, for instance, or if you’re a scholar trying to write out an ancient text, there may be no typeface for you to use at all. And even if there is, what happens as soon as you try to share that document with someone else who might not have pre-installed that font? All they can see is a bunch of rectangles.
Tofu, it’s called in the font industry.
For the last five years, Google and the font giant Monotype have been working together to fill in all those tofu squares—hundreds of thousands of them. With the help of hundreds of designers, researchers, linguists, and cultural consultants, the two companies created Google Noto (the name derives from "NoTofu"). It’s a typeface family of the world’s languages—many in the minority, or even extinct—which can be rendered in serif or sans serif across eight weights, totaling more than 300,000 unique glyphs in all. As a point of comparison, Helvetica supports 97 different languages. Google Noto supports more than 800. According to Google, Noto is nearly 10 times larger than the nearest universal typeface, Arial Unicode.
"For everyone in the typographical field, this is a scale of language and script support that’s never been attempted before," says Kamal Mansour, linguistic typographer at Monotype. "And it would be hard to imagine anyone sponsoring this other than Google. It’s a very big investment."
To understand the scope of Noto, you have to first get a firm grasp of how fonts work. Each letter of every language around the world (even our beloved emojis) has a unique number. These numbers are managed by the Unicode Consortium. Of course, many of these languages share common letters; English and Spanish use almost all of the same letters, after all. In such cases, an A in English actually shares the same number as an A in Spanish.
It’s just how it works.
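For the curious, here is a minimal sketch of that numbering in Python, using only the standard library. The helper name `code_point` and the sample words are illustrative choices, not anything taken from Noto or Monotype.

```python
import unicodedata


def code_point(ch: str) -> str:
    """Format a single character's Unicode number in the usual U+XXXX style."""
    return f"U+{ord(ch):04X}"


# The first letter of an English word and of a Spanish word (sample words
# chosen for illustration) is the same character with the same number.
english_a = "apple"[0]
spanish_a = "agua"[0]
print(code_point(english_a), code_point(spanish_a))

# Letters that only some languages use, and emoji, get numbers of their own.
for ch in ["A", "\u00f1", "\U0001F600"]:
    print(ch, code_point(ch), unicodedata.name(ch))
```

The number identifies the character, not the language it happens to appear in, which is why one Latin font can serve many languages at once.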
If you write in a language not recognized by Unicode, then you actually apply to the organization, making a case for your distinct written language and providing all of the glyphs. Then, you might face a wave of verification and fact checking, and finally, the Consortium can admit you as a new "script." Any single script could encapsulate lots of different languages that share characters, by the way. The Latin script, for instance, includes 100 or so different languages.
But here’s the catch: Just because your language's script is listed in Unicode doesn’t mean you can actually type in that language. Remember, Unicode is just a directory of numbers. You need letters! And to type in these letters, someone has to design a typeface that corresponds to those codes.
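A toy model makes the gap between numbers and letters concrete. In the sketch below, a "font" is assumed to be nothing more than a set of the Unicode numbers it can draw, and anything outside that set comes out as a tofu square. The names `render` and `latin_only` are hypothetical, and real font files and text renderers are far more involved than this.

```python
TOFU = "\u25a1"  # WHITE SQUARE, standing in for the missing-glyph box


def render(text: str, covered: set[int]) -> str:
    """Return text as a toy font would show it: uncovered characters become tofu."""
    return "".join(ch if ord(ch) in covered else TOFU for ch in text)


# Hypothetical font that only covers printable ASCII (basic Latin).
latin_only = set(range(0x20, 0x7F))

# Latin text passes through untouched.
print(render("Noto", latin_only))

# Three Tibetan characters (U+0F56, U+0F7C, U+0F51) have valid Unicode
# numbers, but this toy font has no shapes for them.
print(render("Noto \u0f56\u0f7c\u0f51", latin_only))
```

The second line prints with three squares where the Tibetan should be: the numbers exist, but nothing has been drawn for them.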
"If you only have codes, and no fonts to represent those codes, you’re crippled," says Mansour. "You can’t really make it readable to other people. So basically, at the beginning of this project, Google wanted to make fonts available to every script that’s in Unicode. That’s a huge undertaking."
Mansour estimates that most of the world has been pretty well represented by Unicode since its founding 20 years ago, but it has continuously received new applications for the last 15, no doubt fueled by people in formerly disconnected parts of the world as they get online, often via smartphones. Monotype can't cite numbers of unfulfilled/unrepresented languages within Unicode. But over the past five years, as more and more applications poured in, the number steadily grew. For instance, Unicode support for Adlam, Urdu Nastaliq, Tibetan, and Armenian has been added only recently.
Google wanted to fill the un-rendered holes in Unicode's libraries, and enlisted Monotype to take on the task. Monotype responded with a highly coordinated, brute force attack on the project. It consulted with academics and cultural specialists. It designed some work in-house and some out of house. Getting each project done was a hustle unto itself. When working on a script for Tibet, it had its work fact-checked by a Buddhist monastery filled with monks who’d read a lot of ancient Tibetan texts. While developing a script for Burmese, Monotype tracked down a designer in England who’d specialized in the typography and languages of Southeast Asia while at school, enlisting him to develop it.
"We use people’s experience," says Mansour. "Sometimes we’re very fortunate to find people who’ve immersed themselves in that culture." Today, Noto covers 100% of the languages covered in the Unicode 6.1 standard (from early 2012), while they're working on covering 100% of the languages in the latest version, Unicode 9.
Mansour describes Noto’s aesthetic as "uncomplicated, respectful, harmonious, and balanced," but he very carefully doesn’t call it a font or typeface. Although Monotype attempted to translate notable elements of Noto across every script and language (the hooks of serifs, the distinctive geometric curves of various letters), it’s being billed as a "family of typefaces," or a series of typefaces that might not look like twins, but at least like cousins.
"The idea is that any two fonts in the family should be able to cohabitate on the page, peacefully coexist," says Mansour.
Their solution was to attempt to make every single typeface match in color. In the world of typography, "matching color" doesn’t refer to changing the type from black to blue. It means that when you squint, or look at the words on a page far away, the letters should all be of roughly the same gray-scale value.
How well this worked is, frankly, almost impossible to assess. (If you have the time to cross-reference 800 different languages, let us know.) But Monotype wasn’t prioritizing Noto purely for its matchy-matchy nature across languages. Rather, it’s built to work as a functional typeface for everyone. Within each typeface, Noto offers eight different weights, ranging from light to ultrablack, offering a respectable level of functional expressiveness. Noto, in most any language, is a real working font for the person typing in it, not just a baseline collection of glyphs.
When Mansour talks about the project, his voice sounds filled with an exhausted pride, after at last going public with a project half a decade in the making and with no immediate end in sight. The open-source typeface family is now open to contributors to build upon. Meanwhile, Google's mobile OS Android uses Noto for scripts not covered by Roboto's library, and Chrome OS ships with Noto standard.
Still, it raises a question: Why would Google invest such untold sums into a project like Noto? For one thing, it brings more people online, and gives them the opportunity to use Google services. But converting the world’s written languages to fonts serves another purpose. Allowing someone to type in their language, or convert an ancient manuscript from a JPEG scan to an actual document, doesn’t just put more information on the internet. It allows this information to be indexed, cited, and pulled forth in a search.
"They become Googleable!" Mansour laughs.