"Accessibility is a basic human right," Eve Andersson tells me, sitting on a lawn at the Shoreline Amphitheater during this year's Google I/O developer conference. "It benefits everyone."
Soft spoken and ginger-haired, Andersson is the head of Google's accessibility efforts—the gaggle of services that Google bakes into its products to allow them to be just as usable by people with disabilities as they are by the general public. Under Andersson's leadership, Google has made Android completely usable by voice, teamed up with outside vendors to give Android eye-tracking capabilities, and launched the Google Impact Challenge, a $20 million fund to generate ideas on how to make the world more accessible to the billion-odd individuals in the world living with disabilities.
According to Andersson, accessibility is part of Google's core mission to catalog the world's information and make it available to everyone. As you speak to her, a point she hammers home over and over again is that inclusive design means more than just hacking an app or product so that people with disabilities can use it. It's something that benefits literally everyone.
I ask Andersson for an example. We're sitting on the grass in a sunny spot, so she pulls out her phone and shows me an app. Even though the sun is shining directly overhead, I can still read the screen. "This is an app that follows Android's accessibility guidelines for contrast," she says. And while those guidelines were established to help those with limited vision see what's on their screen more easily, they have a trickle-down benefit for everyone who wants to use a smartphone in the sun.
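For the curious, contrast guidelines of this kind usually come down to a single number: the ratio between the brightness of the text and the brightness of its background. The sketch below uses the WCAG 2.x formula for that ratio. The 4.5:1 threshold is the WCAG AA figure for body text, and it is an assumption on my part that this is the bar the app Andersson showed me was built to clear; the function names are illustrative, not any Android API.

```python
# Minimal sketch of the WCAG 2.x contrast-ratio calculation.
# Function names and the 4.5 default are illustrative assumptions,
# not part of any Android or Google API.

def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to its linear-light value."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """Relative luminance of an (R, G, B) color, from 0.0 (black) to 1.0 (white)."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0 (black on white)."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_contrast(fg, bg, minimum: float = 4.5) -> bool:
    """True if the pair reaches the given minimum ratio (assumed: 4.5 for body text)."""
    return contrast_ratio(fg, bg) >= minimum


if __name__ == "__main__":
    white = (255, 255, 255)
    for name, color in [("black", (0, 0, 0)), ("mid gray", (119, 119, 119))]:
        ratio = contrast_ratio(color, white)
        print(f"{name} on white: {ratio:.2f}:1, passes: {meets_contrast(color, white)}")
```

The same arithmetic explains the sunny-lawn effect: glare washes out the darker of the two colors, so a design that starts with a high ratio has more headroom before the text becomes unreadable.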
In a way, Andersson argues, the accessibility problems of today are the mainstream breakthroughs of tomorrow. Autocomplete and voice control, for example, are two technologies we now take for granted that started as features aimed at helping disabled users use computers. So what are the accessibility problems Google has its eyes on now, and what mainstream breakthroughs could they lead to tomorrow?
Like Microsoft, which recently announced a computer vision-based accessibility project called Seeing AI, Google's interested in how to convey visual information to blind users through computer vision and natural language processing. And like Microsoft, Google is dealing with the same problem: How do you communicate that information without just reading out loud an endless stream-of-consciousness list of everything a computer sees around itself, regardless of how trivial each item may be?
Thanks to Knowledge Graph and machine learning—the same principles that Google uses to let you search photos by content (like photos of dogs, or photos of people hugging)—Andersson tells me that Google is already good enough at identifying objects to decode them from a video stream in real time. So a blind user wearing a Google Glass-like wearable, or a body cam hooked up to a smartphone, could get real-world updates on what can be seen around them.
But again, the big accessibility problem that needs to be solved here is one of priority. How does a computer decide what's important? We're constantly seeing all sorts of objects we're not consciously aware of until some switch flips in our heads that tells us they're important. We might not notice the traffic driving by until one of those cars jumps the curb and starts speeding toward us, or we might not notice any of the faces in a crowd except that of a close friend. In other words, our brains have a subconscious filter, prioritizing what we notice from the infinitely larger pool of what we see.
Right now, "no one has solved the importance problem," Andersson says. To solve it means figuring out a way to teach computers to not only recognize objects but to prioritize them according to rules about safety, motion, and even user preferences. "Not all blind people are the same," Andersson points out. "Some people want to know what everyone around them is wearing, while others might want to know what all the signs around them say."
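To make the shape of the problem concrete, here is a deliberately naive sketch of what prioritizing by safety, motion, and user preference could look like. Everything in it is hypothetical: the `Detection` fields, the weights, and the preference table are invented for illustration, and nothing here describes how Google approaches the problem. The hard part Andersson is pointing to is precisely that nobody knows how to produce good hazard scores or weights in the first place.

```python
# Hypothetical sketch: rank detected objects by a weighted importance score.
# All names, fields, and weights are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class Detection:
    label: str     # what the vision system thinks it sees
    hazard: float  # assumed 0..1 estimate of danger to the user
    motion: float  # assumed 0..1 estimate of movement toward the user


def rank_detections(detections, preferences, top_k=3,
                    w_hazard=0.6, w_motion=0.3, w_pref=0.1):
    """Return the top_k detections, most important first.

    preferences maps a label to a 0..1 interest level chosen by the user;
    labels that are missing count as 0.
    """
    def score(d):
        return (w_hazard * d.hazard
                + w_motion * d.motion
                + w_pref * preferences.get(d.label, 0.0))

    return sorted(detections, key=score, reverse=True)[:top_k]


if __name__ == "__main__":
    scene = [
        Detection("jacket", hazard=0.0, motion=0.1),
        Detection("car", hazard=0.9, motion=0.8),
        Detection("sign", hazard=0.0, motion=0.0),
    ]
    # One user cares about signs, another about what people are wearing.
    for prefs in ({"sign": 1.0}, {"jacket": 1.0}):
        print([d.label for d in rank_detections(scene, prefs, top_k=2)])
```

In this toy version the oncoming car always comes first, and the second item announced depends on whether the user asked about signs or clothing, which is the kind of per-person variation Andersson describes.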
As Google develops ever-more-powerful AI, a time may come when computer vision replaces human sight. That day isn't here yet, but Andersson points out that any work done on computer vision for accessibility projects will have a clear impact upon the field of robotics, and vice versa. The robots of the future might be able to "see" because of the accessibility work done in computer vision for blind people today.
Much has been made recently of Google's advances in natural language processing, or Google's ability to understand and transcribe human speech. Google's accessibility efforts lean heavily upon natural language processing, particularly its latest innovation, Voice Access. But Andersson says computers need to understand more than just speech. Forget natural language processing: computers need non-language processing.
"Let's say you are hearing-impaired, and a siren is coming your way," says Andersson. "Wouldn't you want to know about it?" Or let's say you're at a house party, and an entire room of people is laughing. A person with normal hearing would probably walk into that room because it's clear there's something fun happening there; a hearing-impaired person would be left in the dark. Our ability to process and understand sounds that aren't speech informs a thousand little decisions throughout our day, from whether or not we walk out the door in the morning with an umbrella, to whether or not we get a rocket pop when the ice-cream truck drives by on a hot summer day. If you were hearing-impaired, wouldn't you want access to that stream of information?
There's no technical reason machine learning can't be turned to the task of understanding more than just speech, says Andersson. But machine-learning neural networks, like the ones driving Google's computer vision efforts, depend on vast data sets for training. For example, for Google Photos to learn what it was looking at in a particular photograph, it had to train on a massive database of millions of images, each of which had been individually tagged and captioned by researchers. A similar data set was used for Google's natural language processing efforts, but Andersson says no comparable data set currently exists to teach neural networks how to decode non-speech audio.
Andersson wouldn't say if Google was currently trying to build such a database. But the usefulness of non-language processing goes way beyond accessibility. For example, YouTube can already auto-caption the speech in a video; mastering non-language processing could help it caption the sound effects, too.
Sighted users are so used to taking directions from computers that many people (like me) can barely find their way around without first plugging an address into Waze. But moving sighted individuals from point A to point B, across well-plotted roads and highways, is navigation on a macro scale. Things get much more complicated when you're trying to direct a blind person down a busy city street, or from one store to another inside a shopping mall. Now, you're directing people on a micro scale, and in an environment that is not as well understood or documented as roads are.
Google's already working on technology to help address this problem, says Andersson. For example, Google's Advanced Technology and Projects Group (or ATAP) has Project Tango, a platform which uses computer vision to understand your position within a three-dimensional space. Before Project Tango, Google could detect a user's position in the world using all sorts of technologies—GPS, Wi-Fi triangulation, and beacons, among them—but none of them were precise enough for reliable accessibility use. For example, your smartphone's GPS might place you 30 feet or more away from where you actually are. Not a big deal if you're driving to the movies in your car, but a huge problem if you're a blind person trying to find a bathroom at the airport.
But while Project Tango is an important step in the right direction for using computer vision as a navigation tool, more needs to be done, says Andersson. Indoors, Google needs to collect a lot more data about internal spaces. And everywhere, a lot of the problems that still need answering are UX problems. For example, let's say you need to direct a blind person to walk 50 feet in a straight line: How do you communicate to him or her what a foot is? Or think of this another way. You know how your GPS will tell you when you're driving to "turn left in 1,000 feet"? Even when you can see, it's hard to estimate how far you've gone. Now imagine doing it with your eyes closed.
But these problems are all solvable. And when they are solved, the solutions, like most accessibility advancements, will have obvious benefits even for those without disabilities. After all, how often have you needed precise directions to find your car in a packed airport parking lot? "I'm passionate about accessibility, not just because I believe in a level playing field," says Andersson. "But because [inclusive design] makes life more livable for everyone."