Gesture, posture, expression, clothes: These are tiny social considerations that give humans the incredible power of inference, but still elude computers. Computers can’t discern if you are looking at the screen, let alone whether you seem interested in whatever's on it. They can’t even tell if that’s really you or a motorcycle gang at the keyboard, dressed in sweats or leather.
Microsoft Research—the more experimental arm of Microsoft—wants to change that. In a lab, Eric Horvitz and Dan Bohus are giving artificial intelligence the ability to discern social cues and, in turn, cater to our more nuanced needs. Microsoft has just released a video that gives us a deep dive into this research.
One project the researchers shared is an elevator bank with no buttons. You walk up, and it opens. Sounds simple enough, right? You could do that with the motion sensors people have on garage doors. Except not really, and herein lies the nuance of what Microsoft Research is doing.
The system has to discern whether someone is just walking by the elevator bank or is walking up to it with the intent of riding the elevator. So Microsoft programmed a camera system to read body language and anticipate what to do. The system can predict that behavior about three seconds before the doors need to open.
In another demo, researchers programmed a robot to hang out at Microsoft Research and give people directions. The robot can distinguish individuals in a group by sight, and can read from your facial expression not only whether you’re looking at it, but whether you’re actively paying attention.
Something as simple as seeing where humans are looking can be powerful. It allows the robot to make eye contact and point the way. Reading your expression holds even more possibilities. Imagine artificial intelligence that could see if you were bored, embarrassed, or offended. It could allow software to tailor experiences, and conversations, to you in real time. Imagine reading a Wikipedia article on lemurs. If you were totally fascinated, maybe software like Microsoft's could recognize that and pull more information automatically.
The robot can also look a bit sad when you walk away.
In the researchers' final public demo, a CGI virtual assistant keeps track of Horvitz’s calendar. It can tell you if he’s in the office or busy. And it can predict things like when he’ll be done with a meeting based upon how long his past meetings have taken.
Granted, the demo looks a bit like Max Headroom. But don’t underestimate the crude animation. Because she’s reading you like a book.
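Microsoft hasn't said how the assistant arrives at its guess about when a meeting will end. As a rough illustration only, here is a minimal Python sketch that assumes the simplest possible approach: take the typical (median) length of past meetings and add it to the start time. The function name, the sample durations, and the date are all hypothetical, and the real system is presumably far more sophisticated.

```python
from datetime import datetime, timedelta
from statistics import median


def predict_meeting_end(start, past_durations_min):
    """Estimate a meeting's end time from past meeting lengths (in minutes).

    Hypothetical helper for illustration; not Microsoft's actual method.
    """
    if not past_durations_min:
        raise ValueError("need at least one past meeting duration")
    # The median is less sensitive than the mean to one unusually long meeting.
    typical_minutes = median(past_durations_min)
    return start + timedelta(minutes=typical_minutes)


# Made-up history: five earlier meetings, lengths in minutes.
past_durations = [30, 45, 50, 60, 35]
meeting_start = datetime(2024, 1, 15, 14, 0)

print(predict_meeting_end(meeting_start, past_durations))
```

With this invented history the median is 45 minutes, so a meeting that starts at 2:00 p.m. is predicted to wrap up at 2:45 p.m.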
Microsoft says that this technology could make its way into Cortana (the company's answer to Apple's Siri), which lives inside the latest generation of Windows Phones. And as strange as it is to think that a phone might look at you like the Terminator, this would be a good thing. Because if tech companies are so interested in having us speak to software, then software is going to have to become more interested in us.