With all the talk of what AI will do to change the world, you might not notice the subtle ways it’s already woven into our lives. Case in point: PowerPoint Designer, a feature that lives in the current version of the software. Each time you create a new slide, Designer is invisibly scanning your content, trying to figure out better ways to make that content shine based on millions of other PowerPoint presentations. Click on the Designer tab, and instead of your haphazardly pasted picture and bullet points, you might see three different options, with better typeface choices and a frame around the image that matches its tone. What might have seemed futuristic a few years ago is now dead-simple to use. But that simplicity belies something strange and fascinating: When Microsoft was first testing Designer, it actually felt utterly and ineffably off.

“In the way it was first tested, the tone of the words and animations in Designer made it feel like the computer knew better than you,” explains Jon Friedman, who as partner director of design at Microsoft leads the vision for the company’s Office suite. There was something stranger still: If you kept following Designer’s recommendations, the end result was a presentation that didn’t feel like you’d made it anymore. The computer, it seems, simply wasn’t taking into account what the rest of your presentation looked like. Instead, it was taking over, step-by-step. Eventually, Microsoft fixed that problem, unveiling a subtly more helpful, more neutral feature that made more context-sensitive recommendations. But behind those changes lay one of the company’s governing principles for AI design: that humans be the hero of any story.

Those principles seem obvious enough, when you lay them all out: “Humans are the heroes,” “Balance EQ and IQ,” “Honor societal values,” “Respect the context,” and “Evolve over time.” But behind them lies an unusual origin story–one that tells you a lot about where design is going. The principles didn’t emerge fully formed. Rather, they were the end result of a process started over five years ago, in which Microsoft spent untold millions trying to make a better AI assistant by watching how actual human assistants gain the trust of their clients.

Watching Trust Form In The Wild

Five years ago, Microsoft was busy trying to come up with a competitor to Apple’s Siri when the design teams noticed something strange about the prototypes they were testing with users. They had two basic flavors of assistant: one in which the user helped train the assistant, and one that simply guessed what a person needed and spat it out. It turns out users were far more forgiving of the former–and not particularly interested in the latter even if it performed equally well. There was something about training up the assistant that made people more forgiving of its mistakes.

“To study that dynamic, we started interviewing real assistants, asking them to reflect on their relationships and how their tasks evolved over time,” explains Ronette Lawrence, Microsoft’s senior director for product planning and research. That research then evolved into studying assistants new to the job, watching them as they formed relationships with their new clients, and asking them to keep journals of their feelings. But getting at the truth of those relationships was a tricky business. Typically, humans are poor at articulating their emotions in the moment, and so the research methods revolved around asking people to think of music or art that captured their moods and sentiments. “It’s well-formed science that if you get someone to think about music, and the emotions that music brings up, people will get closer to the underlying emotions they feel,” explains Lawrence.

All that research came at a turning point for machine learning in the wild. This was the era when Google Now was just beginning to release features such as predictive forecasts for your commute or how long you might be waiting at a bus stop. Lawrence’s team noticed that users found those features magical–and yet, if the features were wrong just once or twice, users would opt to turn them off altogether. Seeing such responses to even small error rates, Lawrence’s team realized that this idea of trust was thorny, and important: Unless the human felt some kind of connection to the machine, they’d never give it a chance to work well after it made even one mistake.

That insight dovetailed with the assistant research they were doing. Time and again, it turned out that assistants succeeded not by being smarter than their clients, but rather by deferring to them. This applied directly to how a machine assistant might behave. “The more tasks you take away from people, the more you have to watch for emotions like, ‘Did it make me feel more powerful and smarter, or do I just feel like the system is smart?'” explains Lawrence. “Having people use a system that feels more powerful than them sets up this dynamic where it’s not your partner anymore. It brings up concerns about whether the machine is working on your behalf.”