Co.Design

The iPhone 4S's Siri Is The Ultimate Interface: None At All

The "humble personal assistant" may represent the beginning of the end of Apple's UI.

Apple’s hotly anticipated iPhone event didn’t offer the bold design refresh that many (ourselves included) expected. The iPhone 4S is an incremental upgrade in terms of hardware, full of fat new tech specs--an A5 processor, a souped-up camera, iOS 5--jammed into a package that looks exactly the same as the last one. But iOS chief Scott Forstall still saved the best for last: Siri, a voice-controlled "humble personal assistant" app that points toward the omega point of Apple’s quest to perfect human-computer interface design. (Apple acquired the technology in April 2010.) Put simply, the interface will be gone. There will be no interface: You will talk to your iThing, it will understand what you mean, and it will do it. It’s the logical endpoint for a company whose design philosophy is "it just works." Siri itself may not be powerful enough to embody this vision completely and flawlessly right now, but it’s the first step--the beginning of the end.

Obviously, voice commands have been a smartphone staple for a while now. I’m an Android user, and it never ceases to amaze me how Google’s voice recognition servers can parse the name of my local French restaurant, despite the fact that I mangle the pronunciation and there’s a wall of street noise behind me. But barking search terms--or even spoken commands--is different from talking. If Siri works as advertised, it represents a quantum leap in interface design because actual, y’know, interfaces--those sometimes sexy, often incomplete, always indirect middlemen between what we want and what we get--will inevitably seem clumsy and thick compared to the effortless, transparent, everyday miracle of speaking in natural language.

Every decent futurist vision of man/machine interfaces--from the nameless ship’s computer in Star Trek to the murderous HAL in 2001 to the gentle Gertie in Moon--has converged on direct, natural language input as the logical endpoint for mainstream, non-specialty computing. (There will probably always be a need for sophisticated uber-interfaces to accomplish specialized tasks--think Tom Cruise in Minority Report--but for the near-future equivalent of what most people think of as "computing," a natural language UI can do most of the heavy lifting.) Computers, especially mobile computers, are becoming our outboard brains--and we already "designed" the ultimate interface between brains tens of thousands of years ago, when Homo sapiens started talking to each other around the campfire. Short of direct neural implants becoming common (don’t hold your breath), it only makes sense that eventually our information-processing appliances would start conforming to our most powerful way of communicating. Siri is based on technology originally funded by DARPA, the military’s bleeding-edge mad science arm, and you know they don’t mess around.

So, does it really work? Or will Siri be this year’s version of AntennaGate? Natural language interfaces, by dint of the ultra-high bar they set in terms of their accuracy, are essentially pass/fail experiences for the average user: either they perfectly, invisibly work, or they’re a total bomb. (What good is a "natural" language UI if you can never tell whether it’s going to "get it right" or not? Stabbing with your thumbs is faster.) With Siri, Apple has put its user-experience neck out in a way that makes the radical redesigns of OS X Lion feel like small potatoes. It’s saying, "We know this is the only place for user-friendly computing to go. We might as well get there first." Whether it succeeds or not is anyone’s guess, but the ambition and vision are classic Apple. Once again, the company appears to have done what it always does--take something that already exists and refine it into what it feels like it should be. And it looks like other smartphone makers had better start playing catch-up--again.

[Kevin Purdy will have a detailed review of Siri and the iPhone 4S in the morning on FastCompany.com, so stay tuned.]

28 Comments

  • eshaper

    Great article. Still, I don't agree that Siri is the beginning of the end. 
    If we think of accessibility as a whole, this is an awesome feature for blind people but it's useless for people who can't speak.

  • tweeterwoofer

    Great piece. Well written and informative. I think that Siri is really more about your "personal" relationship with your device. It sounds ridiculous to refer to it that way, but on some level there is already a certain level of intimacy present with our mobile devices. I can't tell you how many people I've heard say they feel lost without their smart phones. We've come to depend on them as our communicators and our connection to the outside world and they have become an integral part of how we get things done. I'm sure that there will be people who use Siri all the time and in full public view, just as there are currently people who conduct their phone conversations that way, but I imagine there will be many more who use it in a more private way. In the same way that there is a certain level of phone etiquette that sends one outside of a restaurant to a more private location to take a phone call, I think people will adopt these sorts of behaviors with Siri as well.

    Most people spend at least some time every day in a car, often alone. Siri's usefulness in these situations is really where its strength is and will probably be where it is used most often. So many people already try to send and read emails, texts, and conduct Internet searches for various info while driving. Siri, provided it works reasonably well most of the time, will provide a much safer and more efficient means of completing such tasks. I don't see it replacing existing interface forms, but rather complementing them. It simply becomes another way of interfacing when touching a screen and typing are not desirable.

  • Jasper

    Will the Siri application work in different languages? For example, Norwegian? 

  • Marco Acevedo

    So we will now be talking to our devices. Talking. 
    Oh, the cacophony, the horror.

  • 04 Yam Fz6

    I wonder if you'll hear a bunch of teenagers saying "B-R-B" or "T-T-Y-L" out loud to their phones.

  • keith

    This is a misunderstanding of what an interface is. Siri has TWO interfaces: an aural one, and a visual one. It should be obvious from the demo that the visual feedback provided by Siri is still an important component of the interface.

    Siri mainly makes it easier to INPUT data into the device, because speaking is easier than using your hands (especially in specific scenarios like driving, cooking, etc).

    However it is still easier to understand OUTPUT through visual interfaces since our eyes and reading skills are far faster and more sophisticated than our ears.

  • Coy Chen

    Siri should still be called an interface. It's not a visual interface, but it's still an interface between users and the virtual world. 

  • phil_hendrix

    John, good post. As you acknowledge, much of Assistant's success will depend on how well Apple executes, on both the voice UI and the AI capabilities. We are bullish on the prospects (see "Is the iPhone Assistant Apple's Next Blue Ocean?" at http://bit.ly/nNuOYR) and for voice UIs in general (see report at http://bit.ly/xW9tv and post at http://j.mp/iGNBqk). Clearly, the "suitability" of various UIs, including voice and gesture, varies markedly by situation. While the best use case for voice is while driving, there are other criteria that determine suitability and usefulness (see Framework for Evaluating and Prioritizing Opportunities for Speech-Enabled Apps at http://bit.ly/n9svr3). Despite the prospects, the jury is still out on whether Apple has "hit the sweet spot" with Assistant - Marshall Kirkpatrick (@marshallk) of ReadWriteWeb has a good critique at http://rww.to/nMrBMn. I'll post this comment on Kevin's review as well.
    Dr. Phil Hendrix, immr and GigaOm Pro analyst

  • Daniel Ostrower

    Kevin, Great job putting the development in historical context and clearly articulating its significance. Great piece.

  • Djc123

    It's never going to work. You've got to be on your own to do it, otherwise it's totally obnoxious. Texting, checking website info, etc. can be done discreetly whilst still in conversation.
    If someone is in the room next to you, they are going to keep coming in saying "Sorry, did you just say something?"
    "Oh, I was just checking flight times and sending a text."

    Good in principle, but never going to work, just like the voice functions that already existed for the 3GS and 4. I thought it was brilliant, was driving, and then said:
    "play more like this"
    "dialling 077463826592"

    I hope in years to come it works, one step closer to HAL. I can tell there's going to be a lot of burnt meals when someone prompts their iPhone to set a timer for 13-19 minutes and it goes ahead and sets a time between 30-90.
    Dialect will affect it, accent will affect it.

    This is why we have the phonetic alphabet (alpha, bravo, charlie for a, b, c, etc.): because it's hard to distinguish certain sounds and avoid confusion.

    A lot of it seems pointless too. If you have put the effort into making a cake, you're going to have a timer, or can set it yourself. A lot of recipes specify amounts in ml, fl.oz, g, tsp, dsp, etc, so no need for that.

    If you are out running, and it's urgent, they will phone you and you will answer. If it's not urgent, they will text you and you can reply within 30 minutes, unless you are Haile Gebrselassie and won't be back for 6 hours.

    If I was in a new town and wanted some restaurant, I wouldn't want to air it to everyone that I didn't know where I was.

    What would be good is if the iPhone could detect sayings automatically, like "iPhone, where are you?" and have it make sounds so you can locate it.

    Has anyone experienced the new voice overs? Do they actually work? Or do you have to speak the Queen's English in an anechoic chamber with perfect diction and clarity for recognition?

  • JB

    What does it matter? From your post you already made up your mind whether it will work or not... Despite the fact you have never used the technology.

  • Tamhabeeb

    Siri will only work on the iPhone 4S. So don't talk about the 3GS & 4. It is a new feature for the 4S only & don't talk rubbish

  • KEWE

    Wow... Does this mean my teenagers will be speaking instead of thumbing? Speaking into a phone a message that might as well have been delivered with the same voice, into an old school phone and received by an actual ear... Is it possible that Apple has just created a voice text broker? The first step to getting us back to the original interface... The one that Adam and Eve used... HI (Human interface)... But then again Adam had an Apple, and Eve had an Apple, right?

  • Don Keane

    Kevin,

    Good article and I couldn’t agree more – voice is the most user friendly functionality to be incorporated into any consumer or product because it’s easy, quick and intuitive. We have seen how the successful integration of voice in enterprise products enhances productivity in a completely natural way and I would anticipate to see similar, great results from Apple’s Siri.

    - Don Keane, Angel

  • murliman

    Dear Jack - if you had read my comment carefully (something you likely do every time except this one time - I'm being charitable here, okay?) - you might have noted that I did not claim that John Pavlus lacked journalistic skills; indeed, I suggested that he could likely do better than he did in this one piece.  For John to claim that a natural language interface like Siri is equivalent to 'no interface at all' especially when his readership includes lay, non-technical people is highly irresponsible journalism. He is misinforming the lay public by implying that something can be called an interface only when it is visual in nature.  He is contributing to the dumbing down of the public, and that is singularly infuriating. 

  • Kev

    If Siri works efficiently that would be great, but are the majority of people really going to use this? If not running or cooking, are people really going to talk into their iPhones while sitting on the couch to text someone?

  • Jack O'Neill

    No, Murliman. If you are so confident about John Pavlus' lack of journalistic skills, and you intend to vocalise your thoughts, you should have something to back it up. Also, give him a break. This was only released today. If you don't like it, don't read his posts. 

  • murliman

    Dear Guest-shilling-for-John-Pavlus: my portfolio is irrelevant to this issue: is a natural language interface an interface, or is it no interface? If you cannot answer this question straight, don't even bother to engage with me.