Those are iconic words to any Star Trek fan: Captain Picard’s preferred drink order, as spoken to the Enterprise’s always-listening computer system. They also represent a vision of voice-activated, ubiquitous computing interfaces that took hold in sci-fi books and film nearly 70 years ago.
It’s taken a long time for our world to sync up to Picard’s, but with the advent of Kinect and voice recognition systems, it’s finally happening. "Writers from Heinlein to Doctorow envisioned a far more heads-up, cooperative, and simple way of engaging with technology," explains a team of Frog technologists behind RoomE, a heads-up computing project. “We think we’ve also reached a point in our technological evolution that will allow many of these visions to become real.” Devised and built by Frog Fellow Jared Ficklin, RoomE is one of the first working examples of a type of ubiquitous computing interface that has only been imagined for decades.
Installed at Frog’s Austin offices, RoomE’s hardware is all off-the-shelf: two Kinects provide an array of voice and motion sensors, while a series of projectors is positioned to turn any surface into a screen. The software is custom-built, using the Microsoft Speech Recognition Engine, computer vision, and the Kinect SDK. “A lot of people seem to be working on various pieces, but no one has yet to combine them,” says Ficklin. “That’s one reason we had to build one for ourselves.” On YouTube, Ficklin uploads videos of himself pointing to certain lamps in an office and, in a friendly Texas lilt, telling RoomE to “turn on those lights.” He also turns them back off with a hand motion and then orders takeout from Yelp, tweets, controls the thermostat, and checks out the CCTV feed from the backyard, all using voice commands. "RoomE leverages the elegance of context," the team at Frog explains. "The system knows who is in the room and what is in the room via computer vision. Therefore when the command is issued, RoomE calculates where the command came from, and, by putting the context together, the results can be placed where they best serve the user."
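The context logic the team describes can be sketched in a few lines of Python. This is a rough illustration, not Frog's implementation: the `Thing` type, the `pointed_at` and `nearest` helpers, and every coordinate below are hypothetical. It also assumes the vision system has already reduced the room to 2D positions for the speaker, the lamps, and the projection surfaces.

```python
import math
from dataclasses import dataclass


@dataclass
class Thing:
    """A lamp or projection surface at a known floor-plan position (hypothetical)."""
    name: str
    x: float
    y: float


def pointed_at(origin, direction, things):
    """Return the thing whose bearing from origin best matches the pointing direction."""
    point_angle = math.atan2(direction[1], direction[0])

    def angular_error(thing):
        bearing = math.atan2(thing.y - origin[1], thing.x - origin[0])
        diff = bearing - point_angle
        # Wrap the difference into [-pi, pi] before taking its magnitude.
        return abs(math.atan2(math.sin(diff), math.cos(diff)))

    return min(things, key=angular_error)


def nearest(origin, things):
    """Return the thing closest to origin, e.g. the surface to project results on."""
    return min(things, key=lambda thing: math.dist(origin, (thing.x, thing.y)))


if __name__ == "__main__":
    speaker = (1.0, 1.0)  # where the voice command came from
    lamps = [Thing("desk lamp", 4.0, 1.2), Thing("floor lamp", 1.0, 5.0)]
    surfaces = [Thing("wall", 0.0, 1.0), Thing("table", 3.0, 3.0)]

    # "Turn on those lights" while pointing along the positive x axis.
    print("switch:", pointed_at(speaker, (1.0, 0.0), lamps).name)
    # Show the result on whichever surface is closest to the speaker.
    print("display on:", nearest(speaker, surfaces).name)
```

A real room would need 3D skeleton data, tolerance for imprecise pointing, and a way to tell speakers apart. The basic idea still holds: resolve "those" and "here" from where the speaker is and where the speaker is pointing.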
Star Trek and Doctorow aren’t the sole basis for RoomE—contemporaneously speaking, the social cost of head-down computing was also an important jumping-off point for developing a radically new interaction model. As handheld technologies have become more ubiquitous, the behavioral dilemmas associated with them have intensified. On the one hand, smartphones and tablets make us more efficient. On the other hand, they make us anxious in a whole host of ways, from the rude friend who texts during dinner to the stress of choosing a restaurant from a Yelp page only one member of the group can see. The strong reaction to Google Glass illustrates further the as-yet-unknown social impact that will inevitably arise from the computing interfaces of the future. “These emotions—fear, shame, frustration, anxiety—all betray an experience gap,” says Ficklin. “The next pattern of computing will move in to close that gap.”
In truth, ubiquitous computing isn’t all that intuitive to humans. Our instinct is that a computer must be, like our bodies, a set of systems enclosed in a protective shell. But as we move beyond that computing pattern, we’ll begin to see computers that act like ecologies, rather than single organisms. A networked ecology of language and gesture sensors might not feel particularly natural to us today, but as an interaction model, it’s far more humanistic. "[It] has the potential to be more heads-up, allowing users to be present in their environment," the designers explain. "Shared input and output offer solutions to the kinds of problems that arise when multiple people try to accomplish the same task via personal devices." In some ways, Google Glass is attempting to solve the same problems by putting a screen between the user’s eye and the world. RoomE, meanwhile, makes the world itself into a screen.
When RoomE, or its offspring, does come to market, it’ll likely be as a set of technologies that make it possible to retrofit your home, piece by piece. Ficklin points to precedents like LuminAR, an MIT-born company that has hybridized a desktop lamp with a pico projector and a camera to create an interface on any surface. “We imagine a form factor like a lightbulb with LED lighting, a camera, and a projector all built in so any surface under lighting could become a display,” he says. The lightbulbs you buy at the store may one day contain integrated sensors and micro-projectors, while light switches might be bundled with sensors and microphones.
For now, RoomE will remain internal, a teaching model and proof of concept spurred by the team’s mantra, "Think by Making. Deliver by Demo." But it’s easy to imagine the impact of the system at a broader scale. "Eventually, room-size computing will touch everything," Ficklin explains. "Where? Anywhere you bring out your smartphone, but then feel slightly shamed by bringing out your smartphone. Anywhere you bring out your smartphone but wish you did not have to unlock it to do what you are about to do. Anytime you hand your smartphone to someone else but really hope they don’t hit the home button and fire up your email and you have to watch them like a hawk. Anywhere you wish you could plug your smartphone into a projector."
Someday, systems descended from experiments like RoomE will transform how we interact with computers on a citywide scale. For now, it’s enough to work out the kinks involved in simply turning off a light. “We feel we’ve found a great new sweet spot between virtuoso hand waving or having to sound like Captain Kirk,” Ficklin says. “Sometimes there is a lot of value in closing the small gaps.”