And Now For Your Smartphone’s Next Trick: Seeing And Understanding, Courtesy Of Google

Your smartphone hosts a bevy of sensors that do many things within its sleek case. But thanks to a new project, dubbed Tango, by Google’s Advanced Technology And Projects (ATAP) group, your next one might have one more superpower: visual spatial awareness. As in, your next smartphone might be able not only to see, but also to understand its surroundings.

If that seems like something out of science fiction, that’s because it very much could be – the AI assistant in Her, for instance, manages to be so convincingly human because it shares an awareness of the user’s world, including recognizing the environment and putting that into the proper context. Google’s new prototype hardware and development kit, with its Movidius Myriad 1 vision processor platform, is an attempt to give mobile devices exactly this type of “human-like understanding of space and motion,” according to ATAP Program Lead Johnny Lee.

This Isn’t Just A Fancy Camera

To be clear, this isn’t just a new camera for smartphones – it’s more like the visual cortex of the brain, made into a device component like any accelerometer or gyro currently found in smartphones. And to some extent, that’s the battleground for next-gen mobile technology; Apple has its M7 motion coprocessor, for instance, and Qualcomm is building advanced camera processing tech into its SoCs that will allow users to alter focus on pictures after the shot and do much more besides.

But Google’s latest experiment has the potential to do much more than either of these existing innovations. Computer vision is a rich field of academic, commercial and industrial research, the implications of which extend tendrils into virtually every aspect of our lives. Better still, the tech from Google’s partner used to make this happen is designed specifically with battery life as a primary consideration, so it’s meant to be an always-on technology, rather than something that can only be called upon in specific situations.

Contextualizing Requests

So what will this mean in terms of user experience? It’s trite but true to say “expect big changes,” but at this stage the push is to get developers thinking about how to make use of this new tech. So it’s still early to say exactly what kind of software they’ll build to put it into practice. One thing’s clear, though – context will be key.

Google Now has provided some hint of what’s possible when a smartphone has a thorough understanding of its user, and what said user’s needs might be, given time, location and inputs including email, calendar and other overt signals. Combined, those can present a pretty good picture of what we call ‘context,’ or the sum total of circumstances that make up any given situation. But ultimately, as they operate currently, smartphones are effectively working within black boxes, with pinholes cut out sporadically across their surface, letting through shafts of light that partially illuminate but don’t necessarily truly situate.

With an understanding of surroundings, a virtual personal assistant could know that, despite being in the general vicinity of a bus stop, you’re currently shaking hands with a business contact and dropping your bag ahead of sitting down for coffee, for instance. In this situation, providing local bus arrival times isn’t as important as calling up that email chain confirming the meeting in question.

But the value of visual awareness to virtual personal assistants is just one example, and one that’s easy to grasp given recent fictional depictions of the same. Knowing where a phone is being used could also help to bring more far-out concepts to life, including games that change settings, surroundings and characters depending on where they’re played; situational advertising that interacts with nearby multimedia displays and addresses or engages a user personally based on where they are and what types of products they happen to be looking at; even macro-level settings changes to make sure your phone is prepared to suit your needs given your current circumstances.

As for that last example, it could help usher in the type of dynamic mobile OS that Firefox, Google and others have clearly been toying with. Contextual launchers, I argued on a recent TechCrunch Droidcast, aren’t ready for primetime because too often they get the context wrong; again, they’re working inside a black box with only brief and sporadic glimpses to make sense of the world around them, and this new tech stands to bring them out into the open. The result could be devices that don’t need to be manually silenced, or that automatically serve up the right home screen or app for what you need without having to be told to do so.

Immense Data Potential

Of course, Google wouldn’t be Google if this project didn’t have a data angle: The type of information that could potentially be gleaned by a device, carried everywhere by a user, that can not only see but also make sense of its surroundings is tremendously powerful.

Google’s entire business is built around its ability to know its customers, and to know what they want to see at any given time. The search engine was monetized on the back of extremely targeted web-based text ads, which return highly relevant results whenever a user types a query into its engine – an almost foolproof sign that they’re interested in that topic. It sounds like a no-brainer now, but at the dawn of search, this simple equation struck the entire ad industry like a lightning bolt, and it continues to drive behemoth revenue for Mountain View.

Even Google’s famously ambitious “moonshots” all have a thread of that original goal behind their impressive facades of consumer potential, and Tango is no exception. Stated intent, like that expressed by people Googling things, is only one part of the equation when it comes to sussing out a consumer’s desires – anticipating needs before they arise, and understanding needs that a person might not even be aware they have, make up the broader blue sky opportunities for a company like Google. In that context, having a contextually aware smartphone that can observe and interpret its surroundings is almost like putting a dedicated market researcher in the room with any given shopper, at any given time.

As with every single noteworthy mobile tech development of the past decade at least, Project Tango will seek greater access to a user’s life in exchange for more and better services rendered. And once developers start showing us what’s possible when a smartphone can understand where it is and what’s going on, I’m willing to bet users will find the cost in data perfectly acceptable.

Google’s Not Alone

Google isn’t the only company that’s working on perceptive mobile devices, and it won’t be the only one that helps bring gadgets with visual intelligence to market. Last year, Apple acquired PrimeSense, a company that builds motion-sensing tech and helps map 3D space. Qualcomm purchased GestureTek, a similar company, back in 2011.

Location-based tech seemed sci-fi when it was first introduced to mobile devices, but now it’s de rigueur. The same will be true of contextual awareness with the devices we buy tomorrow. Google is the first one to start putting this power in the hands of developers to see what that might mean for the future of software, but it won’t be the only one. Expect to see development come fast and furious on this new frontier in mobile tech, and expect every major player to claim a seat at the table.

Illustrations by Bryce Durbin