Come On, Microsoft: Siri Is Making You Look Terrible

Last week, Microsoft overhauled the Xbox 360. The update brought dozens of new features, but there was one I was particularly excited about: when paired with a Kinect, the new interface was said to pack voice recognition support pretty much everywhere.

As I noted in my initial Kinect review well over a year ago, the Kinect’s voice system was the one bit I found particularly disappointing. After finally seeing someone get voice right with Siri, the idea that the 360 might be getting a wonderful voice interface had me beyond excited.

Alas, it still sucks.

Now, let’s get one thing clear: I really, really like the Xbox 360. It’s one of very, very few devices I consider a personal favorite. If I had to up and sell everything I owned, the 360 would be one of the last things to go. I’ve owned one since launch day, and have run three of them into the ground since. Suggesting I have a bias against the 360 (as commenters are wont to do whenever we criticize anything) would be like saying Google execs have a bias against private jets.

But this new voice crap… yikes.

If Siri’s greatest strength is that it (generally) allows users to speak naturally, the 360’s newfound greatest weakness is the exact opposite: it requires people to speak like robots. Very dumb, slow-speaking robots who have no idea what they want until presented with a finely groomed list of options.

Take my battle last night, for example. I wanted to watch the TV show Grimm, which is available on Hulu.

How might I do this? “Xbox, play Grimm on Hulu,” right? Hellll no.

Without searching on Bing (we’ll get into that later), the process was: Turn on Xbox. Wait 15 seconds for the Kinect to start up. “Xbox, Quickplay” (only because I’d loaded Hulu recently; otherwise it’d be a few more commands), “Xbox, Hulu Plus”, “TV”, “Title A-Z”, “View All”, “G” (it hears “J”), “Go Back”, “G”, “Next Page”, “Next Page”, “Next Page”, and finally: “Grimm”, “Resume Show”. Thirteen commands to do one thing.

Why not just search in Hulu with my voice and be done with it? Because you can’t. You can bring up the search interface… which promptly tells you to grab your controller, because voice isn’t supported here. Because that process totally makes sense.

So, why not just search via Bing on the homescreen, as you now can? I tried. Oh, how I tried.

“Xbox, Bing Grimm”. Results: Anne Graham.

“Xbox, Bing Grimm”. Results: Andy Grammar.

“Xbox, Bing G-R-I-M-M”. Results: C’est ne ma’am. What the hell?

Now, that’s just an example of the interface’s failings; it’s not to say that Siri would handle this specific goal any better. While it turns up some third-party results (Yelp, for example), Siri can’t actually search for content within third-party apps at all. With Siri, however, I’m constantly surprised by what I can do. With the 360, I’m constantly surprised by what I can’t. Siri can set my alarms, send my texts, find nearby businesses, and heck, it’ll even tell me a joke. The 360 can’t even do the things it outright says it can.

To put it another way: in the 10 days since the new UI rolled out, I haven’t been pleasantly surprised by a voice interaction once. I have, however, given up on voice and reached for the controller more times than I can count.

Think of the Kinect’s advantages: it has access (in most cases) to a constant Internet connection with relatively massive bandwidth. It has constant power, rather than having to work around battery drain. It’s running on a far more powerful device, with (stand-alone!) listening hardware that essentially never has to move. The 360 never has to figure out what I’m saying in a crowded bar, while driving, or at a ball game; it just sits in my living room, with an absurd amount of power behind it.

The 360’s new voice interface should be twice as smart, twice as fast, and twice as surprisingly wonderful as Siri. Instead, it just makes me want to break my TV.