Serving as a stark reminder that there are people on the Internet who are way, way too damned clever, the guys over at the iPhone design/development house Applidium claim to have cracked open Siri to take an unsanctioned look at its (her? his?) inner workings. In a rare (but quite welcome. I mean, by us. Probably not by Apple) move, they’ve gone on to do a rather detailed debriefing of how they got through.
So, what does this mean to you? Theoretically, it means that support for Apple’s voice-powered portable assistant could be hacked onto not only devices like the iPhone 4, but anything from laptops to Android phones as well. As that “theoretically” implies, though, there’s a bit of a catch.
The catch: in the end, anything attempting to communicate with Siri’s backend needs to have a valid iPhone 4S identification string, unique to each 4S. In one-off experiments like this one, spoofing that string with one pulled from an actual 4S is somewhat simple — Apple wouldn’t (/couldn’t) ever really notice.
If someone were to hack together an Android app and distribute it, though, the massive influx of requests all originating from the same unique ID would almost certainly trigger a blacklisting. Unless the app had a massive pool of authentic unique IDs to rotate through, the fishy activity would be pretty easy to discern.
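To make that concrete, here’s a rough sketch of how a backend might spot one ID being shared by a crowd of clients. This is purely hypothetical: nothing here reflects how Apple actually monitors Siri traffic, and the function name, the threshold, and the sample log are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical threshold: how many distinct source addresses a single
# device ID could plausibly show up from within one time window.
MAX_SOURCES_PER_ID = 5


def find_suspicious_ids(requests, max_sources=MAX_SOURCES_PER_ID):
    """Return the device IDs seen from more than max_sources distinct sources.

    `requests` is an iterable of (device_id, source_address) pairs.
    """
    sources_by_id = defaultdict(set)
    for device_id, source in requests:
        sources_by_id[device_id].add(source)
    return {
        device_id
        for device_id, sources in sources_by_id.items()
        if len(sources) > max_sources
    }


# Made-up log: one phone used normally, plus one ID baked into a widely
# distributed app and therefore arriving from 50 different addresses.
log = [("id-A", "10.0.0.1")] * 20
log += [("id-B", f"10.0.1.{n}") for n in range(50)]

print(find_suspicious_ids(log))  # {'id-B'}
```

A pool of rotating IDs would dodge a check like this only if each ID stayed under whatever threshold the server used, which is why the pool would need to be large.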
I’d highly recommend reading Applidium’s full rundown of the process, but here’s the tl;dr breakdown:
- By connecting a Siri-equipped iPhone 4S to a local router and then dumping data as it came through, they realized that Siri was sending all of its data to a server that we’ll refer to as “Guzzoni”.
- All traffic to Guzzoni went over the HTTPS protocol. With the “S” in HTTPS standing for “Secure”, this traffic wasn’t subject to simple packet sniffing. So they had a new idea: make a fake Guzzoni server, and see what came through on the other end.
- After a good bit of ridiculously clever SSL certificate trickery, they got Siri sending commands to their fake server. With each command comes the “X-Ace-Host” string, which appears to be unique to each iPhone 4S.
- After figuring out how Apple was compressing (read: not encrypting) the data, Applidium was able to decompress it and parse out a rough sketch of exactly what was being sent (including which audio codec Apple was using), and what Siri expected in return.
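The compressed-versus-encrypted distinction in that last step is worth a quick illustration. The sketch below assumes zlib, chosen only as a common example since the breakdown above doesn’t name the algorithm, and the “captured” bytes are a made-up stand-in rather than real Siri traffic. The point is simply that decompression needs no key.

```python
import zlib


def inflate_captured_body(body: bytes) -> bytes:
    """Reverse zlib compression on a captured body. No key or secret is needed."""
    return zlib.decompress(body)


# Stand-in for bytes a fake server might capture (hypothetical content).
original = b"hypothetical request body: audio frames and metadata would go here"
captured = zlib.compress(original)

# The compressed bytes look opaque on the wire...
print(captured != original)  # True

# ...but anyone holding them can recover the original exactly.
print(inflate_captured_body(captured) == original)  # True
```

Had the body been encrypted instead, recovering it would have required a key that the eavesdropper doesn’t hold.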
With that process done, Applidium attempted to talk to Siri without any iPhone 4S in the equation. Their first challenge? Speech-to-text from a laptop running a custom script. Sure enough: it worked. Siri chewed through the sound file (a recording of them saying “autonomous demo of Siri”), didn’t bat an eye (as their tool was using their iPhone 4S’ actual unique ID), and returned a mountain of data detailing what Siri heard and how sure it was about each word.
Incredible. The Applidium guys have provided a few tools for others to recreate their steps — but, as it currently stands, there’s not much that can be done to take this beyond a rather cool proof-of-concept.