The iPhone 4S is on the streets, and accompanying it is a helpful young virtual assistant named Siri. You’ve probably heard something about Siri by this point, as tech blogs and the media writ large have been yammering about Siri’s technology at full blast. Since the beginning, and even more so since Siri was acquired by Apple in 2010, there’s been a lot of excitement about voice recognition technology.
This hit fever pitch with Siri’s native launch on the 4S. Of course, Siri isn’t perfect. She’s been down and out and has experienced a backlash due to limitations in voice recognition, inability to open apps, etc. But many people (among them, one Eric Schmidt) take another stance: Siri is game-changing, and not only that, she poses a significant threat to Google (and beyond).
In his letter to the Senate Subcommittee on Antitrust, Competition Policy, and Consumer Rights, Google Executive Chairman Eric Schmidt talks quite a bit about Siri and how it represents “an entirely new approach to search technology” — one that is a “significant development”. Basically, he says that Siri is the biggest threat to Google’s control of search, well, ever. The writing, as they say, is on the wall.
So, we thought there was no better way to get the lowdown on Schmidt’s statement to the Senate Subcommittee and the future of the feisty young virtual assistant than by talking to one of the people who played an integral role in Siri’s early life.
Gary Morgenthaler, a partner at the eponymous VC firm, Morgenthaler Ventures, was the first investor in Siri in 2008 and served on the company’s board of directors until it was acquired by Apple. Morgenthaler was also an early investor in Nuance Communications, now a leader in voice recognition (its technology is also now used in the iPhone 4S).
“Eric might be more right than he knows,” says Morgenthaler. “A million blue links from Google is worth far less than one correct answer from Siri,” he adds. These are very early days for Siri, but already he hears that “Siri’s usage has been 10x more than what Apple anticipated.” The big potential, of course, lies in Apple opening up Siri to outside developers, which could create a new wave of voice-enabled apps and give Apple an edge over Android and other mobile platforms. (Morgenthaler also gets into the challenges Apple must overcome before it can open up Siri.)
If people start using Siri to bypass search, that is a huge threat to Google. But how would Siri make money? It wouldn’t be from advertising. In Morgenthaler’s mind, the biggest opportunity is getting in the middle of transactions. “Corporations will be happy to skip advertising altogether, if they can go straight to transactions,” he says.
Below, Morgenthaler responds to a number of questions on Siri, the future of voice technology, and MG’s post here.
Google Chairman Eric Schmidt testified before a Senate subcommittee investigating Google’s dominance in Web search in September 2011. During the proceedings, Schmidt said Siri was a serious threat to Google’s search business. What do you make of that?
At the time Schmidt said that, Siri was just a 3-week-old beta product. While some may say it’s a smokescreen for the Senate, Eric might be more right than he knows. He’s smart and understands the potential threat of Siri to disrupt Google’s search business some time in the future. What he may not realize is just how quickly that will happen. Not only is the Siri team within Apple extremely capable, but Siri is learning every day and adding streams and streams of user data to her artificial-intelligence knowledge base.
I predict she will be able to answer and perform transactions as quickly as a human within 2 or 3 years. The real turning point will be when, and if, Apple opens its API to 3rd-party developers. Android will suddenly not be at parity with iOS – it won’t be an even playing field anymore because Google will simply not have a response to Siri’s natural language, artificial intelligence capability.
Developers will flock to Apple’s platform because they can build the next generation of cool, new apps. I hear Siri’s usage has been 10x more than what Apple anticipated – which, I suspect, accounts for last week’s outages. Siri has captured people’s imagination. After 50 years of sci-fi movies in which you can talk to your computer and have it understand and perform tasks for you, that time is finally here. We’ve crossed a threshold, though it’s only the beginning.
How is Siri different from what Google has currently? Google has had voice tech for some time.
Siri allows free-flowing natural language interaction with computers, whereas Google requires you to speak like a robot. Google Voice Actions (GVA) is not bad for what it does, but Siri leaves it far behind as the last vestige of the 20th century in human-computer interfaces. Speaking technically, GVA supports speech recognition and simple agent behaviors, whereas Siri additionally understands language, models knowledge and applies logic. As a result, Siri creates a far more intelligent and human-like experience for users.
Beyond that, Google is robotic and devoid of personality. The Siri team went to great lengths to create an appealing persona and personality to drive user engagement. In particular, we defined a lifelike virtual personal assistant with the best qualities of human personal assistants: efficient, knowledgeable, professional, compliant, uncomplaining, and witty with a slight attitude. At all costs, she could not be robotic or insipid, like Microsoft’s “Bob.” Instead, she must always be fresh, unexpected, entertaining and engaging. Siri must consistently “surprise and delight,” such that users become attached to using her. TechCrunch also addressed this question in a truly excellent article by MG Siegler.
As you said, Siri is in beta and only used for internal apps in 15 use cases, e.g. calendaring, music, etc. When do you think they will go prime time and open up to a world of 3rd-party developers?
First, there is an extraordinary distance to go before Siri is ready for 3rd-party developers. Apple has much work to do in perfecting the basic Siri experience. With a good Internet connection and proper diction, Siri answers correctly 9 times out of 10. To compete with humans, however, Siri must get to the correct answer 29 times out of 30, or 97% of the time. Interestingly, once machines give consistently correct answers, humans would rather deal with them than with other humans; witness ATMs.
It takes less energy to deal with a machine than a human to accomplish your purpose. Beyond consistency, there are MANY more tasks that Siri can perform on your behalf. She can manage travel plans and reservations, coordinate local services, find and plan entertainment, purchase e-commerce goods and serve as a gateway to 3000 e-commerce service APIs on the Web. Beyond that, Siri can learn your preferences and transactional information, so that she can automate any of these transactions without user involvement.
Regarding 3rd parties, this is harder. First, there is the question of who will provide the service: Apple in the iCloud or the 3rd party developer. If 3rd parties are to provide the service, then Apple must license Siri server technology to them. If Apple provides Siri service in the iCloud, a pricing model for 3rd parties must be developed. Someone must manage provisioning, and ramp service up and down, to meet changing demand. Finally, there is the matter of quality control and branding. Siri is built from complicated technology that is difficult to use correctly. 3rd parties must be taught to integrate with it and use it effectively, as well as to debug it when things go wrong. This will not be easy.
There is also the matter of persona and brand. Siri as a character now has a brand meaning to the masses in terms of competence, attitude, persona and engagement. Third parties, very reasonably, will want to create their own characters and personae. Apple will not be happy if they trample or spoil what has been created with Siri.
What is the real threat of Siri to Google?
A million blue links from Google is worth far less than one correct answer from Siri. People don’t really want search engines. Rather, they want “do” engines. They want to get things done. Siri is the precursor to a revolution in search that provides far more intelligence in filtering results. The end goal is a single correct answer. Siri is a major advance in this direction. Perhaps more importantly, Siri doesn’t stop with giving you URLs (a.k.a., blue links).
Instead, Siri takes you all the way through to pending transactions, requiring only your confirmation to complete them. Siri’s better filtering of results and automation of parameter entry for forms and transactions saves enormous human effort. In this way, Siri points to the future of search.
How do you see Siri threatening Google’s advertising and ecommerce dollars?
Ultimately, corporations don’t want impressions; they want customer transactions. Corporations will be happy to skip advertising altogether, if they can go straight to transactions. In the age of “do” engines, Google becomes much less relevant.
I know you don’t have inside information, but what is possible with Siri in terms of future revenue streams?
Again, there are 3000 e-commerce service APIs open and available on the Internet. As a platform, Siri is designed to integrate with any of them. Moreover, Siri is designed to be extensible and readily incorporate new domains of knowledge and expertise. Referring e-commerce transactions on a cost-per-action (CPA) basis is the highest value marketing lead referral on the Internet. In this regard, Siri represents a long-term threat to Google; Siri can disintermediate Google from advertisers altogether.
What is the future of Siri’s technology?
Siri was architected as an extensible platform to which new domains (e.g., e-commerce shopping, personal memory, sports, blogging, news, social networking, etc.) could be added in a matter of weeks. Effectively, the plan was for Siri to become significantly smarter with new product releases adding multiple new domains each quarter. Beyond that, Siri was designed to be extensible by 3rd party developers to add their specific domain expertise to the core domains understood by Siri (e.g., travel, entertainment, restaurants, local services, Twitter, messaging, etc.).
The goal was to make Siri an open platform for Siri developers to build valuable independent businesses. The Apple developer community includes more than 100,000 developers.
Opening Siri to this developer community would create a potential tsunami of new voice-enabled applications of all kinds. However, for Apple, it also presents a new and unfamiliar class of problems. For example, is Siri a licensed server software product or an Internet software-as-a-service? If the latter, what is the pricing, and how does Apple manage provisioning, etc.?
Likewise, the Siri “executive assistant” personality now has branded characteristics as a positive experience in the minds of users. How does Apple defend this branded experience against those who might vary it, trade upon it or parody it? There are MANY more such questions for Apple to answer before it opens its revolutionary conversational user interface to developers.
When it does, developers by the thousands will shift their energies to the Apple/Siri platform, thereby disadvantaging Android, as well as the flagging RIM and Microsoft platforms.
What has the most monetizable value and solves our biggest problems?
The Siri team thought long and hard about this question. Ultimately, we concluded that a relatively small portion of Internet searches were directed at those domains that resulted in commercial transactions, e.g., hotel reservations, restaurant reservations, ticket bookings, travel bookings, e-commerce purchases, local services procurement, etc. Deep knowledge of a relatively small number of domains was required to manage these transactions.
The long tail of the Internet is epistemologically very hard to comprehend. However, the domains that have commercial value are relatively few in number and not that difficult to understand.
Therefore, when Siri was an independent company, its plan was to map these domains deeply and seamlessly to automate transactions for its users within them. For example, “Buy that Steve Jobs biography book and send it to my dad”; “Send a dozen yellow roses to my wife”; “Book me the usual table for 2 tonight at 8 p.m. at Giovanni’s”; and “Get me 2 box seats for the Giants game on Saturday.”
Then comes the question of what solves our biggest problems. Ultimately, Siri’s value is that of automation and removing “friction” on the Internet. Siri achieves this by: (1) understanding speech input in natural language form, (2) mapping user requests against its knowledge base (i.e., ontological domains) and (3) activating software “agents” to interact with Internet service providers to fulfill user requests.
All this is easier said than done. User problems in using the Internet are amplified on mobile devices because their screens and keyboards are small and cumbersome — and page downloads are slow.
Siri removes these frictions. It eliminates thumb-typing entirely, and it dramatically reduces the number of page downloads. In this sense, Siri finally and truly enables the mobile Internet.
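As a rough illustration of the three steps Morgenthaler lists above (understand the request, map it to a domain, hand it to an agent), here is a minimal Python sketch. It is not Siri's implementation: the keyword table, the agent functions and every name in it are hypothetical, and speech recognition is replaced by plain text input.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Intent:
    domain: str      # which commercial domain the request falls in
    utterance: str   # the normalized text of the request


# Step 2 stand-in: a toy "knowledge base" of domains and trigger words.
# A real system would use a far richer ontology; this table is an assumption.
DOMAIN_KEYWORDS: Dict[str, List[str]] = {
    "restaurants": ["table", "reservation", "dinner"],
    "tickets": ["seats", "tickets", "game"],
    "flowers": ["roses", "flowers", "bouquet"],
}


def transcribe(spoken_text: str) -> str:
    """Step 1 stand-in: pretend speech is already text and normalize it."""
    return spoken_text.strip().lower()


def map_to_domain(utterance: str) -> Optional[Intent]:
    """Step 2: match the request against the known domains."""
    words = set(utterance.replace(".", " ").replace(",", " ").split())
    for domain, keywords in DOMAIN_KEYWORDS.items():
        if words & set(keywords):
            return Intent(domain, utterance)
    return None


# Step 3 stand-ins: each "agent" would call an outside service.
# Here they only return a pending transaction awaiting user confirmation.
def restaurant_agent(intent: Intent) -> str:
    return f"PENDING restaurant booking: {intent.utterance}"


def ticket_agent(intent: Intent) -> str:
    return f"PENDING ticket purchase: {intent.utterance}"


def flower_agent(intent: Intent) -> str:
    return f"PENDING flowers order: {intent.utterance}"


AGENTS: Dict[str, Callable[[Intent], str]] = {
    "restaurants": restaurant_agent,
    "tickets": ticket_agent,
    "flowers": flower_agent,
}


def handle(spoken_text: str) -> str:
    """Run the three steps end to end for one request."""
    intent = map_to_domain(transcribe(spoken_text))
    if intent is None:
        return "Sorry, I can't help with that yet."
    return AGENTS[intent.domain](intent)


if __name__ == "__main__":
    print(handle("Send a dozen yellow roses to my wife"))
    print(handle("Get me 2 box seats for the Giants game on Saturday."))
```

The point of the sketch is the shape rather than the matching logic: because only a handful of domains carry commercial value, a small table of domains and agents covers the requests that end in a transaction, and anything outside it falls through to a polite refusal.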