Facebook today acquired Wit.ai, a Y Combinator startup founded 18 months ago to create an API for building voice-activated interfaces. Wit.ai already has 6,000 developers on its platform who have built hundreds of apps.
Wit.ai’s platform will remain open and free, which suggests Facebook wants to use the technology to draw developers into its Build-Grow-Monetize loop: they get help building apps, but eventually either pay Facebook for ads to grow, or monetize by hosting Facebook’s ads and splitting the revenue.
As part of Facebook, Wit.ai could help the company offer voice control development tools alongside its Parse development platform, aid with voice-to-text input for Messenger, improve Facebook’s understanding of the semantic meaning of voice, and create a Facebook app you can navigate through speech.
“Wit.ai has built an incredible yet simple natural language processing API that has helped developers turn speech and text into actionable data,” Facebook tells me. “We’re excited to have them onboard.”
The Stripe Of Voice Command APIs
The Wit.ai product lets developers add a few lines of its code to instantly build in speech recognition and voice control. Without it, developers would need the expertise, time and resources to build a whole voice-recognition system themselves.
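To make “turning speech and text into actionable data” concrete, here is a toy sketch of the kind of output such an API hands back: a named intent plus the details extracted from the sentence. This is not Wit.ai’s code or API. The `parse_utterance` function, the `ParsedCommand` type and the hand-written patterns are all hypothetical, and a real service would learn from example sentences rather than rely on fixed regular expressions.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ParsedCommand:
    """Structured result for one utterance (hypothetical shape, for illustration)."""
    intent: str
    entities: dict = field(default_factory=dict)


# Hypothetical hand-written patterns. A hosted natural language service would
# infer intents and entities from training examples instead.
_PATTERNS = [
    ("set_alarm", re.compile(
        r"wake me (?:up )?at (?P<time>\d{1,2}(?::\d{2})?\s*(?:am|pm)?)", re.I)),
    ("send_message", re.compile(
        r"(?:tell|message|text) (?P<contact>\w+) (?:that )?(?P<body>.+)", re.I)),
]


def parse_utterance(text: str) -> ParsedCommand:
    """Return the first matching intent and its named entities, else 'unknown'."""
    for intent, pattern in _PATTERNS:
        match = pattern.search(text)
        if match:
            entities = {name: value.strip()
                        for name, value in match.groupdict().items()}
            return ParsedCommand(intent, entities)
    return ParsedCommand("unknown")


if __name__ == "__main__":
    print(parse_utterance("Wake me up at 7:30 am"))
    print(parse_utterance("Tell Anna that I'm running late"))
```

An app would branch on `intent` and read the values out of `entities`, which is the part developers would otherwise have to build and tune themselves.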
“Facebook has the resources and talent to help us take the next step,” Wit.ai wrote in a blog post about the acquisition. “Facebook’s mission is to connect everyone and build amazing experiences for the over 1.3 billion people on the platform – technology that understands natural language is a big part of that, and we think we can help.”
Wit.ai’s co-founder Alexandre Lebrun previously sold his “Siri for enterprise” voice command virtual assistant startup VirtuOz to Nuance. He then brought Wit.ai through Y Combinator’s Winter 2014 class, where TechCrunch named it one of the top 8 startups. Wit.ai went on to raise a $3 million seed round in October from Andreessen Horowitz, New Enterprise Associates and SV Angel.
You can get a feel for the capabilities of its voice interface API from this video:
Earlier this year, Lebrun told us the idea was to take the “Twilio or Stripe model” for telecom services and payments and apply it to voice interfaces, offering them as an API.
Speech-To-Text For Messenger
Facebook has also been staffing up its Language Technology Group, which I hear may be working on voice-to-text input for Messenger. Imagine speaking hands-free to Messenger, having it transcribe your speech to text, and then telling it to send the message.
Google and Apple both have their own voice command systems, such as Apple’s Siri, which you can use to transcribe text in iOS apps. But both are designed for humans talking to machines through rote voice commands for actions like search. Facebook may focus instead on interpreting the inflection and colloquialisms that humans use when talking to each other.
Wearables are a fast-growing trend, and most have screens too small to type on. That means voice command is poised to become a lot more important for third-party apps over the next few years.
Facebook has had huge success jumping on the mobile app development craze with its acquisition of Parse. That mobile-backend-as-a-service has grown from supporting 60,000 apps in April 2013, when Facebook bought it, to over 500,000 now.
Facebook may now hope to give Wit.ai help with engineering, recruiting and distribution as it did with Parse. With the social network’s help, Wit.ai could blossom into the premier way that third-party developers add voice control to their apps.