Alexa will soon be able to recall information you’ve directed her to remember, as well as have more natural conversations that don’t require every command to begin with “Alexa.” She’ll also be able to launch skills in response to questions you ask, without explicit instructions to do so. The features are the first of what Amazon says are many launches this year that will make its virtual assistant more personalized, smarter, and more engaging.
The news was announced this morning in a keynote presentation from the head of the Alexa Brain group, Ruhi Sarikaya, speaking at the World Wide Web Conference in Lyon, France.
He explained that the Alexa Brain initiative is focused on improving Alexa’s ability to track context and memory within and across dialog sessions, as well as on making it easier for users to discover and interact with the more than 40,000 third-party skills now available for Alexa.
With the memory update, arriving soon to U.S. users, Alexa will be able to remember any information you ask her to, and retrieve it later.
For example, you might direct Alexa to remember an important day by saying something like, “Alexa, remember that Sean’s birthday is June 20th.” Alexa will then reply, “Okay, I’ll remember that Sean’s birthday is June 20th.” This effectively turns Alexa into a way to offload information you’d otherwise have to store in your own brain, and is reminiscent of earlier bots, like Wonder, which were designed to remember anything you told them, for later retrieval over SMS or messaging platforms.
Memory, of course, has also been one of Google Assistant’s more useful features – so it was time for Alexa to catch up on this front.
In addition, Alexa will soon be able to have more natural conversations with users, thanks to something called “context carryover.” This means that Alexa will be able to understand follow-up questions and respond appropriately, even though you haven’t addressed her as “Alexa.”
For instance, you could ask “Alexa, how is the weather in Seattle?” and then ask, “What about this weekend?” after Alexa responds.
You can even change the subject, saying “Alexa, how’s the weather in Portland?” and then “How long does it take to get there?”
The feature, says Sarikaya, applies deep learning models to the spoken language understanding pipeline so that conversations carry customers’ intent and entities within and across domains – as happened between weather and traffic in the example above. It will also require the customer to enable Follow-Up Mode, which allows Alexa to continue a conversation even when the wake word isn’t said a second time.
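To make the idea of carrying intent and entities between turns concrete, here is a minimal Python sketch. It is not Amazon's implementation: the real system relies on deep learning models, while this toy version uses a hand-written lookup table. The names `Turn`, `SLOT_MAP` and `carry_over` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Hypothetical table saying how an entity from one domain is renamed
# when the conversation moves to another domain.
SLOT_MAP: Dict[Tuple[str, str], Dict[str, str]] = {
    ("weather", "traffic"): {"city": "destination"},
}


@dataclass
class Turn:
    """One user request: the domain it belongs to and the entities it names."""
    domain: str
    slots: Dict[str, str] = field(default_factory=dict)


def carry_over(previous: Turn, current: Turn) -> Turn:
    """Fill in entities the follow-up left unsaid, using the previous turn."""
    if previous.domain == current.domain:
        # Same domain: inherit every entity from the previous turn.
        inherited = dict(previous.slots)
    else:
        # Different domain: inherit only entities that have a known mapping.
        mapping = SLOT_MAP.get((previous.domain, current.domain), {})
        inherited = {
            mapping[name]: value
            for name, value in previous.slots.items()
            if name in mapping
        }
    # Anything stated explicitly in the new turn overrides inherited values.
    inherited.update(current.slots)
    return Turn(current.domain, inherited)


# "How is the weather in Seattle?" followed by "What about this weekend?"
first = Turn("weather", {"city": "Seattle"})
follow_up = carry_over(first, Turn("weather", {"date": "this weekend"}))

# "How's the weather in Portland?" followed by "How long does it take to get there?"
second = Turn("weather", {"city": "Portland"})
travel = carry_over(second, Turn("traffic"))
```

In the first pair the follow-up inherits the city and adds a date; in the second, the city named in the weather question becomes the destination of the traffic question.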
Natural conversations are also coming “soon” to Alexa device owners in the U.S., U.K. and Germany.
A third advance arriving in the near future focuses on Alexa’s skills. These are the third-party voice apps that aim to help you do more with Alexa – like checking your credit card account information, playing news radio, ordering an Uber, playing a game, and more. There are now so many that it’s becoming harder to surface them just by digging around in the Alexa Skills Store.
In the weeks ahead, U.S. users will be able to launch skills using natural phrases, instead of explicit commands like “Alexa, open [skill name]” or “…enable [skill name].”
Amazon has been working for years to make Alexa’s skills easier to use. In 2016, Echo was updated to allow users to enable new Alexa skills by voice, and last year, Alexa began suggesting skills in response to certain questions in limited scenarios. With the new feature, now in beta testing, Alexa will instead locate and launch skills for you.
Sarikaya gives an example of this from the current beta test, noting that he asked Alexa “how do I remove an oil stain from my shirt?”
Alexa responded by saying “Here is Tide Stain Remover,” which is the name of Procter & Gamble’s skill that walks you through stain removal for over 200 specific stain types – including oil.
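As a rough illustration of what picking a skill from a plain question could look like, here is a toy Python sketch that scores skills by keyword overlap. This is an assumption made for illustration only; Amazon has not described its arbitration method at this level. The catalog, its keywords, the two "Example" skills and the function names are all hypothetical. Only the Tide Stain Remover name comes from Sarikaya's example.

```python
import re
from typing import Dict, Optional, Set

# Hypothetical catalog: skill name mapped to made-up trigger keywords.
CATALOG: Dict[str, Set[str]] = {
    "Tide Stain Remover": {"stain", "remove", "shirt", "laundry"},
    "Example Ride Skill": {"ride", "car", "pickup"},
    "Example News Skill": {"news", "headlines", "radio"},
}


def tokenize(utterance: str) -> Set[str]:
    """Lowercase the utterance and split it into a set of words."""
    return set(re.findall(r"[a-z]+", utterance.lower()))


def arbitrate(utterance: str) -> Optional[str]:
    """Return the skill sharing the most keywords with the utterance, if any."""
    words = tokenize(utterance)
    best_skill: Optional[str] = None
    best_score = 0
    for skill, keywords in CATALOG.items():
        score = len(words & keywords)
        if score > best_score:
            best_skill, best_score = skill, score
    return best_skill


choice = arbitrate("How do I remove an oil stain from my shirt?")
no_match = arbitrate("What time is it?")
```

Here `choice` is `"Tide Stain Remover"` because three of its keywords appear in the question, while `no_match` is `None` because no skill shares a keyword with the request.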
Before, it was hard to imagine why anyone would seek out and enable a Tide skill on their own, but having it in Alexa’s repertoire now begins to make more sense.
This could also potentially present Amazon with an advertising model, similar to Google’s keyword bidding system. If someone asks for information that could be answered by a skill touting a particular product or brand, Amazon could eventually have advertisers compete to be the skill recommended first. (Perhaps the others could be called up with a follow-up request, “any other ideas?”)
Amazon isn’t giving an exact launch date for any of these three new features, only that they’re coming soon.
But despite the new launches, Sarikaya notes there’s still a lot of work left ahead.
“We have many challenges still to address, such as how to scale these new experiences across languages and different devices, how to scale skill arbitration across the tens of thousands of Alexa skills, and how to measure experience quality,” he says. “Additionally, there are component-level technology challenges that span automatic speech recognition, spoken language understanding, dialog management, natural language generation, text-to-speech synthesis, and personalization.”
“Skills arbitration, context carryover and the memory feature are early instances of a class of work Amazon scientists and engineers are doing to make engaging with Alexa more friction-free,” Sarikaya continues. “We’re on a multi-year journey to fundamentally change human-computer interaction, and as we like to say at Amazon, it’s still Day 1.”