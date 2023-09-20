In addition to getting a generative AI-powered upgrade and the ability to continue conversations without repeating the wake word “Alexa,” Amazon’s voice assistant is going to gain a more natural-sounding voice. The company today introduced an updated text-to-speech engine that is more aware of context, including the user’s emotions and tone of voice, which allows Alexa to respond with similar emotional variation in its output.

The company demoed the new voice, which sounded less robotic and more expressive, something it said was powered by large transformers trained on different languages and accents.

For example, if a customer asked for an update on their favorite sports team and the team had won its latest game, Alexa would be able to respond in a joyful voice. If the team had lost, however, Alexa would sound more empathetic.

“And we’re working on a new model—which we refer to as speech-to-speech—again powered by massive transformers. Instead of first converting a customer’s audio request into text using speech recognition, and then using an LLM to generate a text response or an action, and then text-to-speech to produce audio back—this new model will unify these tasks, creating a much richer conversational experience,” said SVP of Alexa Rohit Prasad.

Amazon said Alexa will be able to express laughter and surprise, and even offer “uh-huhs” that encourage users to continue the conversation.