Amazon already gave Alexa the ability to whisper, and now it’s rolling out another way to change the assistant’s speaking style: a “newscaster” voice. Starting today, when U.S. customers ask Alexa “what’s the latest?” to hear the day’s news, Alexa will respond in a voice similar to a professional newscaster’s delivery.
The voice knows which words should be emphasized for a more realistic delivery of the news, explains Amazon.
To achieve this new voice, Amazon took advantage of recent developments it made with Neural TTS technology, or NTTS. This technology delivers a more natural-sounding voice, and allows Alexa to adapt her speaking style based on the context of your request. For the newscaster voice, NTTS produced speech with better intonation that emphasizes the right words in a sentence, Amazon says.
In addition, Amazon scientists used an approach called direct waveform modeling that applies deep learning to produce the speech signal.
The company detailed this technology in November, saying at the time that its latest text-to-speech system could learn the newscaster style from just a few hours of training data. The development could pave the way for Alexa and other services to adopt different speaking styles for other contexts in the future, the researchers noted.
“The ability to teach Alexa to adapt her speaking style based on the context of the customer’s request opens the possibility to deliver new and delightful experiences that were previously unthinkable,” said Andrew Breen, senior manager with the TTS Research team at Amazon, in a statement. “We’re thrilled that our customers will get to listen to news and Wikipedia information from Alexa in this new way.”
Amazon shared an audio sample of the previous technology, followed by one of the new newscaster voice.
The company also showed off how NTTS technology could allow Alexa to employ a neutral voice when reading Wikipedia information.