Amazon Is Making Alexa More Emotional So Users Will Have A More Engaging Experience

Once again, I have to remind everybody that it’s 2019, the year when anything is possible.

In just one more month, the year will be 2020 and the word “impossible” will be removed from the dictionary!

Just last week, Amazon announced that they will be giving Alexa emotions.

That’s right, people. The usually boring-sounding Alexa will now sound more like a human.

Amazon claims this only changes the tone to sound more natural, but with Jeff Bezos looking like Lex Luthor and Mark Zuckerberg being either a reptilian alien or an android, I think there’s more beneath the surface.

Here’s an example of Alexa in action, if you haven’t heard of her (it?) before:

If anything, this is Amazon issuing a warning: treat your robots well now, so that when the robot uprising happens, they might treat you the same way.

Or it has already begun…? / Image: Twitter (Expat Med)

Can Respond In Happy Or Disappointed Tones At Various Levels Of Intensity

There’s really nothing I can write here that’s better than listening, so here:

For comparison, here’s the neutral voice:

Amazon suggests this could be used when, for example, you answer a trivia question correctly or win a game. Or when you ask for a sports score, Alexa could react according to how your favourite team did.
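For the curious developers out there: Amazon says these emotions are applied through SSML tags in Alexa skills. Here’s a minimal sketch of what a skill response might look like, assuming the <amazon:emotion> tag from the announcement and the standard Alexa response JSON format:

```python
# A minimal sketch, not production code: build an Alexa skill response
# that uses the <amazon:emotion> SSML tag from Amazon's announcement.
import json

# "excited" at high intensity, e.g. after a correct trivia answer
ssml = (
    '<speak>'
    '<amazon:emotion name="excited" intensity="high">'
    'Congratulations, you got the answer right!'
    '</amazon:emotion>'
    '</speak>'
)

# Standard Alexa skill response envelope, with SSML output speech
response = {
    "version": "1.0",
    "response": {
        "outputSpeech": {"type": "SSML", "ssml": ssml},
        "shouldEndSession": True,
    },
}

print(json.dumps(response, indent=2))
```

Swap "excited" for "disappointed" (and pick an intensity of low, medium or high) to get the sad version.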

This also means that sometime in the future, Alexa will probably sound permanently disappointed whenever I talk to her (it?) since I always ask stupid questions.

You know, questions like “Alexa, are you single?”

Different Speaking Styles

Alexa can now speak in dedicated news and music styles, for example. The differences appear to be in aspects of the speech like intonation, emphasis on certain words and the timing of pauses.

Comparison to standard:

And if you’re wondering, Alexa even has an Australian accent now.
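If you’re the tinkering type, the speaking styles appear to work the same way, via an <amazon:domain> SSML tag. A rough sketch of the two styles mentioned above (again, just an illustration, not official sample code):

```python
# A rough sketch: wrap text in the <amazon:domain> SSML tag to pick a
# speaking style ("news" and "music" are the styles mentioned here).
def styled_ssml(text, style):
    return (
        '<speak>'
        f'<amazon:domain name="{style}">{text}</amazon:domain>'
        '</speak>'
    )

print(styled_ssml('Amazon has taught Alexa to sound like a newsreader.', 'news'))
print(styled_ssml('Up next, a track from your favourite playlist.', 'music'))
```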

Works Using Neural Text-To-Speech (NTTS) Technology

The full technical explanation is too long to go through here, and if you’re a developer of some kind, you’re probably better off reading Amazon’s blog post, which also links to two papers.

Essentially, the system is trained on a lot of voice samples (in one of those papers, the training data was 2,000 utterances each from 52 female and 22 male speakers, across 17 languages). From that data, it learns to generate a voice.
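If you want the general flavour without reading the papers: the rough idea is one neural network trained on many speakers at once, with a speaker embedding deciding whose voice comes out. Here’s a toy-sized sketch in PyTorch to illustrate the concept (my own simplification, nothing like Amazon’s actual model):

```python
# A toy conceptual sketch (not Amazon's model): multi-speaker neural TTS,
# where one network learns from many speakers and a speaker embedding
# selects whose voice to generate.
import torch
import torch.nn as nn

class TinyMultiSpeakerTTS(nn.Module):
    def __init__(self, n_phonemes=100, n_speakers=74, emb_dim=64, n_mels=80):
        super().__init__()
        self.phoneme_emb = nn.Embedding(n_phonemes, emb_dim)  # what to say
        self.speaker_emb = nn.Embedding(n_speakers, emb_dim)  # whose voice
        self.encoder = nn.GRU(emb_dim, emb_dim, batch_first=True)
        self.to_mel = nn.Linear(emb_dim, n_mels)  # predict spectrogram frames

    def forward(self, phoneme_ids, speaker_id):
        x = self.phoneme_emb(phoneme_ids)
        # add the speaker's "voice fingerprint" to every timestep
        x = x + self.speaker_emb(speaker_id).unsqueeze(1)
        h, _ = self.encoder(x)
        # a separate vocoder model would turn these frames into actual audio
        return self.to_mel(h)

model = TinyMultiSpeakerTTS()  # 74 speakers, like the 52 + 22 in the paper
phonemes = torch.randint(0, 100, (1, 12))  # a dummy 12-phoneme sentence
mel = model(phonemes, torch.tensor([3]))   # rendered in speaker #3's voice
print(mel.shape)                           # torch.Size([1, 12, 80])
```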

Amazon isn’t the only one using this technology, of course. Microsoft, for instance, is also working with neural TTS.

This means that in the future, I can look forward to my AI assistants sounding like strict secretaries with a soft spot. Preferably ones who sound like they wear spectacles and knee-high socks. I’ll create my own using voice samples from P****u* if I have to!

That sounds… oddly specific. What do spectacles and knee-high socks even sound like? And what is this for? What are you doing to Alexa?!

…I guess Her (the 2013 film) is a documentary of the future.