In a very short and bizarre demo, Amazon showed how Alexa can mimic a dead relative’s voice to read bedtime stories or accomplish other tasks that involve “human empathy.” The feature is still experimental, but according to Amazon, Alexa only needs a few minutes of audio to impersonate someone’s voice.
The demo came in the middle of Amazon’s annual re:MARS conference, an industry gathering that focuses on machine learning, space exploration, and a few other exciting things. In it, a little boy asks Alexa if Grandma can read The Wizard of Oz, and the speaker responds accordingly using a synthesized voice.
“Instead of the voice of Alexa reading the book, it’s the voice of the child’s grandmother,” Rohit Prasad, Amazon’s chief scientist for Alexa AI, told a quiet crowd after the demo.
Prasad notes that “many of us have lost someone we love” to the pandemic and claims that AI speech synthesis can “make their memories live on.” This is obviously a controversial idea: it’s morally questionable, we don’t know how it could affect mental health, and we’re not sure how far Amazon wants to push the technology. (I mean, can I use a dead relative’s voice for GPS navigation? What’s the point here?)
Amazon’s advanced speech synthesis technology is also worrying. Previously, Amazon duplicated the voices of celebrities like Shaquille O’Neal using several hours of professionally recorded content. But the company now claims that it can copy a voice with just a few minutes of audio. We’ve already seen how speech synthesis technology can aid in fraud and theft, so what happens next?
We don’t know if Amazon will ever introduce this speech synthesis feature on its smart speakers. But audio deepfakes are basically unavoidable. They are already a big part of the entertainment industry (see Top Gun: Maverick, for example), and Amazon is just one of many companies trying to clone voices.
Source: Amazon via The Verge