Google's talking AI is indistinguishable from humans
by Mark Frauenfelder (#3KTWA)

Tacotron 2 is Google's new text-to-speech system, and as heard in the samples below, it sounds indistinguishable from humans.
From Quartz:
The system is Google's second official generation of the technology, which consists of two deep neural networks. The first network translates the text into a spectrogram (pdf), a visual way to represent audio frequencies over time. That spectrogram is then fed into WaveNet, a system from Alphabet's AI research lab DeepMind, which reads the chart and generates the corresponding audio elements accordingly.

Tacotron 2 or Human?

In the following examples, one is generated by Tacotron 2, and one is the recording of a human, but which is which?

| "That girl did a video about Star Wars lipstick." | |
| 1 | |
| 2 | |
| "She earned a doctorate in sociology at Columbia University." | |
| 1 | |
| 2 | |
| "George Washington was the first President of the United States." | |
| 1 | |
| 2 | |
| "I'm too busy for romance." | |
| 1 | |
| 2 | |
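
For readers wondering what the "spectrogram" in the Quartz excerpt actually looks like as data, here is a minimal sketch in plain Python. It is not Tacotron 2 code and it runs in the opposite direction (audio in, spectrogram out), since the real system predicts the spectrogram from text with a neural network. The function name `spectrogram`, the frame and hop sizes, and the test tone are all assumptions chosen for illustration.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return a list of frames; each frame holds one magnitude per frequency bin."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # A Hann window reduces leakage at the frame edges.
        windowed = [
            s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        # Naive discrete Fourier transform, keeping the non-negative frequency bins.
        mags = []
        for k in range(frame_size // 2 + 1):
            acc = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            mags.append(abs(acc))
        frames.append(mags)
    return frames


# Hypothetical input: a 1 kHz sine tone sampled at 8 kHz, 256 samples long.
SAMPLE_RATE = 8000
FRAME_SIZE = 64
tone = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(256)]

spec = spectrogram(tone, frame_size=FRAME_SIZE, hop=32)

# Find the loudest frequency bin in the first frame and convert it to hertz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(len(spec), "frames,", len(spec[0]), "bins per frame, peak at", peak_hz, "Hz")
```

Each row of `spec` is one slice of time and each column is one frequency band, which is the "chart" that the second network (WaveNet, per the excerpt) turns into sound.
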
Soundwave image by T-flex/Shutterstock.