Google’s AI can now perfectly imitate a human voice

Arguably the most important part of a virtual assistant is its voice, especially if you want it to catch on in the mainstream. The most intelligent AI in the world won’t sit well with the average user if it sounds like a chainsaw robot when it answers questions, and that’s why we’ve seen big companies focus so much on making their sidekicks sound natural. 

Google has been at the forefront of that thanks to its heavy investment in neural networks, and it’s paid off with a new system from the search giant called Tacotron 2. The name may well be pronounced “tack-o-tron” and carry some linguistic meaning related to speech, but I’m going to assume Google’s naming stuff after tacos here.

Tacotron 2 is a text-to-speech system that relies on two neural networks: the first translates text into a spectrogram, which is a map of how strong each audio frequency is over time, and the second turns that spectrogram into audible speech. It’s a complex two-part system that pairs Google’s own work with WaveNet, the audio model from DeepMind, its sibling company under Alphabet.
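To make that two-stage flow a little more concrete, here’s a toy sketch in Python. To be clear, this is not Google’s code and there are no neural networks in it: both stages are simple made-up stand-ins, and every name and number (`text_to_spectrogram`, `spectrogram_to_audio`, `BIN_FREQS`, the sample rate) is an assumption for illustration only. It just shows the shape of the pipeline: text goes in, a spectrogram comes out of stage one, and audio samples come out of stage two.

```python
import math

SAMPLE_RATE = 8000                         # samples per second (assumed toy value)
FRAME_LEN = 80                             # audio samples per spectrogram frame (assumed)
BIN_FREQS = [200.0, 400.0, 800.0, 1600.0]  # hypothetical frequency bins, in Hz


def text_to_spectrogram(text):
    """Stage 1 stand-in: emit one frame per character, one magnitude per bin.

    A real system learns this mapping; here the low bits of each character
    code simply switch frequency bins on or off.
    """
    frames = []
    for ch in text:
        code = ord(ch)
        frames.append([float((code >> i) & 1) for i in range(len(BIN_FREQS))])
    return frames


def spectrogram_to_audio(frames):
    """Stage 2 stand-in: sum one sine wave per bin, scaled by its magnitude."""
    samples = []
    for frame_index, frame in enumerate(frames):
        for n in range(FRAME_LEN):
            t = (frame_index * FRAME_LEN + n) / SAMPLE_RATE
            value = sum(
                mag * math.sin(2 * math.pi * freq * t)
                for mag, freq in zip(frame, BIN_FREQS)
            )
            # Divide by the bin count so every sample stays within [-1, 1].
            samples.append(value / len(BIN_FREQS))
    return samples


def synthesize(text):
    """Chain the two stages: text -> spectrogram -> audio samples."""
    return spectrogram_to_audio(text_to_spectrogram(text))
```

Calling `synthesize("hi")` returns a plain list of floating-point samples, one frame’s worth per character. The point of splitting things this way is that the spectrogram acts as a handoff between the two halves, so either stage could be swapped out without touching the other.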

And, believe it or not, the AI’s voice is virtually indistinguishable from a human’s. Google has posted audio samples that pit the two against each other, along with phrases that show how the system shifts its inflection and emphasis based on capitalization and punctuation, and the results are just about perfect.

Of course, technology like this will keep getting better over time, but that doesn’t take away from how impressive Google’s results already are.

source: Google
via: Quartz


About the Author: Jared Peters

Born in southern Alabama, Jared spends his working time selling phones and his spare time writing about them. The Android enthusiasm started with the original Motorola Droid, but the tech enthusiasm currently covers just about everything. He likes PC gaming, Lenovo's Moto Z line, and a good productivity app.