After ChatGPT and DALL-E, meet VALL-E, the text-to-speech AI that can mimic anyone’s voice

Last year saw the emergence of artificial intelligence (AI) tools that can create images, artwork or even video with a text prompt.

There were also big steps forward in AI writing, with OpenAI’s ChatGPT causing widespread excitement – and fear – about the future of writing.

Now, just days after the start of 2023, another powerful AI use case has come into the spotlight – a text-to-speech tool that can flawlessly mimic a human voice.

Developed by Microsoft, VALL-E can take a three-second recording of someone’s voice and play that voice back, turning written words into speech with realistic intonation and emotion depending on the context of the text.

Trained on 60,000 hours of English speech recordings, it can generate speech in “zero-shot” situations, meaning without prior examples or training in a specific context or situation.

Introducing VALL-E in a paper published by Cornell University, the developers explained that the recording data consisted of more than 7,000 unique speakers.

The team says their text-to-speech (TTS) system used hundreds of times more data than existing TTS systems, helping them overcome the zero-shot problem.

The tool is not currently available for public use, but it raises questions about safety, given that it could potentially be used to generate speech of any text in someone’s voice.

However, its creators have provided a demonstration showing several three-second speaker prompts and the text-to-speech tool in action, with the voice mimicked correctly.

Along with the speaker prompt and VALL-E output, you can compare the results to the “ground truth” (the actual speaker reading the prompt) and the “baseline” output of current TTS technology.

Microsoft is betting big on AI

Microsoft is investing heavily in AI and is one of the backers of OpenAI, the company behind ChatGPT and DALL-E, a text-to-image or art tool.

The software giant invested $1 billion (€930 million) in OpenAI in 2019, and a report this week said it plans to invest another $10 billion (€9.3 billion) in the company.
