Microsoft VALL-E Clones Your Voice in 3 Seconds

By Kaustubh Katdare on Thu, Feb 16, 2023 9:07 AM (IST)

AI's turned into a mimicry artist. Microsoft's VALL-E, a neural codec language model, can clone a voice from a sample just 3 seconds long. The model does not retrain on the target speaker: it takes the short audio clip as a prompt and synthesizes high-quality, personalised speech in that voice.

Microsoft researchers trained VALL-E on 60K hours of English speech, which the research paper describes as hundreds of times more data than existing text-to-speech (TTS) systems use.

The research paper indicates that VALL-E produces natural-sounding speech while preserving the target speaker's emotion and the acoustic environment of the sample clip.

About Author:
Kaustubh Katdare
Kaustubh Katdare is an avid technology blogger and tech enthusiast.