Microsoft VALL-E Clones Your Voice in 3 Seconds

Kaustubh Katdare • 1 year ago • 8.7k views

AI has turned into a mimicry artist. Microsoft's VALL-E, a neural codec language model, can clone a voice from a recording just 3 seconds long. Given that short audio clip of the target speaker as a prompt, the model synthesizes high-quality, personalised speech in that speaker's voice.

Microsoft researchers trained VALL-E on 60K hours of English speech data, which they describe as hundreds of times larger than the data used by existing text-to-speech (TTS) systems.

According to the research paper, VALL-E produces natural-sounding speech and can preserve the target speaker's emotion and the acoustic environment of the prompt recording.
