Real Time Vocal Translator By Microsoft

Kaustubh Katdare

@thebigk Oct 21, 2024
I've often envisioned a world where everyone can speak in their own mother tongue and still communicate without any problem. At TechFest 2012, Microsoft demonstrated a technology along those lines and called it 'Photo Real Talking Head'. Read more about the technology here: #-Link-Snipped-#

Looking forward to ideas on how the algorithms for such translators would work 😀

Replies

  • Ankita Katdare

    @abrakadabra Mar 13, 2012

    They demonstrated something similar at TechFest 2011.
    First, they applied a 2-D-to-3-D reconstruction algorithm frame by frame on a 2-D video to construct a 3-D training database.



    As per the description in this video:

    In training, super-feature vectors consisting of 3-D geometry, texture, and speech are formed to train a statistical, multistreamed, Hidden Markov Model (HMM). The HMM then is used to synthesize both the trajectories of geometric animation and dynamic texture. The 3-D talking head can be animated by the geometric trajectory, while the facial expressions and articulator movements are rendered with dynamic texture sequences. Head motions and facial expression also can be separately controlled by manipulating corresponding parameters. The new 3-D talking head has many useful applications, such as voice agents, telepresence, gaming, and speech-to-speech translation.
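To make the description above a bit more concrete, here is a rough Python sketch of the idea, not Microsoft's actual implementation. It assumes diagonal-Gaussian emissions per stream, a plain Viterbi decode driven by the speech stream, and the simplest possible synthesis step (emit each state's mean geometry, with no trajectory smoothing). All function names, stream weights, and array shapes are hypothetical and chosen only for illustration.

```python
import numpy as np

def super_feature_vectors(geometry, texture, speech):
    """Concatenate per-frame geometry, texture and speech features
    into one super-feature vector per frame (shape: T x total_dim)."""
    return np.concatenate([geometry, texture, speech], axis=1)

def stream_log_likelihood(obs, means, variances):
    """Diagonal-Gaussian log-likelihood of every frame under every state.
    obs: (T, D); means, variances: (S, D); returns (T, S)."""
    diff = obs[:, None, :] - means[None, :, :]
    per_dim = np.log(2 * np.pi * variances)[None] + diff ** 2 / variances[None]
    return -0.5 * per_dim.sum(axis=2)

def multistream_log_likelihood(streams, params, weights):
    """Multi-stream emission score: weighted sum of per-stream
    log-likelihoods (the weights are an assumption of this sketch)."""
    return sum(w * stream_log_likelihood(o, m, v)
               for o, (m, v), w in zip(streams, params, weights))

def viterbi(log_b, log_A, log_pi):
    """Most likely state path given emission scores log_b (T, S),
    transition log-probabilities log_A (S, S) and initial log_pi (S,)."""
    T, S = log_b.shape
    delta = log_pi + log_b[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A          # scores[i, j]: state i -> j
        back[t] = np.argmax(scores, axis=0)      # best predecessor of each j
        delta = scores[back[t], np.arange(S)] + log_b[t]
    path = [int(np.argmax(delta))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

def synthesize_geometry(speech, speech_params, geometry_means, log_A, log_pi):
    """Decode states from speech alone, then emit each state's mean
    geometry as a (crude, unsmoothed) animation trajectory."""
    log_b = multistream_log_likelihood([speech], [speech_params], [1.0])
    path = viterbi(log_b, log_A, log_pi)
    return geometry_means[path], path
```

In a real system the state means and variances would be learned from the 3-D training database (for example with Baum-Welch), and the synthesized trajectory would normally be smoothed using dynamic features rather than jumping between state means as it does here.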