Lip-reading AI system from Oxford shows humans how to do it better

Have you ever tried to guess what a person on TV is saying while the volume is muted? It's hard! A new artificial intelligence (AI) system developed by Oxford scientists does it far better than the humans it was tested against. While expert human lip readers guessed only 12% of the words correctly, the Oxford system got nearly 50% right. The scientists trained the AI system by letting it watch thousands of hours of BBC news footage.

This newly developed AI system is called "Watch, Attend and Spell" (WAS), and it was developed in association with Google's #-Link-Snipped-# division. Oxford Ph.D. student Joon Son Chung explained the complexity of the challenge. Consider common words like 'mat', 'cat', 'bat' and 'pat'. Humans make similar mouth shapes for these words, so it is extremely difficult to tell which word is being said.


To address this challenge, the scientists made the system learn which words usually occur together. The system works out the series of mouth shapes and makes smart guesses about the words that will follow. The BBC provided content from popular news programs along with subtitles. A neural network was then trained on this visual data to learn which words were being said and which mouth shapes were associated with them.
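
The idea of using word context to choose between look-alike words can be illustrated with a small sketch. This is not the WAS model, which is a neural network trained on video; it is a much simpler stand-in that counts word pairs in text. The corpus, the function names `bigram_counts` and `pick_word`, and the candidate list are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical toy corpus standing in for subtitle text (not the BBC data).
CORPUS = [
    "the cat sat on the mat",
    "the cat chased a bat",
    "he swung a bat",
    "the cat slept on the mat",
    "give the dog a pat",
]

# Words the article says look alike on the lips.
LOOKALIKES = ["mat", "cat", "bat", "pat"]


def bigram_counts(sentences):
    """Count how often each (previous word, next word) pair occurs."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        counts.update(zip(words, words[1:]))
    return counts


def pick_word(previous_word, candidates, counts):
    """Return the candidate that most often follows previous_word."""
    return max(candidates, key=lambda word: counts[(previous_word, word)])


COUNTS = bigram_counts(CORPUS)

if __name__ == "__main__":
    for previous in ("the", "a"):
        print(previous, "->", pick_word(previous, LOOKALIKES, COUNTS))
```

In this toy corpus, the word before the ambiguous mouth shape is enough to tip the guess one way or the other. A real system would weigh far more context and combine it with the visual evidence itself.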

Better lip-reading technology will allow for greater accuracy and speed in speech-to-text services. It could perhaps lead to real-time speech-to-text, where the system observes the mouth and uses its movements to create subtitles. Existing speech-to-text systems are not as effective in noisy environments. A lip-reading system could help solve that problem.

Lip-reading is very hard. According to Oxford computer science researchers, hearing-impaired lip readers achieve about 52% accuracy, while Georgia Tech researchers found that only 30% of speech can be seen on the lips. With deep-learning systems that keep improving, the problem can be solved to a greater extent.

Refer to the original research paper published by Oxford researchers for details of the study: "#-Link-Snipped-#"

Source: Towards a lip-reading computer - BBC News
