University of Rochester and Adobe Are Frontrunners in Microsoft COCO Image Captioning Challenge

Computer scientists from the University of Rochester and Adobe have teamed up to outperform Google, Microsoft, Baidu/UCLA, Stanford University, University of California Berkeley, University of Toronto/Montreal, and other leading companies and research institutes in the Microsoft COCO Image Captioning Challenge, using a new hybrid approach to automatic, computer-generated image captioning.

The team explained that they combined two established approaches, known as "top-down" and "bottom-up", to engineer a novel method that produces noticeably more accurate captions for a given image. In the top-down approach, the system first interprets the image as a whole and then generates words from that overall impression, following a structured sequence. In the bottom-up approach, it first detects words corresponding to individual objects in the image and then combines them into a sentence.
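The combination described above can be illustrated with a toy sketch. This is not the authors' actual model; it is a minimal, hypothetical example in which a bottom-up pass detects attribute words with confidence scores, a top-down pass supplies relevance scores for the sentence generated so far, and an attention step weights the two together. All names and numbers here are invented for illustration.

```python
import math

def softmax(scores):
    """Normalize raw scores into attention weights that sum to 1."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Bottom-up pass (hypothetical output): attribute words detected in the
# image, each with a detector confidence score.
detected_attributes = [("baby", 2.0), ("toothbrush", 1.5), ("mouth", 1.0)]

# Top-down pass (toy stand-in): a relevance score for each attribute given
# the caption generated so far; a real model would compute this from the
# language model's hidden state.
relevance = [0.5, 1.2, 0.3]

# Attention combines the two: each detected attribute is weighted by both
# its detector confidence and its current relevance to the sentence.
weights = softmax([conf + rel for (_, conf), rel in zip(detected_attributes, relevance)])

for (word, _), w in zip(detected_attributes, weights):
    print(f"{word}: attention weight {w:.2f}")
```

In this toy example "toothbrush" ends up with the highest weight, mimicking how attention can steer the caption toward an object a purely top-down pass might miss.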

[Image: computer-captioning]
Google caption: "A baby is eating a piece of paper."
Rochester ATT caption: "A baby with a toothbrush in its mouth."
Other groups in the challenge have also tried to mix these two approaches, but they mostly fixed attention on a single predetermined aspect of an image. The winning team instead let the algorithm itself decide which aspects matter most. Computer image captioning draws heavily on two major research areas of artificial intelligence: computer vision, which lets the system 'learn' from large sets of images, and natural language processing, which handles forming words and assembling them into sentences.

The project's team leader, Professor Jiebo Luo, explained that the model was trained on a wide range of images and accompanying texts. Its key goal was to identify the most important features of an image and assign each a semantically correct word matching the image's meaning. The full paper was presented at the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

Source: "Paying attention to words, not just images, leads to better captions", University of Rochester News Center
