Image description with a goal: Building efficient discriminating expressions for images

Amir Sadovnik, Yi I. Chiu, Noah Snavely, Shimon Edelman, Tsuhan Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

9 Scopus citations

Abstract

Many works in computer vision attempt to solve different tasks such as object detection, scene recognition or attribute detection, either separately or as a joint problem. In recent years, there has been a growing interest in combining the results from these different tasks in order to provide a textual description of the scene. However, when describing a scene, there are many items that can be mentioned. If we include all the objects, relationships, and attributes that exist in the image, the description would be extremely long and not convey a true understanding of the image. We present a novel approach to ranking the importance of the items to be described. Specifically, we focus on the task of discriminating one image from a group of others. We investigate the factors that contribute to the most efficient description that achieves this task. We also provide a quantitative method to measure the description quality for this specific task using data from human subjects and show that our method achieves better results than baseline methods.

Original language: English
Title of host publication: 2012 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2012
Pages: 2791-2798
Number of pages: 8
State: Published - 2012
Externally published: Yes
Event: 2012 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2012 - Providence, RI, United States
Duration: Jun 16 2012 → Jun 21 2012

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919

Conference

Conference: 2012 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2012
Country/Territory: United States
City: Providence, RI
Period: 06/16/12 → 06/21/12
