How many pixels make an image?

ANTONIO TORRALBA

doi:10.1017/S0952523808080930

Published online by Cambridge University Press: 01 January 2009

Affiliation: Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts
Address correspondence and reprint requests to: Antonio Torralba, Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, 32-D432, 32 Vassar Street, Cambridge, MA 02139. E-mail: torralba@csail.mit.edu

Abstract

The human visual system is remarkably tolerant to degradation in image resolution: human performance in scene categorization remains high whether low-resolution images or multimegapixel images are used. This observation raises the question of how many pixels are required to form a meaningful representation of an image and identify the objects it contains. In this article, we show that very small thumbnail images at the spatial resolution of 32 × 32 color pixels provide enough information to identify the semantic category of real-world scenes. Most strikingly, this low resolution permits observers to report, with 80% accuracy, four to five of the objects that the scene contains, despite the fact that some of these objects are unrecognizable in isolation. The robustness of the information available at very low resolution for describing the semantic content of natural images could be an important asset in explaining the speed and efficiency with which the human brain comprehends the gist of visual scenes.
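
To give a sense of how little data a 32 × 32 color thumbnail holds, the following is a minimal Python sketch of reducing an image to that resolution. The abstract does not say how the thumbnails were produced, so the block averaging used here is an assumption, and the function name `downsample` and the synthetic test image are hypothetical.

```python
def downsample(image, size=32):
    """Block-average an RGB image down to size x size pixels.

    `image` is a list of rows, each row a list of (r, g, b) integer tuples.
    Block averaging is an assumed method, not the one used in the article.
    """
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("image must be at least size x size pixels")
    thumb = []
    for ty in range(size):
        # Range of source rows that map onto this output row.
        y0, y1 = ty * height // size, (ty + 1) * height // size
        row = []
        for tx in range(size):
            # Range of source columns that map onto this output column.
            x0, x1 = tx * width // size, (tx + 1) * width // size
            count = (y1 - y0) * (x1 - x0)
            sums = [0, 0, 0]
            for y in range(y0, y1):
                for x in range(x0, x1):
                    for c in range(3):
                        sums[c] += image[y][x][c]
            # Integer mean of each color channel over the block.
            row.append(tuple(s // count for s in sums))
        thumb.append(row)
    return thumb


# Synthetic 256 x 256 gradient image standing in for a photograph.
image = [[(x, y, 128) for x in range(256)] for y in range(256)]
thumb = downsample(image)
values = len(thumb) * len(thumb[0]) * 3
print(len(thumb), len(thumb[0]), values)  # 32 32 3072
```

Under this assumption, a thumbnail at the resolution named in the abstract carries only 32 × 32 × 3 = 3072 color values.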

Keywords

Scene recognition; Object segmentation; Gist; Thumbnails; Natural images; Blobs

Type: Natural Scene Statistics and Natural Tasks
Information: Visual Neuroscience, Volume 26, Issue 1, January 2009, pp. 123–131

DOI: https://doi.org/10.1017/S0952523808080930
Copyright: © Cambridge University Press 2009
