VS298: Unsolved Problems in Vision

One of the goals of vision science is to understand the nature of perception and its neural substrates. There are now many well-established techniques and paradigms in both psychophysics and neuroscience to address problems in vision. However, knowing how to frame these questions for investigation is not necessarily obvious. Nervous systems present us with stunning complexity, and the purpose of perception itself is deeply mysterious. The goal of this seminar course is to step back and ask, what are the important problems that remain unsolved in vision research, and how should these be approached empirically? The course will consist of alternating weeks of discussion and guest lectures by vision scientists who will frame their views of the core unsolved problems. Interdisciplinary groups of students will devise a practical research plan to address an unsolved problem of their choice.

Instructors: Stan Klein, Jerry Feldman, Bruno Olshausen, and Karl Zipser
GSI: Dan Coates

Enrollment information:

VS 298 (section 2), 2 units
CCN: 66478

Meeting time and place: Tuesday 6-8, 560 Evans (Redwood Center conference room)

Email list:

  • Seminar mailing list subscribe
  • Lecture series mailing list subscribe


Weekly schedule:

Date Topic/Reading
Sept. 2 Introduction

Sept. 9 Methodology in vision science (Stan Klein)

  • Double-judgment psychophysics: problems and solutions pdf Read pp. 1560-1567. This will give a glimpse into some of the issues involved with the relationship between detecting and identifying an object. The second part of the paper is more complicated.
  • Measuring, estimating, and understanding the psychometric function: A commentary pdf I (Stan Klein) was an editor of a special issue of “Perception & Psychophysics” and I wrote the summary article, not only commenting on a number of the articles but also trying to clarify some misunderstood aspects of the field.
  • Psychophysics: A Practical Introduction site This is the text by Kingdom and Prins that I’ve used when teaching psychophysics methods. I suggest reading Chapters 2 & 3. Some of the dichotomies in Chapter 2 are directly relevant to a number of unsolved problems in vision. Some might even be insoluble.

Marcus background discussion

Sept. 16 Evening seminar, focus on student projects: form groups and discuss proposal topics grant_format
Sept. 19
(Friday) 11:00 a.m., 5101 Tolman
Gary Marcus lecture: Computational diversity and the mesoscale organization of the neocortex video
The human neocortex participates in a wide range of tasks, yet superficially appears to adhere to a relatively uniform six-layered architecture throughout its extent. For that reason, much research has been devoted to characterizing a single “canonical” cortical computation, repeated massively throughout the cortex, with differences between areas presumed to arise from their inputs and outputs rather than from “intrinsic” properties. There is as yet no consensus, however, about what such a canonical computation might be, little evidence that uniform systems can capture abstract and symbolic computation (e.g., language) and little contact between proposals for a single canonical circuit and complexities such as differential gene expression across cortex, or the diversity of neurons and synapse types. Here, we evaluate and synthesize diverse evidence for a different way of thinking about neocortical architecture, which we believe to be more compatible with evolutionary and developmental biology, as well as with the inherent diversity of cortical functions. In this conception, the cortex is composed of an array of reconfigurable computational blocks, each capable of performing a variety of distinct operations, and possibly evolved through duplication and divergence. The computation performed by each block depends on its internal configuration. Area-specific specialization arises as a function of differing configurations of the local logic blocks, area-specific long-range axonal projection patterns and area-specific properties of the input. This view provides a possible framework for integrating detailed knowledge of cortical microcircuitry with computational characterizations. With Adam Marblestone, MIT and Tom Dean, Google
Sept. 23 Marcus discussion (postponed)
Feldman background discussion

  • Feldman, J. (2008). From molecule to metaphor: A neural theory of language. MIT press. (look at Chapters 1, 2, 9, and 26 before class) pdf
  • Feldman, J. (2013). The neural binding problem(s). Cognitive neurodynamics, 7(1), 1-11. pdf
  • Feldman, J. & Narayanan, S. (2014). Affordances, Actionability, and Simulation. Affordances Workshop, Robotics Science and Systems 2014, Berkeley, CA pdf
  • Feldman, J. (2010). Ecological expected utility and the mythical neural code. Cognitive neurodynamics, 4(1), 25-35. pdf
  • F. T. Sommer: Neural oscillations and synchrony as mechanisms for coding, communication and computation in the visual system. Chapter in: The New Visual Neurosciences, Eds.: Leo M. Chalupa and John S. Werner, MIT Press (2014) pdf_contents
Sept. 30 Discuss student projects
Wed. Oct. 1, 4:15 p.m., 489 Minor Hall Feldman lecture: The neural binding problem(s) and related mysteries video
As with many other “problems” in vision and cognitive science, “the binding problem” has been used to label a wide range of tasks of radically different behavioral and computational structure. These include a “hard” version that is currently intractable, a feature-binding variant that is productive routine science and a variable-binding case that is unsolved, but should be solvable. The talk will cover all these and some related problems that seem intractably hard as well as some that are unsolved, but are being approached with current and planned experiments.
Oct. 7 Feldman discussion
Malik background discussion

  • R. Girshick, J. Donahue, T. Darrell and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation”. Proc. of CVPR 2014. pdf
  • P. Arbelaez, J. Pont-Tuset, J. Barron, F. Marques and J. Malik, “Multiscale Combinatorial Grouping”, Proc. of CVPR 2014. pdf
  • B. Hariharan, P. Arbelaez, R. Girshick and J. Malik, Simultaneous Detection and Segmentation. ECCV (7) 2014: 297-312. pdf
  • S. Gupta, R. Girshick, P. Arbelaez, J. Malik: Learning Rich Features from RGB-D Images for Object Detection and Segmentation. ECCV (7) 2014: 345-360. pdf
Tue. Oct. 14, 6 to 8 p.m. 560 Evans Jitendra Malik lecture: The Three R’s of Computer Vision: Recognition, Reconstruction and Reorganization video
Oct. 20, 12-1:30 p.m., Minor 489 Harold Bedell lecture: Contour interaction: as far from the muddling crowd? video
Contour interaction describes the interference with target recognition that occurs in the presence of nearby flanking edges. As one of the pioneers in this research, Flom distinguished between contour interaction and crowding, in which contributions to spatial interference can derive also from additional factors, such as inaccurate eye movements and attentional processes. In the normal fovea, contour interaction and crowding have a similar magnitude and operate over a similar spatial extent. Both foveal contour interaction and crowding are reduced when the luminance of the stimulus is decreased. Unlike the fovea, the magnitude and extent of contour interaction in peripheral vision are considerably more limited than crowding. Further, peripheral contour interaction and crowding are not affected substantively by target luminance. Indeed, the magnitude and extent of peripheral contour interaction are similar for photopic and scotopic targets. These results suggest that the contributions of specific mechanisms may differ for foveal and peripheral contour interaction and crowding.

  • Siderov, J., Waugh, S. J., & Bedell, H. E. (2013). Foveal contour interaction for low contrast acuity targets. Vision research, 77, 10-13. pdf
  • Coates, D. R., & Levi, D. M. (2014). Contour interaction in foveal vision: A response to. Vision research, 96, 140-144. pdf
  • Siderov, J., Waugh, S. J., & Bedell, H. E. (2014). Foveal contour interaction on the edge: Response to ‘Letter-to-the-Editor’ by Drs. Coates and Levi. Vision research, 96, 145-148. pdf
Oct. 21 Malik discussion
Nakayama and Shimojo background discussion

  • Nakayama, K. (1999). Mid-level vision. In R. A. Wilson & F. C. Keil (Eds.), The MIT encyclopedia of the cognitive sciences. Cambridge: MIT Press pdf
  • Nakayama, K. (2010) “Vision going social.” The science of social vision. Adams, R.B. Jr., Ambady, N., Nakayama, K. & Shimojo, S. (Eds) Oxford University Press pdf
  • Nakayama, K. and Martini, P. (2011) Situating Visual Search. Vision Research, 51, 1526-1537. pdf

(All Nakayama pubs available here)

  • Shimojo, S. (2014). Postdiction: its implications on visual awareness, hindsight, and sense of agency. Frontiers in psychology, 5. pdf
Oct. 28, 6-8 p.m., 560 Evans Ken Nakayama lecture: The scientist’s choice: solving, explaining, discovering . . . .
Nov. 3 (Monday) 12:00 p.m. 489 Minor Hall Shinsuke Shimojo lecture: Postdiction: its implications on visual awareness, hindsight, and sense of agency video
Nov. 3 (Monday) 3:30-4:30 p.m. 560 Evans Hall Discussion with Shinsuke Shimojo
Nov. 4 Nakayama and Shimojo discussion
Wandell background discussion

  • To appear: Computational modeling of responses in human visual cortex. BA Wandell, J Winawer, KN Kay. In Brain Mapping: An Encyclopedic Reference (edited by Thompson and Friston). pdf

(Friday) Nov. 14 11 a.m., 560 Evans Hall Brian Wandell lecture
Nov. 18 Consciousness discussion
Nov. 25 Koch background discussion

  • Koch, C. Project MindScope pdf
  • Tsuchiya, N., & Koch, C. (2008). The relationship between consciousness and attention. The neurology of consciousness: Cognitive neuroscience and neuropathology, 63-78. pdf
  • Klein, S. A. (1993). Will robots see? Chapter in Spatial Vision in Humans and Robots, Cambridge University Press, 184-199. pdf
  • Tononi, G., & Koch, C. (2014). Consciousness: Here, There but Not Everywhere. arXiv preprint arXiv:1405.7089. pdf
  • Scientific American article
  • Scientific American article
Dec. 2, 4-6 p.m., 125 Li Ka Shing Christof Koch lecture: Unsolved Problems in Vision: Consciousness.
Evening seminar: Koch discussion


Additional Materials

  • recent special issue of CurrOpinNeuro journal
  • Olshausen BA (2013) Perception as an Inference Problem. pdf
  • Olshausen BA (2012) 20 years of learning about vision: Questions answered, questions unanswered, and questions not yet asked. In: 20 Years of Computational Neuroscience (Symposium of the CNS 2010 annual meeting) pdf
  • Kitaoka, A (2014) Color-dependent motion illusions in stationary images and their phenomenal dimorphism. Perception advance online publication pdf
  • O’Regan, J. K., & Noë, A. (2001). A sensorimotor account of vision and visual consciousness. Behavioral and brain sciences, 24(05), 939-973. pdf
  • Bruno Olshausen lecture (1 July 2014) 20 Years of Learning About Vision: Questions Answered, Questions Unanswered, and Questions Not Yet Asked video
  • Solari, S. V. H., & Stoner, R. (2011). Cognitive consilience: primate non-primary neuroanatomical circuits underlying cognition. Frontiers in neuroanatomy, 5. pdf
  • Dyson, Freeman. The Case for Blunders. The New York Review of Books, 6 March 2014. pdf
  • Machine-Learning Maestro Michael Jordan on the Delusions of Big Data and Other Huge Engineering Efforts, 20 Oct 2014, Lee Gomes, IEEE Spectrum link
  • Yann LeCun responds to Mike Jordan’s Spectrum interview link
  • Kravitz, D. J., Saleem, K. S., Baker, C. I., Ungerleider, L. G., & Mishkin, M. (2013). The ventral visual pathway: an expanded neural framework for the processing of object quality. Trends in cognitive sciences, 17(1), 26-49. pdf
  • Vinyals, O. et al. Show and Tell: A Neural Image Caption Generator. 2014 arXiv.1411.4555v1 pdf
  • Koch, C., & Tononi, G. (2011). A test for consciousness. Scientific American, 304(6), 44-47. pdf