Computer vision has advanced rapidly with deep learning, achieving superhuman performance on a few recognition benchmarks. At the core of state-of-the-art approaches to image classification, object detection, and semantic/instance segmentation is sliding-window classification, engineered for computational efficiency. Such piecemeal analysis of visual perception often has trouble getting details right and fails miserably with occlusion. Human vision, on the other hand, thrives on occlusion, excels at seeing wholes and parts, and can recognize objects with very little supervision. I will describe several works that build upon concepts of perceptual organization, integrate multiscale and figure-ground cues, and learn pixel and image relationships in a data-driven fashion, with no annotations at all or with fewer and weaker annotations, in order to deliver more accurate performance that generalizes beyond recognition in a closed world. Our recent works not only capture apparent visual similarity without perceptual organization priors or any feature engineering, but also provide powerful exploratory data analysis tools that can seamlessly integrate external domain knowledge into a data-driven machine learning framework.
Stella Yu received her Ph.D. from Carnegie Mellon University, where she studied robotics at the Robotics Institute and vision science at the Center for the Neural Basis of Cognition. She continued her computer vision research as a postdoctoral fellow at UC Berkeley, and then studied art and vision as a Clare Boothe Luce Professor at Boston College, during which time she received an NSF CAREER award. Dr. Yu is currently the Director of the Vision Group at the International Computer Science Institute (ICSI) and a Senior Fellow at the Berkeley Institute for Data Science (BIDS) at UC Berkeley. Dr. Yu is interested not only in understanding visual perception from multiple perspectives, but also in using computer vision and machine learning to capture and exceed human expertise in practical applications.
I’ll present an approach from mathematical logic which shows how sub-symbolic dynamics may give rise to higher-level cognitive representations of structures, systems of knowledge, and algorithmic processes. This approach posits that learners possess a system for expressing isomorphisms with which they create mental models with arbitrary dynamics. The theory formalizes one account of how novel conceptual content may arise, allowing us to explain how even elementary logical and computational operations may be learned. I provide an implementation that learns to represent a variety of structures, including logic, number, kinship trees, regular languages, context-free languages, domains of theories like magnetism, dominance hierarchies, list structures, quantification, and computational primitives like repetition, reversal, and recursion. Moreover, the account is based on simple discrete dynamical processes that could be implemented in a variety of different physical or biological systems. In particular, I describe how the required dynamics can be directly implemented in an existing connectionist framework. The resulting theory provides an “assembly language” for cognition, in which high-level theories of cognition and computation can be translated into simple and neurally plausible underlying dynamics.
If biology is the study of self-replicating entities, and we want to understand the role of information, it makes sense to see how information theory is connected to the ‘replicator equation’ — a simple model of population dynamics for self-replicating entities. The relevant concept of information turns out to be the information of one probability distribution relative to another, also known as the Kullback–Leibler divergence. Using this, we can get a new outlook on free energy, see evolution as a learning process, and give a clearer, more general formulation of Fisher’s fundamental theorem of natural selection.
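As a rough illustration of the connection between the replicator equation and relative information, the sketch below simulates replicator dynamics for a constant fitness landscape and tracks the Kullback–Leibler divergence of the stable state (all weight on the fittest type) relative to the evolving population. The fitness values and initial distribution are arbitrary toy numbers, and the Euler discretization is an assumption for illustration, not part of the talk:

```python
import numpy as np

def replicator_step(p, f, dt=0.01):
    """One Euler step of the replicator equation dp_i/dt = p_i (f_i - fbar)."""
    mean_fitness = p @ f
    return p + dt * p * (f - mean_fitness)

def kl(q, p):
    """Relative information (Kullback-Leibler divergence) D(q || p)."""
    mask = q > 0
    return np.sum(q[mask] * np.log(q[mask] / p[mask]))

# Toy constant fitness landscape; the fittest type is index 2.
f = np.array([1.0, 2.0, 3.0])
p = np.array([0.6, 0.3, 0.1])   # initial population distribution
q = np.array([0.0, 0.0, 1.0])   # stable state: all weight on the fittest type

divs = []
for _ in range(2000):
    divs.append(kl(q, p))
    p = replicator_step(p, f)

# D(q || p) shrinks as the population "learns" the fittest type,
# matching the view of evolution as a learning process.
assert all(a >= b for a, b in zip(divs, divs[1:]))
```

Note that the Euler step exactly preserves the total probability, since the changes `dt * p_i * (f_i - fbar)` sum to zero; here `D(q || p)` reduces to `-log p[2]`, which decreases monotonically as the fittest type takes over.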