This course provides an introduction to the theory of neural computation. The goal is to familiarize students with the major theoretical frameworks and models used in neuroscience and psychology, and to provide hands-on experience in using these models. Topics include neural network models, supervised and unsupervised learning, associative memory models, recurrent networks, probabilistic/graphical models, and models of neural coding in the brain.

**Instructor**: Bruno Olshausen, 570 Evans, office hours immediately after class

**GSI**: Shariq Mobin, 560 Evans, office hours 2:30 – 4:00 pm on Mondays.

**Lectures**: Tuesdays & Thursdays 3:30-5, 170 Barrows.

**Grading**: based on weekly assignments (60%) and final project (40%)

- Late homework will not be accepted but your lowest homework score will be dropped at the end of the semester
- Each homework will be graded holistically out of 3 points:
  - 3: problems were done correctly aside from minor errors
  - 2: problems were attempted but some portions or concepts were missed
  - 1: something relevant was done but no clear direction or mostly incomplete
  - 0: problems were not attempted

- You have 1 week to dispute a grade from the time it is released on bCourses
- Final project guidelines:
  - 5-page report plus a poster or oral presentation at project presentation day, likely Dec. 11 or 12 of finals week.
  - You are encouraged to work in teams of 3-4 students.
  - The project may explore one of the lab assignments in more depth, either mathematically or computationally, but is not limited to this. It can also be a critical analysis of the history of neural computation, or of the prospects for using these approaches to inform our understanding of the brain.
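
As a rough illustration of how the grading policy above combines, here is a minimal Python sketch. It assumes that all homeworks are weighted equally, that the project score is expressed as a fraction between 0 and 1, and that the 60%/40% split is applied linearly. None of these details is stated in the policy itself, and the function name `course_grade` is hypothetical.

```python
def course_grade(homework_scores, project_fraction):
    """Estimate an overall course grade as a fraction in [0, 1].

    homework_scores: list of holistic scores, each between 0 and 3.
    project_fraction: final project score as a fraction in [0, 1]
                      (an assumption; the syllabus gives no project scale).
    """
    if len(homework_scores) < 2:
        raise ValueError("need at least two homework scores to drop the lowest")
    if not all(0 <= s <= 3 for s in homework_scores):
        raise ValueError("homework scores must be between 0 and 3")
    if not 0.0 <= project_fraction <= 1.0:
        raise ValueError("project_fraction must be between 0 and 1")

    # Drop the single lowest homework score, as the policy describes.
    kept = sorted(homework_scores)[1:]

    # Assumed: equal weighting, so the homework fraction is the mean of the
    # kept scores divided by the 3-point maximum.
    homework_fraction = sum(kept) / (3 * len(kept))

    # Weekly assignments count for 60%, the final project for 40%.
    return 0.6 * homework_fraction + 0.4 * project_fraction


if __name__ == "__main__":
    # Example: one missed homework (score 0) is dropped.
    print(course_grade([3, 3, 2, 0], project_fraction=1.0))
```

Under these assumptions, a single missed assignment does not lower the homework portion, since the lowest score is the one that is dropped.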

**Textbooks**:

- [**HKP**] Hertz, J., Krogh, A., and Palmer, R.G. *Introduction to the theory of neural computation.* (Amazon)
- [**DJCM**] MacKay, D.J.C. *Information Theory, Inference and Learning Algorithms.* (Available online or Amazon)
- [**DA**] Dayan, P. and Abbott, L.F. *Theoretical neuroscience: computational and mathematical modeling of neural systems.* (Amazon)

**Discussion forum**: We have established a Piazza site where students can ask questions or propose topics for discussion.

**Syllabus**

**Aug. 23: Introduction**

- Theory and modeling in neuroscience
- Goals of AI/machine learning vs. theoretical neuroscience
- Lecture slides
- Reading:
  - **HKP** chapter 1
  - Dreyfus, H.L. and Dreyfus, S.E. Making a Mind vs. Modeling the Brain: Artificial Intelligence Back at a Branchpoint.
  - Bell, A.J. Levels and Loops: the future of artificial intelligence and neuroscience.
  - 1973 Lighthill debate on future of AI

- Homework 0:
- Python Installation
- Lab 0 tutorial (start with lab0_1-Introduction-to-Python-for-Scientific-Computing.html, then once in jupyter proceed to lab0_2 and lab0_3)

**Aug. 28, 30: Neuron models**

- Membrane equation, compartmental model of a neuron
- Physics of computation, Analog VLSI
- Lecture slides
- Reading:
  - Mead, C., Chapter 5: Transconductance amplifier from *Analog VLSI and Neural Systems*.
  - Mead, C., Chapter 15: Silicon retina from *Analog VLSI and Neural Systems*.
  - Mead, C., Chapter 1: Introduction and Chapter 4: Neurons from *Analog VLSI and Neural Systems*.
  - Carandini M, Heeger D (1994) Summation and division by neurons in primate visual cortex.
  - Background on dynamics, linear time-invariant systems and convolution, and differential equations:
  - Background on transistor physics and analog VLSI circuits:
    - Mead, C., Chapters 2 & 3 from *Analog VLSI and Neural Systems*.
  - Extra background on neuroscience:
    - Cognitive Consilience, by Solari & Stoner (nice overview of brain architecture and circuits)
    - From Neuron to Brain, by Nicholls, et al. (good intro to neuroscience)
    - Principles of Neural Science, by Kandel and Schwartz et al. (basic neuroscience textbook)
    - Synaptic Organization of the Brain, by Gordon Shepherd (good overview of neural circuits)
    - Ion Channels of Excitable Membranes, by Bertil Hille (focuses on ion channel dynamics)
- Lab1 Neuron Models (v2) (due Tuesday Sept. 4 before class) (old version)
  - To complete the lab you need to fill in text, LaTeX, and code in the indicated spaces (Your TEXT/LATEX/CODE here) of lab1.ipynb. Please indicate your name and anyone you work with at the top of the file. To submit the lab, rezip the folder lab1-neuron_models and upload it to bCourses.
  - Solutions

**Sept. 4, 6: Supervised learning**

- Perceptron model and learning rule
- Adaptation in linear neurons, Widrow-Hoff rule
- Objective functions and gradient descent
- Multilayer networks and back propagation
- Lecture slides
- Reading:
  - **HKP** chapter 5, **DJCM** chapters 38-40, 44, **DA** chapter 8 (sec. 4-6)
  - **HKP** chapter 6
  - Linear Neuron Models
  - Supervised learning in single-stage feedforward networks
  - Supervised learning in multi-layer feedforward networks – “back propagation”
  - Background on linear algebra
    - Linear algebra primer
    - Jordan, M.I. An Introduction to Linear Algebra in Parallel Distributed Processing in McClelland and Rumelhart, *Parallel Distributed Processing*, MIT Press, 1985.
  - Further reading: Y. LeCun, L. Bottou, G. Orr, and K. Muller (1998) “Efficient BackProp,” in Neural Networks: Tricks of the trade.

- NetTalk demo
- Lab 2 Supervised Learning (due Thursday Sept. 13th before class) (Solutions)

– Please include your group members' names in your lab2.ipynb file.

**Sept. 11, 13, 18: Unsupervised learning**

- Linear Hebbian learning and PCA, decorrelation
- Winner-take-all networks and clustering
- Lecture slides
- Reading:
  - **HKP** chapters 8 and 9, **DJCM** chapter 36, **DA** chapters 8, 10
  - Hebbian learning and PCA
  - Further reading on neuroscience implications:
    - Atick & Redlich (1992). What does the retina know about natural scenes?
    - Dan, Atick & Reid (1996). Efficient Coding of Natural Scenes in the Lateral Geniculate Nucleus: Experimental Test of a Computational Theory.

- Lab 3 – Unsupervised Learning (Due Thursday Sept. 20th before class) (Solutions)

– lab3_2.ipynb is optional (same algorithms but with pictures of faces instead of 2-dimensional points)

– There is also an optional file that makes clear the matrix conventions used in this class.

**Sept. 20, 25, 27: Sparse, distributed coding**

- Autoencoders
- Natural image statistics, projection pursuit
- Sparse coding model
- Locally competitive algorithms (LCA)
- Lecture slides (Olshausen)
- Lecture slides (Dylan Paiton)
- Reading:
- Barlow, HB (1972) Single units and sensation: A neuron doctrine for perceptual psychology?
- Foldiak, P. (1990) Forming sparse representations by local anti-hebbian learning
- Olshausen BA, Field DJ (1996) Emergence of simple-cell receptive field properties by learning a sparse code for natural images.
- Handout on sparse coding via LCA

- Additional readings:
- Rozell, Johnson, Baraniuk, Olshausen. (2008) Sparse Coding via Thresholding and Local Competition in Neural Circuits.
- Zylberberg, Murphy, DeWeese, (2011) A sparse coding model with synaptically local plasticity and spiking neurons can account for the diverse shapes of V1 simple cell receptive fields.
- Olshausen. Sparse coding of time-varying natural images, ICIP 2003. (convolution sparse coding of video)
- Olshausen. Highly Overcomplete Sparse Coding, SPIE 2013.
- Smith E, Lewicki MS. Efficient auditory coding, Nature Vol 439 (2006). (convolution sparse coding of sound)
- Olshausen & Lewicki. What Natural Scene Statistics Can Tell Us About Cortical Representation. TNVN 2014

- Lab 4 – Sparse Coding

– lab4_2.ipynb is optional (Foldiak Model)

– Solutions

**Oct. 2, 4: Self-organizing maps**

- Plasticity and cortical maps
- Self-organizing maps, Kohonen nets
- Lecture slides
- Reading:
  - **HKP** chapter 9, **DA** chapter 8
  - Miller, Keller & Stryker (1989) Ocular dominance column development: Analysis and simulation
  - Durbin & Mitchison (1990) A dimension reduction framework for understanding cortical maps
  - Horton & Adams (2005) The cortical column: a structure without a function

- Further background:
- Blasdel (1992), Orientation selectivity, preference, and continuity in monkey striate cortex.
- Many other nice images can be found in the galleries on Amiram Grinvald’s site: [1]
- From Clay Reid’s lab, Functional imaging with cellular resolution reveals precise micro-architecture in visual cortex. Make sure you look at the supplementary material and videos on their web site (seems partly broken) [2].
- Gilbert & Wiesel (1992), Receptive Field Dynamics in Adult Primary Visual Cortex
- Pettet & Gilbert (1992), Dynamic changes in receptive-field size in cat primary visual cortex

- Lab 5 – Kohonen Maps (due Oct 11th)
- Lab5_2 optional
- Solutions

**Oct. 9: Manifold learning (Chen)**

- Local linear embedding, Isomap
- The sparse manifold transform
- Lecture slides:
- Lecture slides (Chen)
- Lecture slides – adaptation

- Reading:
- A Global Geometric Framework for Nonlinear Dimensionality Reduction, Tenenbaum et al., Science 2000.
- Nonlinear Dimensionality Reduction by Locally Linear Embedding, Roweis and Saul, Science 2000.
- The Sparse Manifold Transform, Chen, Paiton & Olshausen, 2018.

- Further background:
- On the Local Behavior of Spaces of Natural Images, Carlsson et al., Int J Comput Vis (2008) 76: 1–12.
- Adaptation to natural facial categories, Michael A. Webster, Daniel Kaping, Yoko Mizokami & Paul Duhamel, Nature, 2004.
- Prototype-referenced shape encoding revealed by high-level aftereffects, Leopold, O’Toole, Vetter and Blanz, Nature, 2001.
- A Morphable Model For The Synthesis Of 3D Faces, Blanz & Vetter 1999.
- Matthew B. Thompson’s web page on flashed face distortion effect

**Oct. 11: Reinforcement learning (Mobin)**

- Reward-based learning
- Predicting future rewards via temporal-difference learning
- Lecture slides

**Oct. 16, 18, 23, 25: Attractor neural networks**

- Hopfield networks, memories as ‘basins of attraction’
- Line attractors and ‘bump circuits’
- Lecture slides
- Lecture slides – three-way interactions/dynamic routing
- Reading:
  - **HKP** chapters 2 and 3 (sec. 3.3-3.5), 7 (sec. 7.2-7.3), **DJCM** chapter 42, **DA** chapter 7
  - Handout on attractor networks – their learning, dynamics and how they differ from feed-forward networks
  - Hopfield (1982)
  - Hopfield (1984)
- Additional background:
- Willshaw (1969)
- Marr-Poggio stereo algorithm
- Kechen Zhang paper on bump circuits

- Three-way interactions/dynamic routing:
- Hinton (1981) – A parallel computation that assigns canonical object-base frames of reference
- Olshausen (1993) – Dynamic routing circuit
- Tenenbaum & Freeman (2000) – Bilinear models

- Lab 6 – Hopfield networks (due Oct. 25)

**Oct. 30, Nov. 1, 6: Probabilistic models and inference**

- Probability theory and Bayes’ rule
- Learning and inference in generative models
- The mixture of Gaussians model (Charles Frye)
- Lecture slides (Olshausen)
- Lecture slides (Frye)
- Reading:
- **HKP** chapter 7 (sec. 7.1), **DJCM** chapters 1-3, 20-24, 41, 43, **DA** chapter 10
- Olshausen (2014) Perception as an Inference Problem
- A probability primer
- Bayesian probability theory and generative models
- Mixture of Gaussians model

- Additional background:
- D.J.C. MacKay, Bayesian Methods for Adaptive Models (Ph.D. Thesis)
- T.J. Loredo, From Laplace to Supernova SN 1987A: Bayesian Inference in Astrophysics
- Lee & Mumford, Hierarchical Bayesian inference in the visual cortex.

- Additional background from Charles Frye:
- Differential Equations View of the Gaussian Family. Derives key properties of the Gaussian family from one differential equation.
- Leinster, How the Simplex is a Vector Space. Explains why logarithms of probabilities are in some ways more natural.
- Neal and Hinton, A View of the EM Algorithm. Takes a statistical-physics-inspired view and describes the most common interpretation of EM.
- Csiszar and Tusnady, Information Geometry and Alternating Minimization Techniques. This is the original paper that noted the connection between a variety of alternating minimization techniques in non-Euclidean geometries, including EM. The later paper by Amari – Information Geometry of the EM Algorithms for Neural Networks – is easier going, but still requires a bit of differential geometry to fully digest.

- Lab 7 – Mixture of Gaussians model (due Nov. 8)

**Nov. 8: Boltzmann machines**

- Sampling, inference and learning rules
- Restricted Boltzmann machines and Energy-based models
- Lecture slides
- Reading:
- Application of Boltzmann machines to neural data analysis:
- E. Schneidman, M.J. Berry, R. Segev and W. Bialek, Weak pairwise correlations imply strongly correlated network states in a neural population, Nature 440 (7087) (2006).
- J. Shlens, G.D. Field, J.L. Gauthier, M.I. Grivich, D. Petrusca, A. Sher, A.M. Litke and E.J. Chichilnisky, The structure of multi-neuron firing patterns in primate retina, J Neurosci 26 (32) (2006), pp. 8254-8266.
- U. Koster, J. Sohl-Dickstein, C.M. Gray, B.A. Olshausen, Modeling higher-order correlations within Cortical Microcolumns, PLOS Computational Biology, July 2014.

- Lab 8

**Nov. 13: Independent Components Analysis (ICA)**

- Relation between sparse coding and ‘ICA’
- Applications
- Lecture slides
- Reading:
- DA Chapter 10
- Information theory primer
- Handout on Sparse Coding and ICA

- Additional reading:
- Jascha Sohl-Dickstein, The Natural Gradient by Analogy to Signal Whitening, and Recipes and Tricks for its Use
- Jascha Sohl-Dickstein, Natural gradient cookbook
- Bell & Sejnowski, An Information-Maximization Approach to Blind Separation and Blind Deconvolution, Neural Comp, 1995.
- Simoncelli, Olshausen. Natural Image Statistics and Neural Representation, Annu. Rev. Neurosci. 2001.

**Nov. 15, 20: Dynamical models**

- Hidden Markov models
- Kalman filter model
- Recurrent neural networks

**Nov. 27, 29: Neural coding**

- Integrate-and-fire model
- Neural encoding and decoding
- Limits of precision in neurons
- GLMs

**Dec. 4, 6: High-dimensional (HD) computing**

- Holographic reduced representation; Vector symbolic architectures
- Computing with 10,000 bits
- Sparse, distributed memory