Artificial neural networks are capable of sophisticated vision tasks, including recognizing complex object families and captioning images, but very little is known about how they accomplish this. What happens if we take such a neural network seriously as a kind of "model organism"? In this talk, we give a neuron-by-neuron account of low-level visual features in InceptionV1, and demonstrate that the algorithms implementing these features can be read off from the circuits of weights connecting them.
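For those unfamiliar with what "reading an algorithm off the weights" means in practice, here is a minimal sketch using torchvision's GoogLeNet (essentially InceptionV1; the Clarity work itself uses the original TensorFlow model via the Lucid library, so layer names here differ). The specific layer and channel indices below are illustrative assumptions, not the features discussed in the talk.

```python
# Minimal sketch: inspect the weights connecting feature channels in
# adjacent layers of GoogLeNet (torchvision's InceptionV1 relative).
import torch
import torchvision.models as models

model = models.googlenet(weights=models.GoogLeNet_Weights.IMAGENET1K_V1)
model.eval()

# Pick a convolution inside an Inception block; its weight tensor has
# shape (out_channels, in_channels, kernel_h, kernel_w).
conv = model.inception3a.branch2[1].conv  # 3x3 conv in the first Inception block
w = conv.weight.detach()
print(w.shape)

# The "circuit" linking input channel j to output channel i is the
# spatial kernel w[i, j]; its sign and structure show how the earlier
# feature excites or inhibits the later one at each spatial offset.
i, j = 0, 0  # hypothetical channel pair, chosen only for illustration
print(w[i, j])
print(f"connection strength (L2 norm): {w[i, j].norm().item():.4f}")
```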
About Clarity: The OpenAI Clarity team works to understand how, mechanistically, neural networks implement the complex behaviors they do. Members of the Clarity team were involved in DeepDream, Feature Visualization, The Building Blocks of Interpretability, and Activation Atlases. The talk will likely be given by Christopher Olah; several members of Clarity will be available for discussion after the talk. Additionally, Nick Barry (one of Ed Boyden's students, visiting Clarity for January) will be present to help act as a translator between interpretability and neuroscience.