It is clear that animals do not build a 3D reconstruction of their environment in the way that computer-vision SLAM systems do. I will describe two experiments from our lab (mostly in VR, but one reproduced in a real environment) that support this claim. In these experiments, people view targets (or come across them in a maze) and later have to point at them from memory. We document systematic biases in this task: in the real world, large biases that follow a consistent pattern; in physically impossible environments, reliable errors of about 180 degrees. Whatever representation observers build must be quite different from the predictions that follow from scene-reconstruction-plus-path-integration.
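To make the scene-reconstruction-plus-path-integration prediction concrete, here is a minimal sketch (function names and the flat-ground, heading-about-z assumptions are mine, not from the experiments): under a metric-map account, the predicted pointing direction is simply the vector from the observer's path-integrated position to the target's reconstructed position, rotated into the observer's current heading frame. An observed response opposite to this prediction yields the ~180-degree errors described above.

```python
import numpy as np

def predicted_pointing_direction(target_xyz, observer_xyz, heading_deg):
    """Egocentric unit vector to a remembered target under a metric-map model.

    target_xyz: reconstructed 3D target position (world frame)
    observer_xyz: observer position from path integration (world frame)
    heading_deg: observer heading, rotation about the vertical (z) axis
    """
    world_vec = np.asarray(target_xyz, float) - np.asarray(observer_xyz, float)
    theta = np.radians(heading_deg)
    # Rotate the world-frame vector into the observer's egocentric frame.
    rot = np.array([[ np.cos(theta), np.sin(theta), 0.0],
                    [-np.sin(theta), np.cos(theta), 0.0],
                    [0.0,            0.0,           1.0]])
    ego = rot @ world_vec
    return ego / np.linalg.norm(ego)

def angular_error_deg(predicted, observed):
    """Angle (degrees) between predicted and observed pointing directions."""
    p = predicted / np.linalg.norm(predicted)
    o = observed / np.linalg.norm(observed)
    return float(np.degrees(np.arccos(np.clip(p @ o, -1.0, 1.0))))
```

For example, a response pointing exactly opposite the metric-map prediction gives `angular_error_deg(v, -v)` of 180 degrees, the magnitude reported for the physically impossible environments.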
Alternatives to 3D reconstruction are hard to find. For discussion, I will describe two relevant spheres. One is a representation of the optic array and of how objects move on this sphere as the observer (or optic centre) translates, with binocular vision as a limiting case. The other is more abstract: the sphere describing the possible values of a unit vector that combines sensory and ‘motivational’ (task-defining) information. For all animals (and even plants), this vector moves as the organism behaves. The dimensionality of the sphere and the number of recognised states grow over the course of evolution. Brains are particularly useful in this regard, but not strictly necessary.
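The second sphere can be sketched in a few lines of code. This is purely illustrative and all names, dimensions, and the nearest-prototype reading of ‘recognised states’ are my own assumptions: the organism's momentary state is a unit vector concatenating sensory and motivational components, behaviour moves that point over the unit sphere, and a recognised state is the closest of a small set of stored unit vectors.

```python
import numpy as np

def state_vector(sensory, motivational):
    """Concatenate sensory and motivational components into one unit vector."""
    v = np.concatenate([np.asarray(sensory, float),
                        np.asarray(motivational, float)])
    return v / np.linalg.norm(v)

def nearest_recognised_state(v, prototypes):
    """Index of the stored prototype closest to v by cosine similarity.

    As behaviour moves v over the sphere, the recognised state changes
    whenever v crosses into another prototype's neighbourhood.
    """
    sims = [p @ v for p in prototypes]
    return int(np.argmax(sims))
```

On this view, richer repertoires correspond to higher-dimensional vectors and more stored prototypes, without any step of metric 3D reconstruction.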