individual neuron level. There are thirty-six other visual areas, and we will need to be able to scan these deeper regions
at very high resolution or place precise sensors to ascertain their functions.
A pioneer in understanding visual processing is MIT's Tomaso Poggio, who has distinguished its two
tasks as
identification and categorization.
98
The former is relatively easy to understand, according to Poggio, and we have
already designed experimental and commercial systems that are reasonably successful in identifying faces.
99
These are
used as part of security systems to control entry of personnel and in bank machines. Categorization (the ability to
differentiate, for example, between a person and a car or between a dog and a
cat) is a more complex matter, although
recently progress has been made.
100
Early (in terms of evolution) layers of the visual system are largely a feedforward (lacking feedback) system in
which increasingly sophisticated features are detected. Poggio and Maximilian Riesenhuber write that "single neurons
in the macaque posterior inferotemporal cortex may be tuned to ... a dictionary of thousands of complex shapes."
Evidence that visual recognition relies on a feedforward system includes MEG studies that show the
human visual system takes about 150 milliseconds to detect an object. This matches the latency of feature-detection
cells in the inferotemporal cortex, so there does not appear to be time for feedback to
play a role in these early
decisions.
Recent experiments have used a hierarchical approach in which features are detected to be analyzed by later layers
of the system.
101
In studies on macaque monkeys, neurons in the inferotemporal cortex appear to respond to
complex features of objects on which the animals are trained. While most of the neurons respond only to a particular
view of the object, some are able to respond regardless of perspective. Other research on the visual system of the
macaque monkey includes studies on many specific
types of cells, connectivity patterns, and high-level descriptions of
information flow.
102
Extensive literature supports the use of what I call "hypothesis and test" in more complex pattern-recognition
tasks. The cortex makes a guess about what it is seeing and then determines whether the features of what is actually in
the field of view match its hypothesis.
103
We’re often more focused on the hypothesis than the actual test, which
explains why people often see and hear what they expect to perceive rather than what is actually there. "Hypothesis
and test" is also a useful strategy in our computer-based pattern-recognition systems.
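As a rough illustration of "hypothesis and test" in a computer-based recognizer, the sketch below first guesses a category from a few early cues weighted by expectation, then checks the guess against everything actually observed. The feature names, the `TEMPLATES` table, the prior weights, and the 0.75 threshold are all hypothetical values chosen for the example; this is a minimal sketch, not a description of how the cortex or any particular system implements the strategy.

```python
# Hypothetical feature sets for three categories (illustration only).
TEMPLATES = {
    "cat": {"fur", "whiskers", "pointed_ears", "tail"},
    "dog": {"fur", "snout", "floppy_ears", "tail"},
    "car": {"wheels", "windows", "headlights"},
}


def rank_hypotheses(cue_features, prior):
    """Order categories by how well they explain the early cues,
    weighted by how strongly each category is expected (the prior)."""
    def score(label):
        overlap = len(TEMPLATES[label] & cue_features)
        return overlap * prior.get(label, 1.0)
    return sorted(TEMPLATES, key=score, reverse=True)


def check_hypothesis(label, observed_features, threshold=0.75):
    """Test step: does enough of the template appear in what was observed?"""
    expected = TEMPLATES[label]
    matched = len(expected & observed_features) / len(expected)
    return matched >= threshold


def recognize(cue_features, observed_features, prior):
    """Try hypotheses in ranked order; return the first that passes the test."""
    for label in rank_hypotheses(cue_features, prior):
        if check_hypothesis(label, observed_features):
            return label
    return None


if __name__ == "__main__":
    cues = {"fur", "tail"}                      # first glance
    seen = {"fur", "whiskers", "pointed_ears", "tail"}
    expectation = {"dog": 2.0}                  # observer expects a dog
    print(rank_hypotheses(cues, expectation)[0])
    print(recognize(cues, seen, expectation))
```

In the usage example the prior makes "dog" the first guess, and only the test step overturns it. A system that weights the hypothesis heavily and tests loosely (a lower threshold) would tend to report what it expected rather than what is there.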
Although we have the illusion of receiving high-resolution
images from our eyes, what the optic nerve actually
sends to the brain is just outlines and clues about points of interest in our visual field. We then essentially hallucinate
the world from cortical memories that interpret a series of extremely low-resolution movies that arrive in parallel
channels. In a 2001 study published in Nature, Frank S. Werblin, professor of molecular and cell biology at the
University
of California at Berkeley, and doctoral student Botond Roska, M.D., showed that the optic nerve carries ten
to twelve output channels, each of which carries only minimal information about a given scene.
104
One group of what
are called ganglion cells sends information only about edges (changes in contrast). Another group detects only large
areas of uniform color, whereas a third group is sensitive only to the backgrounds behind figures of interest.
"Even though we think we see the world so fully, what we are receiving is really just hints,
edges in space and
time," says Werblin. "These 12 pictures of the world constitute all the information we will ever have about what's out
there, and from these 12 pictures, which are so sparse, we reconstruct the richness of the visual world. I'm curious how
nature selected these 12 simple movies and how it can be that they are sufficient to provide us with all the information
we seem to need." Such findings promise to be a major advance in developing an artificial system that could replace
the eye, retina, and early optic-nerve processing.
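To make the idea of sparse output channels concrete, here is a toy sketch of two of the channels described above, applied to a single row of brightness samples: one that reports only edges (changes in contrast) and one that reports only extended uniform areas. The one-dimensional input, the function names, and the threshold and run-length parameters are assumptions made for illustration; real ganglion-cell processing is far richer than this.

```python
def edge_channel(row, threshold=0.2):
    """Report 1 between neighboring samples whose brightness differs sharply."""
    return [1 if abs(b - a) > threshold else 0 for a, b in zip(row, row[1:])]


def uniform_channel(row, threshold=0.2, min_run=3):
    """Report 1 for samples inside runs of at least `min_run` similar values."""
    marks = [0] * len(row)
    start = 0
    for i in range(1, len(row) + 1):
        at_end = i == len(row)
        if at_end or abs(row[i] - row[i - 1]) > threshold:
            # A run of similar samples just ended at index i - 1.
            if i - start >= min_run:
                for j in range(start, i):
                    marks[j] = 1
            start = i
    return marks


if __name__ == "__main__":
    row = [0.1, 0.1, 0.1, 0.1, 0.9, 0.9, 0.2, 0.2, 0.2]
    print(edge_channel(row))
    print(uniform_channel(row))
```

Each channel discards most of the original signal, yet together they still indicate where the boundaries and the large regions lie, which is the sense in which each channel carries "only minimal information."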
In chapter 3, I mentioned the work of robotics pioneer Hans Moravec, who has been reverse engineering the
image processing done by the retina and early visual-processing regions in the brain. For
more than thirty years
Moravec has been constructing systems to emulate the ability of our visual system to build representations of the
world. Only recently has sufficient processing power been available in microprocessors to replicate this
human-level feature detection, and Moravec is applying his computer simulations to a new generation of robots that
can navigate unplanned, complex environments with human-level vision.
105
Carver Mead has been pioneering the use of special neural chips that utilize transistors in their native analog
mode, which can provide very efficient emulation of the analog nature of neural processing. Mead has demonstrated a
chip that performs the functions of the retina and early transformations in the optic nerve using this approach.
106
A special type of visual recognition is detecting motion, one of the focus areas of the Max Planck Institute of
Biology in Tübingen, Germany. The basic research model is simple: compare the signal
at one receptor with a time-
delayed signal at the adjacent receptor.
107
This model works for certain speeds but leads to the surprising result that
above a certain speed, increases in the velocity of an observed object will decrease the response of this motion
detector. Experimental results on animals (based on behavior and analysis of neuronal outputs) and humans (based
on reported perceptions) have closely matched the model.
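The delay-and-compare model can be sketched in a few lines. The version below assumes that the "comparison" is a multiplication of the delayed signal from one receptor with the current signal from its neighbor, that the stimulus is a Gaussian spot of light moving at constant velocity, and that the detector's response is the peak of that product over time. These choices, along with the spacing, delay, and spot width, are illustrative assumptions rather than details taken from the research described above.

```python
import math


def receptor_signal(t, position, velocity, width=0.5):
    """Brightness at `position` as a Gaussian spot passes at `velocity`."""
    spot_center = velocity * t
    return math.exp(-((spot_center - position) ** 2) / (2.0 * width ** 2))


def detector_response(velocity, spacing=2.0, delay=1.0,
                      dt=0.005, t_start=-5.0, t_end=15.0):
    """Peak over time of (delayed signal at receptor A) x (signal at receptor B).

    Receptor A sits at position 0 and receptor B at `spacing`. The product is
    largest when the spot takes about `delay` time units to travel from A to B.
    """
    steps = int(round((t_end - t_start) / dt))
    peak = 0.0
    for i in range(steps + 1):
        t = t_start + i * dt
        a_delayed = receptor_signal(t - delay, 0.0, velocity)
        b_now = receptor_signal(t, spacing, velocity)
        peak = max(peak, a_delayed * b_now)
    return peak


if __name__ == "__main__":
    for v in (0.5, 1.0, 2.0, 4.0, 8.0):
        print(f"velocity {v:4.1f} -> response {detector_response(v):.4f}")
```

With these parameters the response is strongest near a velocity of spacing divided by delay (2.0 here) and falls off on either side. Above that preferred speed the spot reaches the second receptor before the delayed signal from the first has arrived, so further increases in velocity reduce the detector's output, which is the surprising result noted above.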