may be brand sounds like the “Intel Logo” or a “Nokia ringtone,” or familiar
recordings like the Wilhelm scream. Each time we hear them the same pattern
of bits produces more or less the exact same waveform. However, humans can
recognise sounds by their intrinsic mechanism. The sound of a familiar voice
speaking words we have never heard before is one example. Tom’s Moto Guzzi
is making patterns of vibrations it has never made before, yet it’s recognisable
as that definite article from its general behaviour.

Attention

Attention is what we pay to important or pleasing signals. We focus on sonic
objects just as we focus on things we see. Even though signals arrive from many
sources all jumbled into the same wave, we can pick out the individual
transmitters, like radio stations. The so-called cocktail party effect is an
example of attention. So, attention is some tuning of perception. Many
experiments conclude that attention happens at quite a low level in the
brain/nervous system. Much as we can focus on objects with
our eyes, in hearing we are able to tune the ear to filter out things we are not
expecting, or don’t want to hear. For humans this happens at the neural level,
probably because we don’t have “ear lids”; but some animals can direct their
ears to attend to different sources.

Correspondence

Attention is sharply focused by visual correspondence, involving innate
processing to compensate for movement and distance perception. In a cinematic
context the deliberate or implied binding of explicit visual events to sounds
is diegesis (from the Greek meaning “a story told”). Naturally we try to bind
things we see to things we hear. When a pot falls off the table and breaks,
each piece of china has its own frequency that matches its size. With a proper
correspondence between the sounds and images of pieces landing we feel the
scene makes sense. Although the scene is composed of many events in quick
succession and many concurrent processes, we are able to give several things
our attention at once. How many? Unsurprisingly, about 5 or 6, close to Miller’s
number (seven, plus or minus two). In a collection of sources, like a passing
stampede of galloping horses,
only one may be in the visual frame at any time. That is the one that has focus
and the object to which we try to bind attention. It is synchronised sound.
In the background we see many other horses moving by. Should we synchronise
sounds to every single hoof to get a realistic effect? No; in fact we only need
to synchronise the few in focus, and add a general filler effect to account for
the rest. Randy Thom suggests that in cinematic sound an adequate efficiency
can go as far as using only a single front sound, or maybe a pair, against
a background texture. In the audiovisual realm we often group things as one,
two, and lots.
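
The "one, two, and lots" grouping can be sketched as a small allocation step: pick the few sources in focus for individually synchronised sounds and hand everything else to a shared background texture. The following is only an illustrative sketch in Python. The `Source` class, its `in_frame` and `distance` fields, the `plan_sync` function, and the default of two front sounds are all assumptions made for this example, not anything prescribed by the text.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A hypothetical sounding object in a scene."""
    name: str
    in_frame: bool   # assumed flag: is the object visible in the current shot?
    distance: float  # assumed distance from the camera, arbitrary units


def plan_sync(sources, max_front=2):
    """Split sources into 'front' sounds (synchronised one by one)
    and 'background' sources (covered by a single filler texture)."""
    # Only visible sources are candidates for focus; nearest ones first.
    visible = sorted((s for s in sources if s.in_frame),
                     key=lambda s: s.distance)
    front = visible[:max_front]
    front_ids = {id(s) for s in front}
    # Everything not chosen for the front goes into the background texture.
    background = [s for s in sources if id(s) not in front_ids]
    return front, background


# A stampede of 20 horses, of which only the three nearest are in frame.
herd = [Source(f"horse{i}", in_frame=(i < 3), distance=float(i + 1))
        for i in range(20)]
front, background = plan_sync(herd)
```

Here `front` holds the two nearest visible horses, which would receive hoof-by-hoof synchronisation, while the remaining eighteen are represented by one general filler effect. Choosing by distance is just one plausible stand-in for "focus"; a real scene might use gaze, framing, or narrative importance instead.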