Acoustic Space Architecture and Acoustic
Barry Truax
Abstract
The acoustics of a space, particularly an enclosed one, has been extensively studied over the last
100 years, resulting in a significant body of theoretical and applied literature. Although the
acoustic complexity of real spaces may exhibit subtleties that require further research, the
general principles involved seem well established. However, the perception of acoustic space –
how we interpret sound as creating a sense of space – is not well understood. Perhaps the
greatest impediment is our reliance on visual models of space which are relatively stable and
detailed, giving us the impression that space is a fixed entity through which we can move.
Auditory space, on the other hand, is constantly in flux, created moment by moment and variable
to each listener. In the most general sense, acoustic or auditory space is at the core of acoustic
ecology, that is, the relationship of individuals to their environment as mediated by sound.
Recently, a number of composers have begun studying the acoustic spaces created by various
soundscapes, with the aim of evaluating the quality of their design, as well as finding inspiration
for what is generally termed soundscape composition. Multi-channel reproduction techniques
have greatly aided these latter efforts, and created a three-dimensional design field comparable
to architecture. The paper will discuss issues surrounding acoustic space, soundscape design
and multi-channel soundscape composition.
1. Introduction
The modern science of acoustics over the last 100 years has broadly treated the spatial aspect of
sound in two contexts: propagation in a free field, and the behaviour of sound fields in enclosed
spaces, the latter being the basis of what is known as architectural acoustics. This work has
resulted in a significant body of theoretical and applied literature, including many approaches to
the complex problem of modeling the acoustical properties of actual and proposed spaces.
Although the acoustic complexity of real spaces may exhibit subtleties that require further
research, the general principles involved seem well established. However, the perception of
acoustic space – how we interpret sound as creating a sense of space – is not well understood.
Perhaps the greatest impediment is our reliance on visual models of space which are relatively
stable and detailed, giving us the impression that space is a fixed entity through which we can
move. The practice of architectural design is similarly characterized by an emphasis on the visual
aspects of space, with few schools until recently giving any thought to the acoustic aspects of
design.
How does the auditory perception of space differ from its visual counterpart? And how are the two
related? The most fundamental difference is that the auditory perception of space depends
entirely on time, meaning that it is in a constant state of flux. I will argue in this paper that the time
domain is central to two related aspects of auditory space – the space or “volume” within a sound,
and the sense of space created by all of the sounds within a soundscape [1]. Clearly I am putting
the emphasis on the human perception of auditory space, that is, on how we interpret acoustic cues,
which is the domain of psychoacoustics. However, my goal is broader than that,
because I will argue that the perception of acoustic space, and our perceived orientation within it,
is a central concern of acoustic ecology, an emerging field of study whose main concern is the
relation of the individual to an environment as created by sound, and by extension, the
relationship between a community and its soundscapes.
Modern architectural theory often suggests that what we build is not simply placed “in” Cartesian
space (which is assumed to be uniform in all directions), but rather that what we design and build
“creates” space. Similarly, I am suggesting that sound creates auditory space and that the sounds
we hear create perceived volumes within that space, which even in some cases of total
immersion become the space itself. How is this concept related to traditional acoustics? The
answer can be found in the most basic acoustical concepts related to motion and time, namely
frequency and the speed of sound [2].
These two concepts, frequency and speed, are often confused by laypersons because they both
refer to the temporal behaviour of vibratory motion. In classical acoustics, they are related by the
concept of wavelength, at least for simple harmonic motion, as illustrated by the equation:
f = c/λ
where f is the frequency of vibration in cycles per second or Hertz, c is the speed of sound in feet
or metres per second, and λ is the wavelength of the vibration in feet or metres. Given that the
speed of sound is constant for a given medium at a certain temperature, frequency is inversely
related to wavelength, with high frequencies having short wavelengths and low frequencies
having long ones. Frequency can be thought of at the micro level as the rate of change of phase
of a vibration, where the speed of sound is its rate of propagation through the medium, which for
air is relatively slow, at least compared to light, being around 1100 ft/sec or 330 m/sec. A useful
rule of thumb is that sound travels about a foot in a millisecond, keeping in mind that all
frequencies travel at the same speed, meaning that complex vibrational patterns travel coherently
from source to destination.
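As a rough numerical illustration of the relation f = c/λ and of the rule of thumb above, the following minimal Python sketch computes wavelengths and travel times. The function names are hypothetical, and the speed of 330 m/sec is simply the approximate figure quoted in the text; the actual value varies with temperature.

```python
SPEED_OF_SOUND_M_S = 330.0  # approximate value quoted in the text for air


def wavelength_m(frequency_hz, speed_m_s=SPEED_OF_SOUND_M_S):
    """Return the wavelength in metres, rearranging f = c / wavelength."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed_m_s / frequency_hz


def travel_time_ms(distance_m, speed_m_s=SPEED_OF_SOUND_M_S):
    """Return the time in milliseconds for sound to cover distance_m."""
    return 1000.0 * distance_m / speed_m_s


if __name__ == "__main__":
    # Low, mid and high audible frequencies: long to short wavelengths.
    for f in (20.0, 440.0, 20000.0):
        print(f"{f:8.0f} Hz -> {wavelength_m(f):.4f} m")
    # One foot is 0.3048 m; the rule of thumb says this takes about 1 ms.
    print(f"1 ft -> {travel_time_ms(0.3048):.2f} ms")
```

Under these assumptions the audible range spans wavelengths from several metres down to under two centimetres, which is why low and high frequencies interact so differently with obstacles of everyday size.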
What does this basic equation from acoustics have to do with acoustic space? Quite simply, it is
the complex pattern of simultaneous vibrations or frequencies within a sound that creates its
sense of perceived volume or internal acoustic space, whereas it is the brain’s ability to detect the
small time differences caused by a sound reaching our ears by different paths, such as those
caused by reflections, that creates a sense of an external acoustic space. On the other hand, the
speed of light being extremely fast compared to sound means that the light we perceive coming
from all objects in our world (but not from the stars!) seems to arrive instantaneously at our eyes.
Let us consider some examples. A single frequency such as that of a sine tone possesses no
volume, so if it is unchanging in time, it creates a one-dimensional percept. Adding a few more
frequencies to it, but keeping them all synchronized in time, adds a richer sense of timbre, but still
only a two-dimensional sense of volume. However, if a complex set of temporal envelopes is
applied to these same frequencies at the micro level, which is what we call granular synthesis, and
those envelopes are all unsynchronized, then we perceive a definite sense of three-dimensional
volume. All environmental sounds possess this sense of volume, sometimes very large, such as a
waterfall heard close by, or very small, such as a ticking clock heard farther away. The exception
to this sense of volume is synthetic electronic or digital sounds that have the fixed waveform of
our first examples.
Sound Ex. 1: Sine tone (one-dimensional volume)
Sound Ex. 2: Sawtooth wave (adding spectral richness, a second dimension)
Sound Ex. 3: Granular synthesis texture (three-dimensional sense of volume)
Sound Ex. 4: Two examples of boatbuilding (environmental sounds with differing volume)
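The contrast between a fixed waveform and a granular texture can be sketched in code. The following is only a minimal illustration, not the synthesis method used for the sound examples: the sample rate, grain duration, grain density and Hann envelope are all assumptions chosen for brevity, and the function names are hypothetical.

```python
import math
import random

SAMPLE_RATE = 8000  # Hz; an arbitrary low rate chosen to keep the sketch fast


def sine_tone(freq_hz, duration_s):
    """A steady sine tone: a single frequency with a fixed waveform."""
    n = int(SAMPLE_RATE * duration_s)
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


def hann_grain(freq_hz, grain_s):
    """One grain: a short sine burst shaped by a Hann (raised-cosine) envelope."""
    n = int(SAMPLE_RATE * grain_s)
    return [
        0.5 * (1 - math.cos(2 * math.pi * i / (n - 1)))
        * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
        for i in range(n)
    ]


def granular_texture(freqs_hz, duration_s, grains_per_s=200, grain_s=0.03, seed=0):
    """Sum many grains whose onsets are random, hence unsynchronized."""
    rng = random.Random(seed)
    n = int(SAMPLE_RATE * duration_s)
    out = [0.0] * n
    for _ in range(int(grains_per_s * duration_s)):
        grain = hann_grain(rng.choice(freqs_hz), grain_s)
        onset = rng.randrange(0, n - len(grain))
        for i, sample in enumerate(grain):
            out[onset + i] += sample
    peak = max(abs(s) for s in out) or 1.0
    return [s / peak for s in out]  # normalize to the range -1..1
```

For instance, `granular_texture([220.0, 330.0, 440.0], 1.0)` uses the same few frequencies that a synchronized additive tone would, but gives each grain its own independent envelope and onset time.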
Closely linked to our sense of the perceived volume of a sound is our ability to trade off size with
distance. Our visual ability in this respect is well known when we have an experiential reference.
Knowing the usual size of a human, we assume that a person who appears smaller must be more
distant. A photograph or painting that includes such familiar forms works similarly as long as
accurate perspective is maintained, undistorted by a lens or by a painter who, for instance, uses
fore-shortening of distance. On the other hand, a more abstract texture in a photograph, such as
in an extreme close-up might be mistaken for an aerial photograph of a desert landscape.
Similarly with sound, we perceive the source of a sound to have a certain size or volume, and if
its loudness decreases, we assume it is more distant, rather than that it is giving off less energy.
In discussing the speed of sound propagation, I noted that all frequencies travel at the same
speed, hence vibrations are transmitted coherently with no disruption in phase (although if the
source is a loudspeaker tweeter and woofer, this coherence can no longer be taken for granted).
However, every interaction of the sound wave with the medium of transfer, the distance of
transfer, and most importantly, with its encounters with solid obstacles, or within an enclosed
space, changes the relative strength of the various frequencies involved because of absorption
and resonance. A sound outdoors in a relatively open space will not have any low frequency
boost that the same sound, such as a voice, would have in an enclosed room. Therefore, when
we hear that voice over a telephone, which doesn’t transmit frequencies lower than 300 Hz, the
voice sounds more distant. What we arrive at, then, is an intertwining of sound with physical
space. Every sound we hear carries information about the vibrational pattern of the source and
the physical space through which it has travelled. It is one of the most amazing abilities of the
brain that it can decipher both kinds of information simultaneously.
A single reflection of a sound wave can produce an echo if the returning sound doesn’t fuse with
the original and occurs around 100 ms after it (50 ms is the theoretical limit of the auditory
detection of echoes, but architectural acoustics accepts any early reflection arriving within 80 ms
as reinforcing the original sound and not contributing to reverberation). Multiple reflections, such
as those in an enclosed or semi-enclosed space, create reverberation, the accurate simulation of which
continues to challenge those designing digital signal processors. Not only are the reflections
numerous temporally, but they are also complex in the frequency domain, the totality of which
might be called the acoustic “signature” of the space. If we record a sudden, short broadband
sound in a space including the resulting reverberation (what is called the “impulse response”), we
can simulate any other sound being perceived to be in that space by the process called convolution.
When we convolve the given sound (preferably recorded in a dry space with little environmental
colouration) with the impulse response of a space, the result is that our given sound appears to
be located in that space, because it is coloured with the space’s frequency response and
reverberant decay. What convolution does is to multiply the two spectra (or frequency content)
together such that any frequency that is strong in both is very strong in the output, and
conversely, any that are weak are strongly attenuated [3]. Again, sound and space are linked.
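The statement that convolution multiplies the two spectra can be checked numerically. The sketch below is a minimal illustration using toy data: the short "dry" signal and the impulse response are invented values, not measurements of any real space, and the naive transform is written out only to keep the example self-contained.

```python
import cmath


def convolve(dry, impulse_response):
    """Direct (time-domain) convolution of two sample sequences."""
    out = [0.0] * (len(dry) + len(impulse_response) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(impulse_response):
            out[i + j] += x * h
    return out


def dft(samples, size):
    """Naive discrete Fourier transform, zero-padding samples to `size`."""
    padded = list(samples) + [0.0] * (size - len(samples))
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * n / size) for n, x in enumerate(padded))
        for k in range(size)
    ]


# Hypothetical toy data: a short "dry" signal and a decaying impulse response
# (direct sound followed by two progressively weaker reflections).
dry = [1.0, 0.5, -0.25, 0.0]
impulse_response = [1.0, 0.0, 0.6, 0.0, 0.3]

wet = convolve(dry, impulse_response)

# The spectrum of the result equals the product of the two input spectra,
# provided both inputs are zero-padded to the length of the result.
size = len(wet)
product = [a * b for a, b in zip(dft(dry, size), dft(impulse_response, size))]
spectrum = dft(wet, size)
max_error = max(abs(p - s) for p, s in zip(product, spectrum))
```

Here `max_error` measures how far the spectrum of the convolved signal departs from the bin-by-bin product of the input spectra; it should differ from zero only by floating-point rounding. Practical convolution of full-length recordings would use a fast Fourier transform rather than these quadratic-time loops.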
What this close connection of sound and space means is that sonic events in time are required
for us to hear acoustic space, whereas we imagine an architectural space to be independent of
who or what is present within it. For a blind person, then, the cessation of movement or activity
means that that aspect of the world “disappears”. The auditory world is entirely dynamic and can
never be static. Sound requires motion (within a certain range of audible frequencies), audible
sound is the result of that motion, and that motion creates space when perceived.
What psychoacousticians are still investigating is our auditory ability to sort out the complex
vibration that arrives at each ear, which is usually the result of several sources of vibration being
added together, and perceive a coherent “auditory scene” populated with identifiable entities at
various distances [4]. One aspect of this process is called binaural localization, which refers to
our ability to detect the direction of a source. Differences in time of arrival at the two ears, and
differences in intensity between the sound at each ear, are the primary cues. However, even
subtler cues are present that distinguish when sounds are in front, as opposed to coming from
behind, and when they are higher or lower than ear level. These cues are a subtle colouring of
the sound in the upper frequencies by the external ear flaps, or pinnae. Ridges on the pinna
create small reflections of an incoming sound which, when combined with the direct version,
result in a cancellation of certain high frequencies and a slight boost to others.
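The time-of-arrival cue can be given an order of magnitude with a deliberately simplified model. The sketch below assumes a distant source, ignores the shadowing effect of the head, and treats the ears as two points a fixed distance apart; the spacing of 0.18 m and the function name are illustrative assumptions, not values taken from the literature cited here.

```python
import math

SPEED_OF_SOUND_M_S = 330.0  # approximate value for air
EAR_SPACING_M = 0.18        # assumed distance between the ears (illustrative)


def interaural_time_difference_us(azimuth_deg):
    """ITD in microseconds for a distant source; 0 degrees is straight ahead.

    Positive azimuths are to one side, negative to the other. The extra path
    to the far ear is modelled simply as spacing * sin(azimuth).
    """
    path_difference_m = EAR_SPACING_M * math.sin(math.radians(azimuth_deg))
    return 1e6 * path_difference_m / SPEED_OF_SOUND_M_S


if __name__ == "__main__":
    for azimuth in (0, 30, 90, 150):
        itd = interaural_time_difference_us(azimuth)
        print(f"{azimuth:4d} deg -> {itd:7.1f} microseconds")
```

In this model a source at 30 degrees (in front) and one at 150 degrees (behind) yield the same time difference, which suggests why the spectral colouring added by the pinnae is needed to tell front from back.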
A major component of auditory scene analysis – sorting out complex vibrations into separate
sources – is the detection of coherent patterns of those sources. A voice coming from a certain
direction will have a pattern that is different from one coming from a different direction. The ability
to follow one or the other voice at will, or to switch attention rapidly between them, is called
the “cocktail party effect” [2]. It involves the brain’s ability to suppress one pattern while enhancing
another, and assumes that a coherent set of vibrations emanating from the same direction is
probably from the same source. On the other hand, later arriving vibrations, such as the
reverberant tail of a sound, are random, uncorrelated vibrations which are interpreted as
indicative of the space where the voices are located. Or to use the language of this paper, the
uncorrelated sound creates the sense of acoustic space, within which correlated patterns, with
their own sense of volume and distance, are interpreted as sources [1].
The auditory sense of sources and space is usually confirmed by visual cues, but not necessarily.
In one of the World Soundscape Project’s recordings from a small town in Italy, there are three
distinct sources that can be easily discerned, each creating its own sense of acoustic space. In
the foreground of the piazza are men talking, their voices brightened by the surrounding,
reflective surfaces which also add a degree of reverberation suggesting the size and shape of
the physical space. Simultaneously, an unseen choir is heard coming from inside a church facing the
square, its muffled sound indicating both its distance and the sense of being heard through a wall.
Also simultaneously occurring are the bells of another church on the other side of the village and
not visible in the square, providing a sense of a distant horizon to the complex acoustic space of
the recording. Both the physical space and the social space of the community are revealed in the
acoustic space of that soundscape.
Listening to any recording, of course, requires us to try to identify the soundscape without a visual
reference. However, even with a good quality stereo microphone (or multiple mikes), the
soundscape has been subtly distorted, similar to how a camera lens re-presents a scene. In most
cases, the auditory space is flattened out, just as photographs usually flatten out distant objects in
a scene. Sounds coming from behind the recordist may be repositioned in front by the listener,
since the mikes do not have the binaural colourations expected by the ear. Binaural or
“kunstkopf” recordings provide a more vivid sense of an acoustic space, but have to be listened to
on headphones. But, even if subtle or less subtle distortions of the actual soundscape are present
in the recorded versions, the auditory system can usually produce a reasonable image of the
original space.
4. Acoustic ecology and the architectural design of acoustic space
If acoustic ecology is concerned with the relationship of the individual listener and communities of
listeners to their environment as mediated by sound, then the individual and collective perception
of acoustic space must play a fundamental role. Perhaps the most basic role is that of orientation.
The habitual sounds we experience daily both reflect and confirm our sense of physical space, as
well as our place within it. Individuals and communities have a definite sense of “what belongs” in
their acoustic space, and what kinds of noise are “invasions” of that space. The World
Soundscape Project (WSP) has referred to such intrusions as “sound pollution” as distinct from
noise pollution which is generally defined by degrees of harmfulness. In other words, familiar
sounds and their temporal patterns define and characterize our sense of space. Even subtle
changes to the habitual pattern (which we usually take for granted) may be noted; for example, “it
seems too quiet here today” when we sense that something is missing, or the opposite,
“something special must be going on”. The characteristic ambience of a given space adds to the
“feel” of it, even if we would be hard pressed to define what contributes to that character. Often it
is what the WSP calls the “keynote” sounds – those that are in the background of our perception
but most typify a space. Foreground sonic events, or “signals”, may provide specific information
which we know how to interpret, even if fleetingly, and culturally important sounds recognizable to
all in a community can be termed “soundmarks” [2].
I have suggested that an information rich, balanced soundscape contributes to the sense of an
acoustic community, one where sound plays a formative role in the definition and life of a group of
people, no matter how their commonality is defined [5]. Sound will define the boundary of
the community, whether the scale is small or large, by distinguishing what is “local” from
what comes from “outside”. In one study of an acoustically defined neighbourhood of Vancouver
which is bisected by a busy thoroughfare, some locals referred to those passing through as “the
others”. In other words, traffic moving in one set of directions was regarded as “other”, and that
moving in a different set of directions was “local” – and in fact, the latter was characterized by a
greater pedestrian component along with slower moving cars. The two sonic components of the
soundscape thus created a mental map for the locals that reflected these intersecting spaces, and
in fact most of the people interviewed could draw a version of such a map.
The “enemy” of the acoustic community is not so much noise per se, but rather any element that
lessens the clarity and definition of an acoustic space, or dulls people’s inclination to listen. In
other words, the acoustic community depends on information exchange, and anything or any
habit that detracts from or inhibits that exchange weakens the sense of community. Bland,
uniform sounds that lack character or are not perceived to be on a human scale might be the
most obvious culprits, such as broadband noise from ventilators or machinery or excess amounts
of traffic. Although such sounds may be acoustically complex in some sense, they are usually
perceived as lacking in information or character. Even worse, they frequently mask other, more
individualistic sounds. In the language of ecology, a few dominant species with little diversity
crowd out numerous smaller species that are able to co-exist. Just as the loss of genetic diversity
is a problem, so is the loss of aural complexity.
Besides the effects of orientation and the communication of information, an acoustic space can
also encourage various types of interaction. An early study of the soundscape of Boston termed
the positive character of such interactions as “responsive spaces” [6]. In other words, the
fundamental acoustic principles of reflection, resonance and absorption, all of which contribute to
the sense of acoustic space, are the main variables which can be designed to promote (or deter)
human interaction. The details and variety of approaches to this topic are beyond the scope of
the paper, but perhaps a brief look at the extremes of the continuum will clarify its nature. At one
end we have the “free field” where there is little or no reflection because of the lack of any barriers
to reflect the sound (though in real situations there is always the ground). The extreme case is the
anechoic chamber where absorption is maximized and reflection minimized, and usually this type
of acoustic space is disorienting to the individual because there is no interaction, no feedback,
and essentially no acoustic “space”. The other end of the continuum is the “diffuse sound field”
which maximizes reflection (and resonance if the space is smaller) and minimizes absorption. A
marble-lined space, an indoor swimming pool with highly reflective glass and water, or a
gymnasium with polished floors and high ceiling are common examples. Sound comes from
everywhere and nowhere; the acoustic space is omnidirectional and often as disorienting as
the anechoic room, except for the opposite reason. If one did not have to act or communicate in
such a space, one could enjoy the womb-like envelopment, but otherwise the eyes have to be
alert for orientation, verbal communication is almost impossible, and noise levels tend to become
exaggerated. As noted earlier, the sound is the space, and vice versa. In between these two
extremes lie the truly interactive acoustic spaces where intimacy or
envelopment is balanced with the needed sense of clarity and definition.
Vancouver’s spectacular natural setting and its dramatic layout of modern buildings, particularly
around the harbour area, provide strong visual imagery for the city, one that is used to attract
tourists. As documented on the Soundscape Vancouver 1996 CD [7], many of these visually
striking environments are accompanied by bland, technologically derived soundscapes. Sound
examples from the CD include the Seabus crossing the harbour, the noisy exhaust fans from the
architecturally striking Canada Place, and the bland drones and hums from Arthur Erickson’s
otherwise dramatic Museum of Anthropology with its marvelous collection of West Coast artifacts.
One wonders whether in these environments, the eyes take over and cause the ears to ignore
what is accompanying these visual splendours. Fortunately, there are also many examples of
planned urban re-development in the city which are designed on a more human scale, not unlike
the village model which the WSP encountered in its European study [8]. Examples of this
approach which have proved popular with the public are Granville Island, Gastown, many parts of
the West End, Commercial Drive and some other neighbourhood based town centres. In each of
these, acoustic space is controlled, at least to some extent, and populated by a wide variety of
human oriented sounds. Such information rich environments seem to create a positive model of
acoustic ecology.
In an age that seems intoxicated by “virtual reality”, we often assume that these artificial illusions
of space are the only ones, particularly as they acquire increasing degrees of realism. Even if
given less public profile, multi-channel and multi-speaker re-creations of acoustic space are just
as impressive, and more easily achieve the effect of total immersion, since it is relatively easy to
surround an audience with arrays of loudspeakers. Our work at Simon Fraser University over the
last decade has shown that this type of aural representation is particularly effective for creating
immersive acoustic environments, through what we call soundscape composition [8, 9].
One of the earliest multi-channel installations occurred at the Brussels World’s Fair in 1958, where
Edgard Varèse’s multi-track work, Poème électronique, was projected through 425 loudspeakers
attached to the curvilinear walls of the Le Corbusier-designed Philips Pavilion (Figure 1). Early
four-channel formats (quadraphonic sound) doubled the number of possible sources relative to stereo, but could only
create a coherent sense of space in a relatively small room because the distances between the
speakers left gaps in the spatial illusion unless a lot of reverberation was added. Today, the
8-channel configuration works best for medium sized rooms, as long as the material on each
channel is kept uncorrelated, that is, treated as independent sources, which is the norm in the acoustic
world. The spatial layout of these speakers can vary, but the choices are generally circular,
equally spaced around the audience, or more clustered in front of the audience, given that our
ability to localize is better in front than behind.
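The circular, equally spaced option can be sketched as a small geometry calculation. The radius, the starting azimuth and the coordinate convention below are assumptions made for illustration only; they do not describe any particular hall or the layout used in the works discussed.

```python
import math


def circular_layout(num_speakers=8, radius_m=3.0, first_azimuth_deg=0.0):
    """Return (azimuth_deg, x, y) for speakers equally spaced on a circle.

    Azimuth 0 is straight ahead of the listener (positive y) and increases
    clockwise, so 90 degrees is to the listener's right (positive x).
    """
    step = 360.0 / num_speakers
    layout = []
    for i in range(num_speakers):
        azimuth = (first_azimuth_deg + i * step) % 360.0
        x = radius_m * math.sin(math.radians(azimuth))
        y = radius_m * math.cos(math.radians(azimuth))
        layout.append((azimuth, x, y))
    return layout


if __name__ == "__main__":
    for azimuth, x, y in circular_layout():
        print(f"{azimuth:6.1f} deg  x={x:6.2f} m  y={y:6.2f} m")
```

With eight speakers the spacing is 45 degrees; setting `first_azimuth_deg` to 22.5 would instead place a pair of speakers symmetrically about the front, one way of respecting the better localization in front of the listener.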
Larger, better equipped halls have extended this principle to even larger numbers of channels
and speakers to which independent sources or tracks can be sent. The ZKM at Karlsruhe, for
instance, has a rig of 16 speakers, 8 that are elevated, and 8 around the audience. The new
Sonic Arts Research Centre in Belfast has an even more amazing array of up to 40 channels, two
sets of 8 that are suspended at varying heights, another set of 8 around the audience, and two
more sets of 8 beneath the audience but audible through the grid flooring. The acoustic panels on
the walls are also variable to add or omit reflecting surfaces. These arrays, both at the listener’s
ear level, and those that incorporate height and depth, are excellent for creating a vivid sense of
acoustic space that is totally immersive. With the flexibility and precision of digital control, the
composer can literally design a detailed acoustic space, and move the listener through it. It is not
an exaggeration to suggest that this approach creates a 3-dimensional “audible architecture”.
In my opinion, the key to designing such a space is to treat the loudspeaker as a point source,
and avoid the illusion of what are called “phantom images” that appear between the speakers but
collapse when the listener is not placed exactly between them. Just as we can distinguish
multiple sound sources in a soundscape (assuming their levels are balanced), so too can we hear
the definition of multiple speakers emitting different signals. When some of these channels
incorporate a similar sense of reverberation, ambience, or other environmental cues, then those
speakers will connect to form an ordered sense of acoustic space. Strategies exist for moving a
sound smoothly between speakers (or not), hence adding the possibility of moving sound
sources, and/or apparent movement of the listener through different acoustic spaces.
Simultaneous “streams” of sound images can also be created, though it is unclear as to how
many a listener might optimally follow. The artistic potential of such immersive audio
environments is just beginning to be understood and put into practice.
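One common strategy for moving a sound smoothly between two adjacent speakers is an equal-power crossfade. The sketch below is offered only as an example of such a strategy and is not presented as the method used in the works discussed here; the function name and the 0-to-1 position parameter are hypothetical.

```python
import math


def equal_power_gains(position):
    """Gains for two adjacent speakers; position runs from 0.0 (A) to 1.0 (B).

    The gains follow a quarter cycle of cosine and sine, so the sum of their
    squares (the total power) stays constant as the sound moves.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must lie between 0.0 and 1.0")
    angle = position * math.pi / 2
    return math.cos(angle), math.sin(angle)


if __name__ == "__main__":
    for step in range(5):
        position = step / 4
        gain_a, gain_b = equal_power_gains(position)
        print(f"position {position:.2f}: A={gain_a:.3f}  B={gain_b:.3f}")
```

Note that at intermediate positions both speakers carry the same signal, which produces exactly the kind of phantom image cautioned against above; such a crossfade is therefore best understood as a brief transition between point sources rather than a resting state.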
Figure 2. The multi-channel theatre space at the Sonic Arts Research Centre, Belfast
6. Conclusion
In this paper, I have tried to give an overview of a concept of space that is not tied to the visual
domain, but rather is created by aural experience. Although the acoustic and psychoacoustic
principles on which it is based are mostly well known, our understanding of how humans create
their sense of acoustic space based on just two binaural inputs – what is generally termed
“auditory scene analysis” [4] – is still fragmentary, and until recently based mainly on speech and
music perception, not environmental experience in general where the variables are far more
complex. The “architecture” of such spaces can easily be related to the concerns of acoustic
ecology and acoustic design. Many involved in the field would say that the design concerns today
are increasingly pressing as the forces of technology and urbanization progress. Until recently,
one role of music has been to fill existing spaces, designed or otherwise, both to inspire and in
the case of “music as environment” (what used to be called “background music”) to manipulate
those not listening to it. I have suggested here that soundscape composition, as both a musical
and communicational form, can use sound to create acoustic spaces and thus draw attention to
our ongoing relationships to the real world.
7. References
[1] Truax, B., ‘Composition and Diffusion: Space in Sound in Space’, Organised Sound, Vol. 3,
No. 2, 1998, pp. 141-146.
[2] Truax, B., ed. Handbook for Acoustic Ecology, CD-ROM, Cambridge Street Publishing, 1999.
www.sfu.ca/~truax/handbook.html
[3] www.sfu.ca/~truax/conv.html
[4] Bregman, A., Auditory Scene Analysis, MIT Press, 1990.
[5] Truax, B., Acoustic Communication. Norwood, NJ: Ablex Publishing, 1984. 2nd ed., 2001.
[6] Southworth, M., ‘The Sonic Environment of Cities’, Environment and Behavior, Vol. 1, No. 1,
1969, pp. 49-70.
[7] World Soundscape Project, The Vancouver Soundscape 1973/Soundscape Vancouver 1996,
Cambridge Street Publishing, CSR-2CD 9901. www.sfu.ca/~truax/vanscape.html.
[8] Truax, B., ‘Soundscape, Acoustic Communication & Environmental Sound Composition’,
Contemporary Music Review, Vol. 15, No. 1, 1996, pp. 49-65.
[9] Truax, B., ‘Techniques and Genres of Soundscape Composition as Developed at Simon
Fraser University’, Organised Sound, Vol. 7, No. 1, 2002, pp. 5-14.