Understanding and Installing An Ambiophonic System: Les Leventhal Ralph Glasgal University of Manitoba


Understanding and Installing an Ambiophonic System

Les Leventhal¹ Ralph Glasgal²

University of Manitoba Ambiophonics Institute

September, 2008³

Abstract: Recordings of music and film soundtracks contain cues used by the
ear/brain to localize sound. Home or studio reproduction using conventional
stereo, 5.1, 7.1, or 10.2 distorts these cues and creates false ones. The result is
localization distortion, which degrades horizontal and depth imaging of direct and
ambient sound, degrades clarity of instruments, colors the sound, and greatly
reduces size and depth of the sonic stage. Localization distortion can be reduced
to very low levels by a technology called Ambiophonics. Ambiophonics, at its
simplest, consists of crosstalk-cancelled playback by two closely-spaced, front
speakers. The result is that one can now hear at home what the recording
microphones hear – and what the microphones hear is greatly improved horizontal
and depth localization; solid, clear, three-dimensional imaging; less colored
sound; and a sonic stage that is very deep and very wide – at least 150 degrees –
compared to the 60-degree wide stage of the stereo equilateral triangle.
Ambiophonics does not artificially increase the width and depth of the stage.
Instead, it reduces localization distortion to such low levels that one can hear the
width and depth of stage that was actually recorded on the disc. Details are
discussed for setting up an Ambiophonic system with 2, 4, or 6 speakers.

First there was mono and then there was stereo. Using just one speaker, monophonic
reproduction sounds like all instruments are located at the speaker. Using two widely-spaced
speakers, stereophonic reproduction sounds like different instruments have different locations –
and the locations stretch from one speaker to the other. Having different locations for different
instruments is so highly valued that stereophonic reproduction – now almost 80 years old – and
its offshoots such as 5.1 and 7.1 have become standard in the home reproduction of music and
movies. But stereo and its offshoots do far from a perfect job of localizing sound and their
imperfections limit the quality, the believability, and the realism of the reproduction. A new
technology – called Ambiophonics – fixes most of the problems with stereophonic reproduction.
Ambiophonics is based on almost a century of psychoacoustics research on how the ear/brain
localizes sound. This historical research, combined with current research, tells us how
conventional stereo destroys good sound localization and how to fix the problem. Ambiophonics
fixes most of stereo’s problems by using crosstalk reduction with two closely-spaced front
speakers – separated only 15 to 20 degrees. Yet the sonic stage created can be 150 to 180
degrees wide! Just how this can happen is a wonderful tale of how laboratory findings can
produce unexpected and beautiful results.

¹ Les Leventhal, MA, PhD, is a professor of psychology at the University of Manitoba in
Winnipeg, Canada. He is a member of AES and the American Psychological Association.
² Ralph Glasgal, BEP, MSEE, is the founder of the nonprofit Ambiophonics Institute. He is a
member of AES and IEEE.
³ Updated versions of this paper will be posted on the Ambiophonics web site:

In Part 1, we explain how stereo reproduction distorts the sound field and how
Ambiophonics restores it. In Part 2, we explain how to maximize the performance of an
Ambiophonics system. Those already familiar with the problems of stereo and the advantages of
Ambiophonics might skip to Part 2.

Part 1: Problems with Stereo Reproduction and How to Fix Them

Sound Localization

To understand how stereo distorts instrument localization and how Ambiophonics
corrects it, we need to understand first how the human ear/brain localizes sound. The ear/brain
uses three primary cues to localize sound: interaural loudness differences (ILD), interaural time
differences (ITD), and changes to the higher frequencies of the sound by the pinna, the curly
shell surrounding the ear canal (pinna localization cues).

Interaural Loudness Differences (ILD). With live music, if a violinist is playing a
violin in front of you, the loudness at both ears is about equal. If the violinist is standing on your
right, the violin sound in your right ear will be louder than in your left. Such loudness
differences at the two ears from the same sound source are a cue for the sound’s location. ILD
cues work well only for signals with energy below 1,000 Hz and above 90 Hz.

Interaural Time Differences (ITD). If the live violinist is not playing directly in front
of you but to your right, the violin sound will reach your right ear a little earlier than your left
ear. The reason is that the violin is just a bit closer to your right ear than your left. Such a time
difference between sound arrivals at the two ears from the same source is a cue for the sound’s
location. Like ILD cues, ITD cues work really well only for signals with energy below 1,000 Hz
and above 90 Hz.
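
As a rough illustration of how the ITD grows as a source moves off center, the sketch below uses the textbook spherical-head (Woodworth) approximation, ITD ≈ (a/c)(θ + sin θ). This model is not taken from this paper; the head radius of 8.75 cm, the speed of sound of 343 m/s, and the function name are all assumptions made for illustration.

```python
import math

HEAD_RADIUS_M = 0.0875      # assumed average head radius
SPEED_OF_SOUND_M_S = 343.0  # assumed, air at roughly room temperature


def woodworth_itd_seconds(azimuth_deg):
    """Approximate ITD for a distant source.

    azimuth_deg: 0 is straight ahead, 90 is fully to one side.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth_deg must be between 0 and 90")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


for angle in (0, 10, 30, 90):
    itd_us = woodworth_itd_seconds(angle) * 1e6
    print(f"{angle:>2} deg -> {itd_us:.0f} microseconds")
```

Under these assumptions the delay rises from zero for a frontal source to several hundred microseconds for a source at the extreme side. Other head models give somewhat different figures, so the output should be read as ballpark values only.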

Pinna Localization Cues. The frequency response of a live violin consists of a complex
pattern of frequency peaks and valleys. Before the violin sound enters your ear canal, it bounces
around the curls, cavities, and folds of your pinna (ear shell). Frequency components over about
1,000 Hz interact with these structures and the pattern of peaks and valleys changes enormously.
Moreover, the sound from a live violin located at your left will bounce differently around the
left-ear pinna than will the sound from the same live violin located directly in front of you.
(Actually, if the violin is located at just the correct spot on your left, it will have a direct shot at
your ear canal and pinna shadows and resonances become less important.) So the frequency
response of the live violin measured at the entrance to your left ear canal changes with the violin’s
location. The brain uses these changes in response patterns as location-finding clues. Very
small horizontal changes in the violin’s location can produce changes so great in the pattern of
peaks and valleys that one might view the pinna as an exquisitely sensitive direction finder that
converts minute changes in the direction of incoming sound to overwhelming changes in the
frequency response pattern. Even a person with only one functioning ear has some ability to
identify the location of most natural sounds. The pinna locates both transient sounds, like clicks,
and continuous sounds.

You have two pinnae. For a given violin location, the pinnae create quite different
response patterns. The brain interprets each single ear pattern and possibly the difference
between the patterns as a location cue. When the violin is off center, the difference between the
patterns will be very large indeed. Move the violin a little to the left or right and the patterns and
thus the differences between them can change greatly. This location detector is so sensitive that
subjects can detect a change as small as one degree in the horizontal location of impulsive clicks
or speech sibilants.

If a music reproduction system is to produce the correct pinna cues when playing a violin
recording, then in theory the speaker reproducing the violin must be located where the violin is
supposed to be. If the live violin is on your left, the speaker reproducing the violin must be at
that same angle on your left. If the violin is supposed to be directly in front of you, as is the case
with most soloists, the speaker reproducing the violin must be directly in front of you. If the
speaker reproducing the violin is not where the violin is meant to be, then the pinna cues
produced by the speaker will be incorrect for the violin’s location. As we shall see below,
getting the pinna localization cues correct is a nasty problem for any music reproduction system.

How Stereo (and 5.1, 7.1, etc.) Messes Things Up

Conventional stereo (and its offshoots 5.1, 7.1, etc.) creates an illusion, akin to an optical
illusion, that does a fair job of localizing sound. But it does not do an excellent job. Consider a
typical home stereo system with the speakers and the listener forming an equilateral triangle –
that is, the speakers are separated by a 60-degree angle as viewed by the listener. This stereo
system does several things that prevent lifelike sound localization. More complex systems such
as 5.1, 7.1, etc. have the same problems. The problems are acoustic crosstalk, comb filter
effects, incorrect pinna cues, incorrect ILD and ITD cues, and inconsistent localization cues.

Acoustic Crosstalk. You are listening to a live violinist playing directly in front of you.
Both ears hear the violin. The sound at your left ear is similar to but not exactly the same as the
sound at your right ear. There are many reasons for the slight sound differences but they are not
important now. What is important is that the live violin has produced two versions, two
presentations, of the violin sound – one at your left ear and one at your right ear. This is OK
because your ear/brain has spent all its life learning to fuse two sound presentations such as this
into one image – so what you perceive now is a single, live violin. Now consider a typical stereo
recording of the same violinist. The recording is engineered so that the violinist will appear to be
located directly in front of you, halfway between the two speakers. To accomplish this, the two
channels of the recording will have similar loudness and will arrive at your ears at about the
same time. The problem is that your left ear hears both speakers and your right ear hears both
speakers – and the four sound presentations are not exactly alike in level, arrival time, or
frequency response. Your ear/brain now has four versions to fuse into a single violin. If your
left ear heard only the left speaker and your right ear heard only the right speaker, then your
ear/brain would be back in the familiar territory in which it must fuse just two presentations into
a single image. The trouble is that your right ear hears the left speaker and your left ear hears the
right speaker. This is called acoustic crosstalk, where each ear hears the speaker on the opposite
side. Your ear/brain did not evolve to deal with four presentations of the same sound source.

Crosstalk produces incorrect head shadows for center images, reducing the lifelikeness of
the image. Head shadow refers to the reduction of mid and high frequencies as sound travels
around and over the head to the far ear. When listening to a live instrument directly in front of
you, sound travels to the ears with only a small impact from the intervening fleshy part of the
face, that is, with only a small head shadow. In contrast, when stereo speakers play similar
signals to produce a center image, head shadow is large because sound from a speaker must
travel around much of the head to reach the far ear. The result is not only decreased lifelikeness
of the center image but also a tonal balance with less mid and high frequency energy reaching
the ears than in real life. One can eliminate side-speaker head shadow by eliminating side-
speaker crosstalk (difficult) or by moving the speakers close together and eliminating front-
speaker crosstalk (much easier). When speakers are moved together and front-speaker crosstalk
is eliminated, the tonal balance of center images will be flatter with minimal effect on
localization of side images.

In addition, crosstalk boosts the low bass in center images compared to side images,
resulting in a center stage tonal balance that differs from that of the side stage. When stereo
reproduces a center image, both speakers operate and each ear gets two versions of the sound –
direct sound from the near speaker and crosstalk from the far speaker. Both versions have
essentially the same low-bass level and phase. They have the same low bass level because the
head is not a significant barrier to crosstalk at low bass frequencies. They have the same phase
because the distance between the ears is trivial compared to the wavelength of low bass
frequencies. The low bass in the two versions sum, doubling the low bass at each ear compared
to the low bass at each ear when a single speaker, operating almost by itself, produces a side
image. With a side image, the close ear hears nearly unadulterated high, mid, and low
frequencies. The far ear, due to head shadow, hears mainly low-bass frequencies. This contrasts
with the double low bass at each ear of a center image. Moreover, the double low bass at each
ear of a center image is greater than the low bass in real life sound. This amounts to coloration
of center image sound by conventional stereo.
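
The doubling can be checked with a short worked equation. It is a sketch under idealized assumptions that the paper does not state in this form: the direct and crosstalk signals have equal amplitude, and the crosstalk arrives a time τ after the direct sound. Here τ = 0.22 ms is the approximate crosstalk delay this paper quotes for the stereo triangle, and f = 40 Hz is an arbitrary low-bass example.

```latex
\[
\left| 1 + e^{-j 2\pi f \tau} \right| = 2\left|\cos(\pi f \tau)\right|
\]
\[
f = 40\ \text{Hz},\ \tau = 0.22\ \text{ms}:\quad
2\cos(\pi \cdot 40 \cdot 0.00022) \approx 2\cos(0.0276) \approx 1.999,
\qquad
20\log_{10}(1.999) \approx 6.0\ \text{dB}
\]
```

Under these assumptions the sound pressure at each ear is almost exactly twice that produced by one speaker alone, that is, about 6 dB higher.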

Some recording engineers know that stereo playback produces excessive bass in the
center stage and compensate by reducing the central (mono) bass when mastering the recording.
Other recording engineers do not. In either case, owners of TacT room correction processors can
easily adjust the bass response of their system to taste.

Music systems using 5.1 and 7.1 formats have even worse crosstalk than conventional
stereo because they use three front speakers: left, right, and center. If a recording is mastered so
that all three front speakers produce the sound of the same solo violin, your left ear will hear three
speakers and your right ear will hear three speakers – creating a total of six different
presentations of the same violin plus lots of excess bass for the larger instruments. Pity the
unfortunate ear/brain that must deal with this chaos. Music systems that use three front speakers
are as fundamentally flawed as the original two-speaker stereo triangle. In 5.1 movies, however,
the center speaker is usually mono and mostly dialog so that crosstalk does not occur.

Comb Filter Effects. When two identical or correlated broadband signals are separated
by just an instant or two in time, the signals add together producing a single signal with a
changed frequency response. The new response can differ substantially from either of the
original signals. The change in response depends on several factors, one being the delay
between the signals. With the right delay, somewhere around 0.1 ms to 1.0 ms for audio, the
resulting signal, viewed on a scope, displays side-by-side peaks alternating with deep nulls –
resembling a comb. These are called comb filter effects, or combing. Unfortunately, acoustic
crosstalk from stereo speakers produces comb filter effects. Consider a conventional stereo
system playing a soloist located directly in front. Both speakers produce similar signals. The
left ear hears the left speaker and, about 0.22 ms later, hears the right speaker. Comb filter effects
result at the left ear. This is an actual change in the frequency response of the signal from the
left speaker and it is caused by acoustic crosstalk. The same thing happens at the right ear when
it hears the right speaker and, 0.22 ms later, hears the left speaker. Now consider a 5.1 or 7.1
system having three front speakers. The three speakers are at slightly different distances to the
left ear. Suppose the three speakers are all reproducing a soloist at stage center. The left ear first
hears the left speaker, then an instant later hears the center speaker, then an instant later hears the
right speaker. The same sort of thing happens at the right ear. Combing chaos! While the
combing is not audible as a change in frequency balance, it does signal the brain that the sound
source is not real. The music sounds canned, without depth or presence, or perhaps very slightly
grainy or fuzzy. This is why, in 5.1 mastering, the center speaker is almost always mono dialog
or a mono spot mic’d soloist.
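
To make the comb shape concrete, the sketch below models one ear receiving a direct signal plus a single delayed copy, and lists the frequencies at which the two cancel. It is an idealized two-path model, not a measurement: the function names and the `relative_level` parameter (crosstalk amplitude relative to the direct sound) are assumptions introduced here, and real head shadow would make the delayed copy weaker at high frequencies than this model allows for.

```python
import cmath
import math


def comb_notch_frequencies(delay_s, max_hz=20000.0):
    """Frequencies (Hz) where a signal and its delayed copy are in antiphase."""
    if delay_s <= 0:
        raise ValueError("delay_s must be positive")
    notches = []
    k = 0
    # Antiphase occurs at odd multiples of 1 / (2 * delay).
    while (2 * k + 1) / (2.0 * delay_s) <= max_hz:
        notches.append((2 * k + 1) / (2.0 * delay_s))
        k += 1
    return notches


def combined_level_db(freq_hz, delay_s, relative_level=1.0):
    """Level of direct + delayed copy, in dB relative to the direct signal alone."""
    total = 1.0 + relative_level * cmath.exp(-2j * math.pi * freq_hz * delay_s)
    magnitude = abs(total)
    return 20.0 * math.log10(magnitude) if magnitude > 0.0 else float("-inf")


delay = 0.22e-3  # crosstalk delay quoted in the text for the stereo triangle
print([round(f) for f in comb_notch_frequencies(delay)])
print(round(combined_level_db(100.0, delay, relative_level=0.8), 2))
```

Setting `relative_level` below 1.0 makes the nulls shallower, which is one simple way to explore how much the far-ear attenuation softens the combing in this model.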

Incorrect Pinna Cues. Again consider the conventional stereo system arranged in an
equilateral triangle. The system is reproducing a soloist at stage center. Although the soloist is
located at stage center, the speakers are located at stage left and stage right and are therefore
producing pinna cues appropriate only to sound sources located near the sides of the stage.
These are the wrong pinna cues for the soloist at stage center. They are incorrect cues, false
cues. In order for the pinna cues to be correct, the center-stage soloist should be reproduced by a
center speaker. However, when an instrument located 30 degrees left of center is being
reproduced by the left speaker, the pinna cues are consistent with that location of the sound
source. Hence, with conventional stereo, pinna cues are correct for sound sources exactly 30
degrees off center but incorrect for sound sources near the center of the stage (or greater than 30
degrees off center). This is unfortunate since soloists are usually recorded near the center of the
stage and the largest portion of musical instruments is usually recorded closer to the center of the
stage than to the sides. In 5.1 and 7.1 systems, if the recording is engineered so that the left
speaker reproduces left sound sources, the center speaker reproduces center sound sources, and
the right speaker reproduces right sound sources, then the pinna cues would be correct. But
acoustic music recorded with microphones (as opposed to computer-generated sound) for 5.1 and
7.1 cannot be segmented this way effectively. Movies are often made this compartmentalized
way – resulting in essentially 3- or 5-channel mono except when background music is present.

Incorrect ILD and ITD Cues. When conventional stereo tries to reproduce a center
image, one would think that the ILD (interaural loudness difference) and ITD (interaural time
difference) cues from the speakers would be correct. After all, the speakers need only produce
equal loudness and equal time delays – ILD and ITD values of zero. Stereo speakers can easily
produce ILD and ITD values of zero. The problem is that stereo speakers also produce crosstalk
and crosstalk creates additional, unwanted ILD and ITD cues that are incorrect for a center
image. Consider a singer recorded at stage center. The reproduced sound from the left speaker
arriving at the left ear has the same loudness and time delay as does the reproduced sound from
the right speaker arriving at the right ear. The ILD and ITD values are both zero, which are
correct for a center image. So far, so good. Unfortunately, the sound of the right speaker
arriving at the right ear is followed about 0.22 ms later by the crosstalk sound of the right speaker
arriving at the left ear – creating an ITD cue of 0.22 ms. The same sort of thing is happening at the
other ear, creating a reverse ITD cue of 0.22 ms. (One can also think of these as bogus high-level
early reflections impinging on each ear.) The ITD cues are actually +0.22 ms and -0.22 ms, meaning
that the left speaker sound precedes the right by 0.22 ms for one cue and that the right speaker
sound precedes the left by 0.22 ms for the other cue. Crosstalk is creating ITD cues of +0.22 ms and
-0.22 ms and if you want to create a center image you do not want to be creating incorrect ITD
cues of ±0.22 ms that accompany the correct ITD cue of zero.

Crosstalk can also produce incorrect ILD cues for center images. The crosstalk sound of
a speaker arriving at the opposite-side ear is only slightly softer (due to traveling around the
head) – and it is delayed. The delay is equivalent to a phase shift that varies with frequency.
Hence, when the direct and the delayed signals combine at an ear, the level will change with
frequency. If the two signals at the left ear are not identical to the two signals at the right ear –
because the listener is not centered (or the speakers suffer from unequal frequency response or
unequal axial response) – a nonzero ILD value will be sensed and a sound’s location may seem
to drift depending on the instrument or the note being played. So, for a center image,
conventional stereo creates two false ITD cues and a false, unstable ILD value. If the crosstalk
were eliminated, these ITD/ILD errors would be eliminated. Hence, ILD and ITD cues delivered
by the stereo triangle are not like those in real life or in crosstalk-free reproduction.

In a similar fashion, equilateral stereo creates false ILD/ITD cues for all other stage
locations except for sound sources located exactly 30 degrees off center, right where a speaker is
physically located. At this location, equilateral stereo produces correct ILD and ITD cues. To
make the sound source appear 30 degrees off center, essentially only one speaker produces
sound. This amounts to mono reproduction – with the speaker located where the sound source is
supposed to be. At 30-degrees off center, all localization cues – ILD, ITD, and pinna cues – are
correct for stereo. For any other stage location, stereo produces ILD/ITD chaos and incorrect
pinna cues. One audible effect of the incorrect ILD/ITD cues is to reduce the width of the stereo
stage. The 60-degree stage width you hear with stereo is narrower than the stage width actually
captured by the microphones. Hence, if we could eliminate the crosstalk, the reproduced stage
would widen considerably.

When recording microphones hear an extreme side source beyond the 30-degree angle –
for example, at 90 degrees – they record the source with ITD and ILD cues appropriate to the
extreme angle. But side images in equilateral stereo cannot be localized beyond the 30-degree
angle. Without special HRTF computer processing, the recorded ITD/ILD cues for extreme side
sources are not deliverable – for two reasons: First, the 30-degree speaker location limits the
maximum ITD to 220 microseconds – compared to an ITD of 700 microseconds produced by a
90-degree live side source. Second, the ILD value is too small at 30 degrees where both head
shadow and maximum ILD are smaller than at 90 degrees. In addition, pinna cues at the 30-degree
speaker location are incorrect for an extreme side source. Hence, extreme side sources get
folded inward and get lumped together at the 30-degree position where the speaker is located.

Inconsistent Localization Cues. Sound localization depends on ILD cues, ITD cues,
and pinna cues. There are two kinds of pinna cues: (1) cues based on the response of a single
pinna by itself to a sound event and (2) cues based on the responses of both pinnae. Everyday
experience shows that sound localization is better when both pinnae are used but it is not yet
clear how the brain makes use of the two pinna responses. Nevertheless, if a sound reproduction
system is to provide excellent sound localization, it must reproduce all the localization cues and
the cues must provide consistent information about the direction of a sound source. The
reproduced sonic image will seem less realistic if some cues say that the source is up front and
other cues say that the source is at your side. Yet this is exactly what stereo and its offshoots do.
We have seen that stereo provides incorrect pinna cues for a center-stage instrument – because
the sound is actually coming from side speakers. Stereo speakers that are 30-degrees off center
provide correct pinna, ILD, and ITD cues only for instruments that are exactly 30 degrees off
center. For a center-stage instrument, stereo speakers provide corrupted ILD/ITD cues (because
of crosstalk) and incorrect pinna cues (because speakers are located at the sides). All of the
localization cues have problems – and they are not even consistent with each other. Your
ear/brain did not evolve to deal with this jumble of inconsistent localization cues. This
inconsistency degrades the clarity and realism of all instrument locations except at 30 degrees.
Some listeners have difficulty detecting stable central phantom images – the images jump left or
right. If 5.1 or 7.1 recordings were mastered so that the center speaker alone reproduced center-
stage instruments, and side speakers alone reproduced side sounds, then all localization cues
would be consistent and correct. But this would amount to 3-channel mono and acoustic music
does not usually benefit from being mastered this way.

How Crosstalk Reduction with Closely Spaced Front Speakers Fixes Most Problems

Crosstalk Reduction. Electronic circuits that reduce crosstalk in stereo speakers have
been available for decades. Over the years, they have grown in the precision and sophistication
of their design, resulting in more complete crosstalk reduction and fewer unpleasant side effects.
Ambiophonics employs a laboratory-grade crosstalk reducer called RACE (Recursive
Ambiophonic Crosstalk Eliminator). In addition to PC versions, RACE has recently become
commercially available in certain products of TacT Audio, such as the TacT Ambiophonics
digital processor, TacT 2.2 XP, and TacT TCS. All crosstalk reduction circuits are based on the
same principle: Sounds from the right speaker are cancelled at the left ear by a carefully timed,
180-degree out-of-phase cancellation signal launched by the left speaker. Sounds from the left
speaker are cancelled at the right ear by a carefully timed, 180-degree out-of-phase cancellation
signal launched by the right speaker. If crosstalk cancellation is successful, then the left ear
hears only the left speaker and the right ear hears only the right speaker. The cancellation has
usually been done at a very broad range of middle frequencies, the size of the range being
adjustable in the more sophisticated cancellation circuits such as RACE. The timing of the
cancellation signals is quite precise and assumes that the speakers are equidistant from the
listener. The four presentations of a sound source that result from conventional stereo are now
reduced to the two presentations we experience when listening to live music. Crosstalk
reduction flattens the frequency response of both the center and side stage. Stereo crosstalk
produces ILD/ITD chaos for any location except at the speakers. This narrows the stage
considerably compared to the width heard by the recording microphones. Crosstalk reduction
eliminates ILD/ITD errors and restores the wide stage heard by the microphones – even when the
speakers are moved close together.
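
The cancellation principle can be sketched in a few lines. This is not the RACE implementation; it is a minimal recursive canceller under an idealized model assumed here for illustration: crosstalk reaches the far ear as a copy of the speaker signal scaled by a single gain and delayed by a whole number of samples, with no head shadow, room reflections, or band-limiting. All function names and parameter values are hypothetical.

```python
def cancel_crosstalk(left_in, right_in, gain, delay):
    """Each speaker feed is its input minus a scaled, delayed copy of the
    opposite speaker's feed (which already contains earlier corrections)."""
    if delay < 1:
        raise ValueError("delay must be at least one sample")
    n = len(left_in)
    left_spk = [0.0] * n
    right_spk = [0.0] * n
    for i in range(n):
        past_left = left_spk[i - delay] if i >= delay else 0.0
        past_right = right_spk[i - delay] if i >= delay else 0.0
        left_spk[i] = left_in[i] - gain * past_right
        right_spk[i] = right_in[i] - gain * past_left
    return left_spk, right_spk


def signals_at_ears(left_spk, right_spk, gain, delay):
    """Idealized acoustic path: near speaker plus scaled, delayed far speaker."""
    n = len(left_spk)
    left_ear = [0.0] * n
    right_ear = [0.0] * n
    for i in range(n):
        far_left = left_spk[i - delay] if i >= delay else 0.0
        far_right = right_spk[i - delay] if i >= delay else 0.0
        left_ear[i] = left_spk[i] + gain * far_right
        right_ear[i] = right_spk[i] + gain * far_left
    return left_ear, right_ear


# A click in the left channel only.
left_in = [1.0] + [0.0] * 15
right_in = [0.0] * 16
GAIN, DELAY = 0.7, 3  # hypothetical crosstalk gain and delay in samples

plain_left, plain_right = signals_at_ears(left_in, right_in, GAIN, DELAY)
spk_left, spk_right = cancel_crosstalk(left_in, right_in, GAIN, DELAY)
fixed_left, fixed_right = signals_at_ears(spk_left, spk_right, GAIN, DELAY)

print("right ear without cancellation:", max(abs(x) for x in plain_right))
print("right ear with cancellation:   ", max(abs(x) for x in fixed_right))
```

In this toy model the cancellation is exact only because the canceller's gain and delay match the acoustic path exactly. With mismatched values some crosstalk remains, which is consistent with the gain and delay being treated as user-adjustable settings.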

Closely-Spaced Front Speakers. RACE crosstalk reduction works best with front
speakers spread roughly 10-26 degrees apart. The front location of the speakers minimizes the
generation of false head-shadows – for example, when side speakers produce a center image –
and allows the speakers to produce correct pinna cues for instruments in the center third of the
stage. Moreover, closely-spaced front speakers – together with crosstalk cancellation – produce
correct ILD and ITD cues for the entire stage. In contrast, stereo-triangle speakers produce
correct ILD and ITD cues just at the 30-degree points where the speakers are located. Thus,
close speaker spacing and frontal location produce correct ILD, ITD, and pinna cues for the most
important sound locations: the central third of the stage, where soloists and most of the
instruments are usually located. Put differently, for closely-spaced front speakers, sound
localization cues are consistent (and correct) for the central third of the stage. For conventional
stereo, however, localization cues are inconsistent except at the 30-degree locations, where they
are consistent (and correct).
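
For a sense of what these angles mean in a room, the sketch below converts an included angle at the listener into a center-to-center speaker spacing. It is plain trigonometry, assuming the listener sits on the center line and that the listening distance is measured to the midpoint between the speakers; the function name and the 2.5 m example distance are hypothetical.

```python
import math


def speaker_spacing_m(listening_distance_m, included_angle_deg):
    """Center-to-center speaker spacing for a given angle seen by the listener.

    listening_distance_m is measured from the listener to the midpoint
    between the two speakers.
    """
    if listening_distance_m <= 0:
        raise ValueError("listening_distance_m must be positive")
    if not 0.0 < included_angle_deg < 180.0:
        raise ValueError("included_angle_deg must be between 0 and 180")
    half_angle = math.radians(included_angle_deg) / 2.0
    return 2.0 * listening_distance_m * math.tan(half_angle)


for angle in (10, 20, 26, 60):
    spacing = speaker_spacing_m(2.5, angle)
    print(f"{angle:>2} degrees at 2.5 m -> {spacing:.2f} m between speakers")
```

The 60-degree case is included only for comparison with the conventional stereo triangle.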

As discussed above, stereo produces incorrect head shadows for a center image – a
serious problem. One essentially eliminates this problem by moving the speakers close together.
Head shadows are small for close, front speakers, since the sound moves across only a small part
of the face on its way to the far ear. Unfortunately, the small head shadow produced by close
speakers is incorrect for side sources – for example, if you listen to a live violin on your extreme
right, your left ear will hear greatly diminished mid and high frequencies. But it is better for
head shadow to be correct for center images than side images and, as we shall see, there are ways
to provide the normal head shadow for side images without affecting the quality of the central images.

For side sources, closely-spaced front speakers driven by RACE produce essentially
correct ILD/ITD cues. This is true even for extreme side sources more than 30-degrees off
center. In contrast, equilateral stereo, because of crosstalk and problems with head shadows,
presents a muddle of ILD/ITD cues (and incorrect pinna cues) for extreme side sources. While
some recording engineers attempt to compensate for this, the fact that the overwhelming majority
of discs image widely when reproduced Ambiophonically suggests that this defect is not easily

Although RACE-driven closely-spaced front speakers correctly deliver recorded ILD
and ITD cues for the entire stage, pinna cues will be incorrect for side sources. In addition, as in
stereo, head shadows may be missing for side images when side sources are recorded on the disc.
Fortunately, sources at the extreme sides are less common in music than central sources.
Moreover, since side sources have a direct shot at the ear canal, the brain depends less on pinna
cues for side sources than on ILD/ITD cues. Getting pinna cues correct for the entire stage is a
nasty problem, but closely-spaced front speakers do a far better job than do stereo-triangle speakers.

One can supplement closely-spaced front speakers with side speakers to produce correct
pinna cues and correct head shadows for sources at the extreme sides. As a result, sources at the
left and right 90-degree points will appear routinely when called for in the recording. Installing
side speakers will be discussed in Part 2.

Stereo crosstalk produces comb filter effects. With equilateral stereo, combing can start
below around 1,000 Hz. As the speakers are moved closer together, the start of the combing
moves up in frequency. When speakers are as close as Ambiophonics specifies, combing occurs
at such high frequencies that it is either inaudible or virtually inaudible.

Close-spacing of front speakers is the real innovation of Ambiophonics over previous
crosstalk-reduction technologies. Stereo’s speakers produce head shadows which, like
fingerprints, vary greatly across people. Hence, effective crosstalk elimination is just not
possible for side speakers. (Recall, head shadow consists of mid and high frequency losses as
sound travels around the head to the far ear. To cancel a signal at the far ear, the cancellation
signal must be programmed with the same frequency response as the signal it is meant to cancel.
This is not possible since head shadows vary greatly.) Since effective crosstalk elimination is
not possible for side speakers, side-speaker head shadows will always interfere with center
images. If one moves speakers close together to eliminate side-speaker head shadows, it is still
desirable to eliminate crosstalk, that is, to make the left ear hear only the left speaker and the
right ear hear only the right speaker. Fortunately, head shadow from a front speaker is so slight
that it can be ignored by the crosstalk cancellation software. The software need consider only
the delay and attenuation as sound goes from a front speaker to the far ear – and this is quite
doable. Indeed, with RACE crosstalk reduction, the user adjusts delay and attenuation until the
widest stage is heard. Crosstalk cancellation is then maximum for that user. Thus, close-spacing
of front speakers makes effective crosstalk cancellation possible. Moreover, close spacing
eliminates side-speaker head shadows, satisfies the pinna for the center stage, gets ITD and ILD
cues correct for the entire stage, greatly reduces or eliminates audible combing, and eliminates
stereo’s unconvincing center imaging. And now that the speakers are close together, you
absolutely need crosstalk cancellation or the stage will be 20 degrees wide, if that!

An additional benefit of closely-spaced speakers is that they can be far from the side
walls of the listening room. If a speaker is too close to a side wall, the delay between the direct
sound from the speaker and the first reflection off the side wall that hits the listener may be short
enough to produce comb filter effects. Moving speakers away from the walls can reduce
combing. Reducing combing from side wall reflections and increasing the delay of side wall
reflections can improve imaging and reduce coloration of the sound – although side-wall
reflections have proven to be less harmful to Ambiophonics than to conventional stereo.
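
To make the comb-filter point concrete, the sketch below estimates where the first few notches fall when a single side-wall reflection travels a given extra distance compared with the direct sound. It assumes an idealized single reflection and a speed of sound of 343 m/s; the function name and the example path lengths are illustrative and are not taken from the Ambiophonics literature.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed room-temperature value

def comb_notch_frequencies(extra_path_m, count=3):
    """Return the first `count` notch frequencies (Hz) for one reflection
    that travels `extra_path_m` metres farther than the direct sound."""
    delay_s = extra_path_m / SPEED_OF_SOUND_M_S
    # Notches occur where the reflection arrives half a cycle out of phase:
    # f = (2k + 1) / (2 * delay), for k = 0, 1, 2, ...
    return [(2 * k + 1) / (2.0 * delay_s) for k in range(count)]

# Hypothetical extra path lengths for a speaker near a wall and far from it.
near_wall = comb_notch_frequencies(0.3)
far_wall = comb_notch_frequencies(1.5)
print([round(f) for f in near_wall])
print([round(f) for f in far_wall])
```

In this simple model a longer extra path pushes the notches lower in frequency and packs them more closely together, which is one way to picture how moving the speakers away from the side walls changes the coloration.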

One might view Ambiophonics this way: Conventional stereo and its offshoots 5.1 and
7.1 suffer from acoustically-produced localization distortion. Ambiophonics is designed to
greatly reduce this distortion. Ambiophonics not only lets you hear the music as it was actually
recorded but as it would have sounded if you were at the main microphone location during the
performance. You can enjoy the same wide-angle perspective as the main microphones or a
first-row center concert goer.

Audible Benefits of Ambiophonics

A large number of localization errors created by conventional stereo reproduction are
described above. Unfortunately, the current state of the art cannot order them according to the
size of the damage they do to the reproduced sound. But as a collection, they do great damage.
Perhaps the strongest evidence of the importance of localization errors is that Ambiophonics,
whose theory of design is based on reducing localization errors to very low levels, has an
enormous audible effect and in the direction predicted by the theory: Removing localization
errors increases (not decreases) the width and depth of the stage. Glasgal’s Tonmeister
Symposium paper (2005) shows how to make such predictions. Whether one likes the effect or
not, it provides evidence supporting the theory behind Ambiophonics.

Perhaps the most striking effect of Ambiophonics is that the reproduced stage is
significantly wider and deeper than the stage reproduced by equilateral stereo speakers. The
Ambiophonic stage extends way to the left of the left speaker, way to the right of the right
speaker, and is very deep. The stage is typically around 150-degrees wide and, under optimal
conditions, can extend to 180 degrees. To anyone accustomed to conventional stereo, which
limits the stage to the 60-degree angle between the speakers, Ambiophonics seems like magic.
Ambiophonics does not artificially increase the width and depth of the stage. Instead, it reduces
localization distortion to very low levels so that you can hear the width and depth of stage that
was actually recorded. There is much more localization data stored on ordinary LPs and CDs
than is recoverable using the stereophonic loudspeaker triangle.

A 150-degree stage is very wide. This does not mean that the instruments stretch across a
150-degree arc. The musical group can occupy a much smaller space but the reverberant field
will stretch to 150 degrees even though there are no surround speakers. This means that musical
groups play their music in a much larger reverberant field, just like in a concert hall. A concert
hall, however, creates a 360-degree reverberant field. Ambiophonics now offers RACE for four-
speaker (4.x) systems – two speakers up front and two speakers in the rear – that will reproduce a
360-degree reverberant field. TacT Audio has incorporated this software in new products. The
4.x methodology, with an option to add side speakers (6.x), is described in Part 2 below.

Listening Position. In stereo, if one moves closer to the speakers, a hole in the middle
eventually appears and one just hears the side speakers individually. If one moves away from
the speakers, the stage narrows until it is mono. In contrast, if one gets too close to the speakers
in an Ambiophonics system, the sound becomes ordinary stereo and, as one moves back even
several yards, the stage remains fully wide with very little change. It is thus not unusual to have,
say, six listeners along the sweet line enjoying the same wide stage.

Imaging. Ambiophonics pays so much attention to getting the spatial localization cues
correct and consistent that it is no surprise that the imaging is better than conventional stereo.
Horizontal imaging is more precise and so is depth imaging. Depth imaging refers to the
realistic depiction of some instruments as far away and others as close. If depth imaging is
precise, you can hear when one instrument is just in front of another. This is sometimes called
layered depth. Depth imaging in conventional stereo is typically poor. A single instrument may
have a reasonably precise horizontal location but the depth location is smeared. As a result, an
instrument’s image can sound one foot wide but can appear to extend vaguely somewhere from
10-14 feet ahead of you. The image is shaped something like a needle pointing at you, not a
point in space. The excellent depth imaging in Ambiophonics reduces the 10- to 14-foot needle
to almost a point. Perhaps because of the better horizontal but especially because of the better
depth imaging, listeners often report that a single instrument sounds more well-defined, more 3-
dimensional, and more palpably real in Ambiophonics than in stereo.

When many instruments play together, listeners often report that Ambiophonics produces
greater clarity and greater realism of the group forces than stereo. The increased subjective
clarity probably results from the individual instruments being so well-defined spatially. One can
easily focus on a single instrument or singer in the group. In everyday hearing this is known as
the cocktail party effect. An audiophile hearing such detailed reproduction of large instrumental
groups might think that Ambiophonics reduces harmonic or intermodulation distortion. In fact,
Ambiophonics reduces localization distortion.

Even with excellent speakers, closely matched in frequency response, room acoustics
may affect one speaker differently than the other. Hence, measured at the listening spot, the
frequency response of one speaker may differ greatly from the other. This will degrade and
smear imaging. But if the speakers are located close together, as they are in Ambiophonics,
room acoustics are likely to affect both speakers identically. This benefits imaging. Owners of
TacT room correction devices that include RACE crosstalk reduction software have an additional
benefit. The room correction function does work individually on each speaker, but their
proximity helps make their frequency responses more similar at the listening position.

Imaging Empty Space. Consider a string quartet recorded in a hall. Sound reflections
from room surfaces fill the hall with ambience. So the space between and around the
instruments is not empty but filled with reverberation. Like the direct sound of an instrument,
reverberation is a sonic event that can be imaged poorly if localization cues are incorrect. We
just do not think of reverberation as a sonic event to be imaged because it is so diffuse and
unfocused. Nevertheless, the 60-degree wide reverberant field of equilateral stereo expands to
150- to 180-degrees with Ambiophonics. Moreover, some recordings played back in stereo have
a reverberation problem in which a soloist is surrounded by a halo of reverberance that is audibly
denser and louder than the rest of the reverberant field. Ambiophonics distributes this halo and
the field becomes more homogeneous. This suggests that localization distortion can audibly
degrade both the sound of instruments and the reverberant field surrounding them.
Ambiophonics’ reproduction of vivid, three-dimensional instruments may result not only from
the reduced localization distortion of the direct sound of the instruments but also from the
reduced localization distortion of the frontal reverberant field. Both are now being delivered as
they were heard by the recording microphones.

Ambiophonics produces a “you-are-there” experience rather than “they are here.” It
transports you to the recording site whereas conventional stereo seems to transport the
instruments to your listening room. Listen to Ambiophonics long enough for your ear/brain to
accommodate to the larger stage, improved imaging, and greater clarity. Listen long enough to
get used to the sound. Then press the button that returns the sound to conventional stereo. The
shock of the change will be like a slap in the face.

Conclusions. The reduction of localization distortion by Ambiophonics is such a
profound change in sound reproduction that it can be easily heard on the most modest audio
equipment. Indeed, if you use RACE crosstalk reduction on music reproduced by the 1-inch
speakers built in to your laptop computer, you will hear the Ambiophonic stage. So large is the
change produced by Ambiophonics that one accustomed to conventional stereo should allow the
ear/brain several days to accommodate to Ambiophonic sound before evaluating it. And
remember that Ambiophonics reduces the artificial bass boost produced by conventional stereo.
Hence, Ambiophonics will have leaner and more lifelike bass on most recordings than stereo.
Owners of adjustable subwoofers or TacT room correction processors can easily adjust bass to

Recording engineers should use Ambiophonics with their studio monitoring speakers if
they want to hear what the microphones hear. Just like home speakers, studio speakers used in a
stereo triangle produce such localization distortion that an engineer cannot hear it when his
microphone placement for a piano produces a piano that sounds 70-feet wide when reproduced
on a system without localization distortion.

We are fortunate to have a legacy of a half century of stereo recordings – a cultural
treasure. Stored on those CDs and LPs is a wealth of localization information that we cannot
hear until we use playback systems having very low levels of localization distortion. Converting
one’s system to Ambiophonics will provide the opportunity to listen to old friends with new ears.

Crosstalk-Reduction Circuits/Software

When a crosstalk-reduction device launches a cancellation signal from the left speaker to
cancel the right-speaker signal at the left ear, the cancellation signal is unfortunately also heard
by the right ear. Early crosstalk reducers, such as the Carver/Sunfire Hologram and Lexicon’s
Panorama, were satisfied to cancel the right-speaker signal at the left ear and to ignore the fact
that the cancellation signal was heard by the right ear. Current laboratory-grade crosstalk-
reduction devices such as RACE are designed to cancel the cancellation signal arriving at the
right ear with a cancellation signal launched by the right speaker. Then the cancellation signal
launched by the right speaker is heard by the left ear and must be cancelled at the left ear. And
so on. A crosstalk reducer designed to cancel all the cancellation signals is called recursive.
RACE is recursive. A cancellation signal is cancelled by a signal that is about 2.5 dB softer
when launched by the speaker. In RACE, this value is adjustable. The cancellation of
cancellation signals could go on forever. But the amplitude of the cancellation signal decreases
at each step and is finally terminated by RACE when the digital value of the sample is all zero.
It is usual in concert halls to refer to reverberation time, which is the time it takes for the
reverberation to fall 60 dB after a single musical note is terminated. 60 dB down is almost
inaudible. Humans can easily detect differences in reverberation time and the same applies to
crosstalk cancellation. The cancellation must continue until the cancellation signals are no
longer of psychoacoustic significance.
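
The recursion described above can be pictured as a pair of cross-coupled feedback paths: each speaker feed is its own input minus a delayed, attenuated copy of the other speaker's feed. The sketch below only illustrates that idea under simplifying assumptions (a whole-sample delay, no band limiting, plain Python lists). It is not the actual RACE code, and the function and parameter names are hypothetical. Three samples at a 44.1 kHz sample rate is about 68 µs, close to the typical delay discussed in this paper.

```python
import math

def recursive_crossfeed(left, right, delay_samples=3, attenuation_db=2.5):
    """Cross-coupled recursive canceller sketch.

    Each output is the input minus an attenuated copy of the OTHER output
    taken `delay_samples` earlier, so every cancellation signal is itself
    cancelled one step later at a lower level.
    """
    gain = 10.0 ** (-attenuation_db / 20.0)  # dB loss -> linear factor
    n = len(left)
    out_l, out_r = [0.0] * n, [0.0] * n
    for i in range(n):
        j = i - delay_samples
        prev_r = out_r[j] if j >= 0 else 0.0
        prev_l = out_l[j] if j >= 0 else 0.0
        out_l[i] = left[i] - gain * prev_r
        out_r[i] = right[i] - gain * prev_l
    return out_l, out_r

def steps_to_decay(attenuation_db=2.5, target_db=60.0):
    """Number of recursion steps before the cancellation signal has
    fallen by at least `target_db`."""
    return math.ceil(target_db / attenuation_db)

# A single click in the left channel only.
impulse = [1.0] + [0.0] * 11
silence = [0.0] * 12
out_l, out_r = recursive_crossfeed(impulse, silence)
```

With a click in the left channel only, the right feed carries an inverted copy 2.5 dB down after one delay, the left feed carries a non-inverted copy 5 dB down after two delays, and so on. At 2.5 dB per step, `steps_to_decay()` gives 24 steps for a 60 dB fall, which at roughly 70 µs per step takes under 2 milliseconds.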

RACE PC configurations can be downloaded without charge from the Ambiophonics
web site:


Part 2: Installing 2-Speaker, 4-Speaker, and 6-Speaker Ambiophonic Systems

The basic Ambiophonics system consists of two closely-spaced front speakers driven by
RACE crosstalk-reduction software. The two speakers together are called an Ambiodipole.
Virtually any type of speaker can be used to form an Ambiodipole. Like stereophiles,
Ambiophiles can have preferences for particular speakers. There is never a need for a center
speaker. In fact, a center speaker substantially degrades performance. Ambiophonics can be
used with four speakers, two in front and two in back. One can add side speakers to the four-
speaker system resulting in a 6-speaker Ambiophonics system. The present instructions will
cover 2-speaker, 4-speaker, and 6-speaker systems.

RACE can be downloaded to a PC without charge from the Ambiophonics web site or
obtained in commercial products from TacT Audio. Instructions for both will be provided
below. Over the years, RACE has been improved and the present installation instructions apply
to all the versions currently on the web site and to all current TacT Audio products that include
RACE. RACE software in TacT products is engaged when the product is set to the XTC
(crosstalk cancellation) mode.

2-Speaker Systems

Speaker and Room Set Up. The two speakers forming the Ambiodipole should be
located about 10-26 degrees apart measured between the midrange drivers. Audible differences
will be small as speaker width is varied within this range and the user should select what sounds
best. On the other hand, the usual stereo equilateral triangle speaker spread of 60 degrees will
seriously degrade the sound. Wider spreads are even worse. The speakers can flank a TV
monitor and a center speaker is never required. Just to get it done, simply place the speakers 15-
20 degrees apart and go on to the rest of the set up. You can always experiment later with a
different speaker angle. The speakers should be the same distance to the listener measured to the
nearest inch, if possible.
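
As a setup aid, the angle between the speakers can be turned into a spacing for a given listening distance with simple trigonometry. The sketch below assumes the listener sits on the center line and that the listening distance is measured from the listener to the midpoint between the two midrange drivers. The function names and the 8-foot example distance are illustrative.

```python
import math

def speaker_spacing(listening_distance, angle_deg):
    """Spacing between the two midrange drivers that subtends
    `angle_deg` at a listener `listening_distance` from the midpoint
    between them (same length unit in and out)."""
    return 2.0 * listening_distance * math.tan(math.radians(angle_deg) / 2.0)

def speaker_angle(listening_distance, spacing):
    """Inverse of speaker_spacing: angle in degrees subtended by the pair."""
    return math.degrees(2.0 * math.atan(spacing / (2.0 * listening_distance)))

# Spacing needed at an 8-foot listening distance for several angles.
for angle in (10, 15, 20, 26):
    print(angle, "degrees ->", round(speaker_spacing(8.0, angle), 2), "feet")
```

Because the spacing scales in proportion to the distance, sitting twice as far back calls for twice the spacing to keep the same angle.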

While room treatment is always a good idea, Ambiophonics is much less damaged by
room reflections than conventional stereo or 5.1. One reason is that the delays involved in
RACE are less than 100 microseconds whereas even a very early room reflection is delayed by
milliseconds. The richness of the localization cues provided by Ambiophonics swamps the
effects of most room reflections. This is similar to a concert hall, where the richer, longer-lasting
hall ambience masks the clutter of short-delay reflections from seat backs and heads. So room
treatment is always desirable but not critical in the case of Ambiophonics and even less so in
Panambiophonics (below).

Directional speakers produce fewer room reflections than speakers with wide dispersion.
So directional speakers are well suited to Ambiophonic (and stereophonic) reproduction.
Nevertheless, wide-dispersion speakers can produce excellent results.

TacT owners wishing to use XTC will also be doing room correction. When doing room
correction measurements, XTC is automatically disabled so it makes no difference whether the
XTC mode is on or off.

RACE Adjustment Parameters. When a stereo speaker produces sound, the close ear
hears it first and, with some time delay and attenuation, the far ear hears it as crosstalk. The
amount of delay and attenuation depends on the angle to the speakers and the distance between
one’s ear canals. Since crosstalk cancellation requires the cancellation signal to be the correct
magnitude and to arrive at an ear exactly when the crosstalk does, one must be able to adjust the
crosstalk canceller to the particular attenuation and delay required by the speaker angle
employed and, less importantly, to the size of the listener’s head. For both RACE downloaded to
a PC and TacT RACE, there are three user adjustments:

1. Delay. Delay represents the time difference in microseconds (µs) between a sound’s
arrival at the near ear and its arrival at the far ear. The range of adjustment in RACE
PC or TacT is from 20µs to 210µs. The front panel display for TacT’s XTC mode
measures delay in milliseconds, not microseconds. A millisecond is 1/1,000 second.
A microsecond is 1/1,000,000 second. There are a thousand microseconds in a
millisecond. Hence, the range of adjustment shown on the TacT display is from .02
to .21 milliseconds, abbreviated by TacT as .02 to .21 msec. 70µs (.07 msec) is
average for most installations. Another way to view Delay is that a Delay of, say,
70µs means that when a speaker launches a signal to be cancelled at the far ear, the
speaker near that ear will launch the cancellation signal 70µs later. A Delay value
that is correct for a listener sitting centered will be incorrect if the listener moves off
center. But there is no need to change Delay with reclining, head rotation, nodding,
tilting, normal forward and back motion along the center line, or additional seating
along the center line.

2. Attenuation (Spread Factor). Attenuation represents the level loss, in dB, between
the near and far ears. (The level loss is due to longer path and facial absorption.)
RACE for the PC shows an adjustment range from -1.5 dB to -4 dB. It is easier if one
drops the minus sign and thinks of these values as positive. We will refer to the
Attenuation range as 1.5 dB to 4 dB. An Attenuation of 1.5 dB means that there is a
1.5 dB loss in level between the near and far ears. In the TacT implementation of
RACE, Attenuation is called Spread Factor. Spread Factor uses a different unit of
measurement. Spread Factor values range from 0-100 and are inversely related to
Attenuation values. The following table shows the conversion between Spread Factor
and Attenuation.

Spread Factor    Attenuation

100              0.5 dB
90               1.5 dB
80               2.5 dB
70               3.5 dB
60               4.5 dB
50               5.5 dB
--               --

Each unit of Spread Factor = 0.1 dB. The range of Spread Factor adjustment, 0-100,
spans 10 dB. This corresponds to an Attenuation range of 0.5 dB to 10.5 dB.
A Spread Factor of 80 corresponds to an Attenuation of 2.5 dB. To convert Spread
Factor to Attenuation, move the decimal point in the Spread Factor left by one digit
(80 becomes 8.0) and subtract the value from 10.5. RACE PC does not provide
Attenuation values higher than 4.0 dB. Too low an Attenuation, 0.5-2.0 dB (Spread
Factor 85-100), produces unpredictable effects such as buzzing or abnormal
localization. A Spread Factor between 80-85 (Attenuation between 2.0-2.5 dB)
usually produces the best results. Another way to interpret Attenuation is that an
Attenuation of, say, 2.5 dB means that when a speaker launches a signal to be
cancelled at the far ear, the speaker near that ear will launch a cancellation signal that
is 2.5 dB softer. (The signals will have the same level when they arrive at the far ear.)

3. Algorithms. RACE crosstalk reduction operates between DC and 20,000 Hz. But
this range can be reduced by the control called Algorithms. RACE software in the
TacT 2.2 XP provides 10 algorithms. The following table shows, for each algorithm,
the frequency range in which crosstalk reduction operates.

Algorithm    Operating Range      Algorithm    Operating Range

A-1          DC - 20,000 Hz       B-1          DC - 20,000 Hz
A-2          200 - 9,000 Hz       B-2          200 - 20,000 Hz
A-3          300 - 9,000 Hz       B-3          300 - 20,000 Hz
A-4          400 - 9,000 Hz       B-4          400 - 20,000 Hz
A-5          500 - 9,000 Hz       B-5          500 - 20,000 Hz

A-series algorithms affect both the high- and low-frequency limits of the operating
range. B-series algorithms affect only the low-frequency limit. A-1 and B-1 are
identical in effect. Neither restricts the operating range. Having A-1 in the A series
facilitates comparing full range operation to the other A settings. Having B-1 in the B
series facilitates comparing full range operation to the other B settings. The
algorithms in the PC version and the TacT version of RACE are identical in concept
but RACE PC provides fewer algorithms.
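
The unit conversions described in the three items above are easy to get wrong by a decimal place, so a small helper may be useful. The sketch below encodes the stated rules (Attenuation is 10.5 minus one tenth of the Spread Factor; the TacT display shows microseconds divided by 1,000) together with the TacT 2.2 XP algorithm ranges from the table. The function and variable names are made up for illustration and do not correspond to any RACE or TacT interface.

```python
def spread_factor_to_attenuation_db(spread_factor):
    """TacT Spread Factor (0-100) -> Attenuation in dB (10.5 down to 0.5)."""
    if not 0 <= spread_factor <= 100:
        raise ValueError("Spread Factor must be between 0 and 100")
    return round(10.5 - spread_factor / 10.0, 2)

def attenuation_db_to_spread_factor(attenuation_db):
    """Attenuation in dB (0.5-10.5) -> TacT Spread Factor (100 down to 0)."""
    if not 0.5 <= attenuation_db <= 10.5:
        raise ValueError("Attenuation must be between 0.5 and 10.5 dB")
    return round((10.5 - attenuation_db) * 10.0)

def delay_us_to_tact_msec(delay_us):
    """Delay in microseconds -> value shown on the TacT display (msec)."""
    return delay_us / 1000.0

# Operating range (Hz) of each algorithm; 0 stands for DC.
ALGORITHM_RANGE_HZ = {
    "A-1": (0, 20000), "B-1": (0, 20000),
    "A-2": (200, 9000), "B-2": (200, 20000),
    "A-3": (300, 9000), "B-3": (300, 20000),
    "A-4": (400, 9000), "B-4": (400, 20000),
    "A-5": (500, 9000), "B-5": (500, 20000),
}

print(spread_factor_to_attenuation_db(82))
print(attenuation_db_to_spread_factor(2.5))
print(delay_us_to_tact_msec(70))
```

Note that RACE PC offers Attenuation only between 1.5 dB and 4 dB, so not every Spread Factor has a RACE PC equivalent.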

How to Use the Three Adjustments. Even more than stereo, Ambiophonics is a tweak-
and-listen enterprise. The basic principle is this: Normally all tweaks and changes in settings
will be heard only as slight changes in the width of the stage. When the stage is at the widest,
crosstalk reduction parameters have been optimized for the conditions at hand and crosstalk
reduction is at its maximum. But you may prefer less than maximum crosstalk reduction for
certain recordings or to suit your taste. Begin with the following settings and experiment from
there: Delay = 70-80µs (TacT Delay = .07-.08 msec), Attenuation = 2.3 dB (TacT Spread Factor
= 82), and Algorithm B-1. Neither Delay nor Attenuation (Spread Factor) settings are all that
critical if not pursued to excess.

RACE assumes that the sound from a speaker arrives at the far ear with a fixed and
predictable delay and loss in level. These two values depend on the angle to the speakers and the
size of one’s head. As the angle to the speakers gets wider or one’s head gets larger, the
attenuation gets larger and the delay gets longer. With the speakers about 20-degrees apart
(measured from the midrange drivers), a Delay of about 70-80µs is usually correct for most
people and speakers. Changing the Delay to, say, 60µs or 90µs will probably not be audible.
You should experiment as described below to get the stage you like. Attenuation is usually
correct when set to around 2.3 dB (Spread Factor around 82) but again try varying it to get the
widest stage. If you have a recording of a string quartet and the violin and cello appear to be 200
feet apart, consider changing Delay and Attenuation (Spread Factor) settings for this recording.
The 200-foot wide quartet or a 70-foot wide piano indicates that the recording contains interaural
loudness and/or time difference cues, picked up by the microphones, which the recording
engineer could not hear while mastering because monitoring was done in conventional stereo.
Nevertheless, you are hearing what the microphones heard even if the engineer did not.
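
For a rough sense of why a Delay near 70-80µs suits speakers about 20 degrees apart, one can estimate the extra path to the far ear with a simple straight-line model: the ear spacing times the sine of one speaker's angle off the center line, divided by the speed of sound. This is only a back-of-envelope sketch. The 0.15 m ear spacing and the 343 m/s speed of sound are assumed values, the model ignores the path around the head, and RACE itself is adjusted by ear rather than by formula.

```python
import math

SPEED_OF_SOUND_M_S = 343.0   # assumed
EAR_SPACING_M = 0.15         # assumed typical distance between ear canals

def estimated_delay_us(speaker_angle_deg, ear_spacing_m=EAR_SPACING_M):
    """Straight-line estimate of the near-ear/far-ear arrival difference,
    in microseconds, for a pair of speakers `speaker_angle_deg` apart."""
    half_angle = math.radians(speaker_angle_deg / 2.0)  # one speaker off center
    extra_path_m = ear_spacing_m * math.sin(half_angle)
    return extra_path_m / SPEED_OF_SOUND_M_S * 1e6

for angle in (10, 20, 26, 60):
    print(angle, "degrees apart ->", round(estimated_delay_us(angle)), "microseconds")
```

Under these assumptions the estimate for 20 degrees falls between 70 and 80 microseconds, and it grows with both the speaker angle and the ear spacing, in line with the trend described above.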

Delay and Attenuation (Spread Factor) adjustments are useful for adjusting the width of
the stage. They have the same audible effect, but both must be used. Normally, one changes the
two controls about the same percentage. Imagine that Delay and Attenuation are controlled by
two close knobs. You connect the knobs with a rubber band so that when you turn one, the other
turns similarly. But feel free to experiment. To achieve the maximum stage width, play a test
track or find a source (like an LP) that can be connected so there will be a signal on just one of
the two stereo channels. Play the track with XTC engaged and then move closer to and farther
from the speakers along their center line until the source is as far to the side as you can get it.
That is the best listening position – the maximum crosstalk reduction – given the speaker angle,
head size, and Delay and Attenuation settings used. Repeat the single-sided experiment with the
other side, checking to see that the audible stage is symmetrical. A nonsymmetrical stage means
that the speakers have different distances to the listening location or are not identical in level or
frequency response. It may be faster to use a program source with both channels operating while
you adjust settings for maximum stage width. But you are assuming, perhaps incorrectly, that
the recording was mastered to have a symmetrical stage. If you are hearing the widest possible
stage but interior decoration requires moving the listening location farther from or closer to the
speakers, move the listening chair to where you want it and change the distance between the
speakers to maintain the same speaker angle used at the original listening location. If the speaker
angle is the same at the new listening location, you will not have to change the Delay and
Attenuation settings. If you do not change the distance between the speakers, then moving the
listening chair has, in effect, changed the speaker angle and you must change the Delay and
Attenuation settings. Delay/Attenuation settings and speaker angle must jibe – that is, for a
given speaker angle (and head size), only one Delay and one Attenuation setting will produce the
widest stage. If you change the speaker angle, you must change Delay and Attenuation. If you
change Delay and Attenuation, you must change the speaker angle. If Delay and Attenuation are
correct for the speaker angle, you will have the longest sweet centerline coupled with the widest
stage. This means that listeners can sit in front of you and behind you and they will hear the
same stage you hear. If your head suddenly shrinks to the size of a child’s head, you can reduce
Delay/Attenuation without moving the speakers or reduce the speaker angle without changing
Delay/Attenuation. Either way, play with the adjustment until your new head hears the widest
stage. Within reason, Delay/Attenuation settings are not critical.

Be careful with speaker controls or equalizers. When RACE calculates cancellation
signals, it assumes that the speakers have identical frequency response at the listening location.
If the speakers are not identical, the cancellation signals will be incorrect. So the speakers must
have the same response at the listening location and the location must be equally distant from the
two speakers.

As an example of using Delay and Attenuation (Spread Factor), you might decide to
flank your new TV screen with your speakers and find that the angle between the speakers is
wider than before, perhaps 25-30 degrees. Adjust Delay and Attenuation, changing them by
similar percentages, to obtain the widest stage. For speakers 25-30 degrees apart, try the larger
Delay/Attenuation settings first. On the other hand, if speakers are very closely spaced, again
use Delay and Attenuation to produce the widest stage. When the stage is at its widest, crosstalk
cancellation is at its maximum. There is no theoretically ideal angle for the speakers within
reason (within about 10-26 degrees). But the 60-degree angle used in conventional stereo is far
too wide and will seriously degrade performance. RACE control settings cannot compensate for
such a wide angle. Moreover, pinna errors will return.

In some versions of RACE, including TacT’s implementation of RACE, it is possible to
restrict the range of frequencies in which crosstalk reduction operates. This is done with the
Algorithms setting. At one time, it was thought necessary to restrict the frequency range of
crosstalk reduction. This is no longer the case. Suppose RACE is allowed to operate at low bass
frequencies, say below 90 Hz. Below 90 Hz, there is no meaningful attenuation, no meaningful
bass loss, as sound travels around the head from the ear closest to the speaker to the far ear. the
bass boost of the stereo triangle. Hence, what RACE does when it operates below roughly 150
Hz is acoustically meaningless. It is also acoustically meaningless when RACE operates above
roughly 10 kHz. High frequency wavelengths are so small that crosstalk cancellation is
essentially uncontrollable and random – resembling the randomness of concert hall reverberation
but infinitely lower in level. Any narrow peaks or dips produced by RACE when operating at
high frequencies are unlikely to be audible. Moreover, such peaks and dips at much lower
frequencies are inherent in stereophonic reproduction and these peaks and dips are removed by
RACE. So why provide an Algorithm control if the full range setting is almost always the best
choice? The control was provided because it was easy to create and because it might be fun to
experiment with it – and it has been used in laboratory studies. A-1 and B-1 have the same effect
and normally work best for general use. A-5 and B-5 produce a noticeably narrower stage.

4- and 6-Speaker Systems

Ambiophonic 4.x and 6.x systems provide a 360-degree sound field and slightly widen –
perhaps by 5 to 10 degrees – each side of the stage compared to the basic 2.x Ambiophonic
system. The 4.x and 6.x systems were developed not only to enhance reproduction in music-only
systems but also to provide surround sound for movies. A 4.x system contains two closely-
spaced front speakers and two closely-spaced rear speakers. Two different RACE programs are
needed, one for the front speakers and one for the rear speakers. One can reproduce 4.x or 5.x
DVDs or SACDs. One never needs a center speaker – front or rear. Set your player to the no-
center-speaker setting for both movies and music. The result is called Panambiophonics or
Panambio. If music recordings or movies are made with a direct sound, 180-degree wide rear
stage, then it is possible to hear voices, instruments, or sound effects anywhere in the horizontal
plane. Imaging at the extreme sides is easy to achieve, unlike 5.1 where such localization is
impossible. Normally, one can sit anywhere on the line between the two pairs of speakers and
experience the full circle of sound. With four speakers going, even off-center listening is usually
more enjoyable than off-center 5.1 listening since, with 4.x, it is harder to identify a speaker’s
location and one can still separate front sources from rear.

It is possible to make a four-channel music recording that records a 180-degree rear circle
of hall ambience for the rear two speakers and a 180-degree front circle of ambience (plus direct-
sound instruments) for the front two speakers. Such a recording can be reproduced
Panambiophonically to create a domestic concert hall which lacks only ceiling reflections to
mimic fully the hall in which the performance was recorded. Such recordings are now being
made on an experimental basis. The rear channels can include rear instruments for unusual
musical effects or sound effects if the recording is a movie soundtrack.

One can try Panambiophonics without cost by downloading RACE from the
Ambiophonics site and configuring Audiomulch with dual RACE chains in a PC. The RACE
chains look the same but will have different inputs and outputs for 4.x and the settings for the
chains should be slightly different. One can also obtain Panambiophonics commercially from
TacT Audio, which includes it in their TCS and Ambiophonics models. With either RACE PC
or TacT RACE, provision is made for switching easily between 2.x (Ambiophonic) and 4.x
(Panambiophonic) modes. The TacT boxes come with demonstration 4.0 DVD and SACD
surround music samplers.

A 6.x system simply adds side speakers to the 4.x configuration. The side speakers are
added to tickle the ears with side pinna cues and provide a head shadow when a side source is
reproduced. Since the pinna and head provide localization cues above roughly 200 Hz, side
speakers need operate only in that range. Side speakers produce only a very small improvement
in side imaging over a 4.x system. If you are engrossed in music or film, you may not notice the
improvement. It is provided for perfectionists. Ideally, the side speakers should be turned off or
limited to frequencies above 1,000 Hz when recordings made with the Ambiophone or other
dummy-head recording microphones are being played. (An Ambiophone is a special
microphone/recording set up using a baffled four-microphone dummy head without outer ears
designed to maximize realism when played back on Ambiophonic systems.)

The four speakers in a 4.x system will clearly outperform a 5.1 surround system. The 4.x
system will provide seamless surround without hot spots, without sonic gaps between surround
speakers, without localization to surround speakers, and with precise imaging at all angles
including side and rear imaging. Again, even though 4.x does not use side speakers, 4.x
provides better side imaging than 5.1, 7.1, or 10.2 systems. 6.x Ambiophonics is only a hair
better at side imaging than 4.x. Moreover, 4.x and 6.x Ambiophonic systems, with the front
speakers flanking the screen, provide a fine center image for dialogue without needing a center
speaker. But you do need to set the CD/DVD player to the no-center-speaker setting. The player
will then split the center signal and add it to the main left and main right speakers. Similarly, if
DTS or Dolby decoding is done in a processor, set it to divide the center channel equally
between the left and right front channels.
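
The center-channel split described above can be pictured with a minimal sketch. It assumes each channel is a list of samples of equal length; the function name `fold_center` and the default gain of about 0.707 (-3 dB, a common downmix choice) are illustrative assumptions, not values taken from any particular player or decoder.

```python
import math


def fold_center(left, right, center, gain=1 / math.sqrt(2)):
    """Divide the center channel equally between left and right.

    `gain` is an assumed value (-3 dB); real players and decoders
    may apply a different level when the center speaker is disabled.
    """
    if not (len(left) == len(right) == len(center)):
        raise ValueError("all channels must have the same length")
    # The same scaled center signal is added to both front channels,
    # so center content still images midway between the speakers.
    new_left = [l + gain * c for l, c in zip(left, center)]
    new_right = [r + gain * c for r, c in zip(right, center)]
    return new_left, new_right
```

Because both front channels receive the identical center contribution, dialogue that was carried only in the center channel remains a phantom center image.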

The only TacT Audio unit that currently provides for side speakers is the TCS. To
implement side speakers for the other TacT units, pass the front stereo pair through a Pro Logic
decoder in an outboard receiver or processor feeding an amplifier. Attach your left side speaker
to the left main speaker output and your right side speaker to the right main speaker output. Tell
the Pro Logic unit that five full-range speakers will be used. That way, only side signals (labeled
front left and front right) will be sent to the side speakers. The Pro Logic box will provide a
volume control. See below for more details.

Two-Channel Media with 4 or 6 speakers

When playing two-channel media such as CDs or LPs, feeding a RACE signal to the rear
speakers noticeably enhances stage width and depth – and makes all the various adjustments
seem less critical. The front and rear RACE signals should be similar but not identical in order
to avoid audible peaks and dips. Using the rear speakers on two-channel media ensures that not
all hall ambience comes unrealistically from the front. For a side source, rear speakers provide a
second, quite different rearward pinna pattern that combines with the same-side frontal pinna
pattern. This allows the brain to localize more easily to the extreme side. The exact mechanism
of why this works is unknown. One possibility is that the final pattern that reaches the ear canal
averages out relatively flat and therefore seems to come from a direction where the ear canal has
a direct view of the sound source which, of course, is at the side. Another possibility is that this
novel pattern is unknown to the brain and thus is ignored. With two-channel media, using two
speakers directly behind the listening position increases stage width roughly 15 degrees and
enhances the feeling of depth and spaciousness. But it can also slightly reduce precision of the

One can add side speakers to a 4.x system, making it a 6.x system. Side speakers do not
use RACE. Instead, one can feed side speakers the unprocessed stereo signal to slightly widen
the stage and to slightly improve side imaging. Ideally, one puts the stereo signal through a Pro
Logic decoder and tells the decoder that there is a center speaker and rear speakers. Here, the
decoder will send a signal to the side speakers only when there is a strong, one-sided input.
Moreover, when stereo signals are applied to speakers 180 degrees apart, as are the side
speakers, a hole-in-the-middle develops and therefore the side speakers will have little or no
effect on the central part of the stage. Again, one adjusts the level of the side speakers to widen
and deepen the stage. In general, however, rear speakers work a bit better than side speakers at
widening the stage. But this usually requires two versions of RACE as in PC systems, TacT
TCS, and TacT Ambiophonics. While RACE-driven rear speakers widen and deepen the stage,
and add rear ambience, side speakers (with or without Pro Logic) can only help stabilize imaging
at the extreme sides where many movies have sound effects. You can use side and rear speakers
at the same time without problems as long as levels are not excessive. Again, as in 4.0, if the 2-
channel recording was made with an Ambiophone or with another dummy head, side speakers
may not be needed. If needed, they will work better if limited to frequencies above 1,000 Hz.
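
Limiting the side speakers to frequencies above 1,000 Hz amounts to high-pass filtering their feed. The sketch below is one simple way to picture that, assuming a first-order (6 dB per octave) filter and a 48 kHz sample rate. Both are illustrative assumptions, as is the function name `high_pass`; an actual crossover or processor would likely use a steeper filter.

```python
import math


def high_pass(samples, cutoff_hz=1000.0, sample_rate=48000.0):
    """First-order high-pass filter (discretized RC network).

    The 1,000 Hz cutoff follows the text; the filter order and the
    sample rate are assumptions made for this sketch.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = []
    prev_x = 0.0
    prev_y = 0.0
    for x in samples:
        # Pass changes in the input; let steady (low-frequency) content decay.
        y = alpha * (prev_y + x - prev_x)
        out.append(y)
        prev_x, prev_y = x, y
    return out
```

With these settings a 100 Hz tone is strongly attenuated while a tone several octaves above the cutoff passes nearly unchanged.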

It is possible for owners of the TacT 2.2 XP to use its single RACE to drive 4 speakers.
But because the front and rear signals are then identical, an audible suckout (a dip in the
frequency response) might occur if one is unlucky.

Speaker and Room Set Up

The preferred room set up for 2-, 4-, and 6-speaker Ambiophonics is the same. The rear
speakers need not be the same brand or type as the front speakers. They do not have to be at the
same angle as the front speakers or at the same distance from the listener. The settings for delay
and attenuation (spread) should be set a bit differently for the rear. The reason for locating front
and rear speakers at somewhat different distances and angles is that, when you set delay and
attenuation on front and rear to maximize stage width, the settings will be different. If everything
is too symmetrical, some averaging potential is lost. Normally, the level of the rears should be
about the same as the level of the fronts. But feel free to experiment with all the variables. The
goal is to produce the widest front stage possible for music or for movies that have sound effects
at the sides. When front and rear settings have produced the widest front stage, crosstalk
reduction is at a maximum. Once you know what settings produce maximum crosstalk
reduction, feel free to back off the settings to suit your taste or the peculiarities of a particular
recording. If you rotate to face the rear speakers, you should hear the same stage but reversed.
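
The delay and attenuation settings discussed above can be pictured with a simplified sketch of a recursive crosstalk canceller: each output channel has an inverted, attenuated, delayed copy of the opposite output channel added to it. This is only an illustration under stated assumptions. The function name `race_sketch`, the default values, and the broadband operation are assumptions, and any band-limiting or other refinement found in actual RACE products is omitted.

```python
def race_sketch(left, right, delay_samples=3, attenuation_db=2.5):
    """Simplified recursive crosstalk cancellation on two channels.

    `delay_samples` and `attenuation_db` stand in for the delay and
    attenuation settings; the defaults are hypothetical, not
    recommended values.
    """
    if delay_samples < 1:
        raise ValueError("delay_samples must be at least 1")
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    gain = 10 ** (-attenuation_db / 20)  # decibels to linear factor
    n = len(left)
    out_l = [0.0] * n
    out_r = [0.0] * n
    for i in range(n):
        j = i - delay_samples
        past_l = out_l[j] if j >= 0 else 0.0
        past_r = out_r[j] if j >= 0 else 0.0
        # Subtract the delayed, attenuated opposite OUTPUT (not input),
        # which is what makes the cancellation recursive.
        out_l[i] = left[i] - gain * past_r
        out_r[i] = right[i] - gain * past_l
    return out_l, out_r


# Front and rear pairs processed with slightly different settings
# (hypothetical values), so the two signals are similar but not identical.
impulse = [1.0] + [0.0] * 15
silence = [0.0] * 16
front = race_sketch(impulse, silence, delay_samples=3, attenuation_db=2.5)
rear = race_sketch(impulse, silence, delay_samples=4, attenuation_db=3.0)
```

Running the front and rear pairs with different parameters mirrors the advice to set the rear a bit differently from the front.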

Additional Reading

Additional reading on these topics can be found at the Ambiophonics web site:


Glasgal’s Tonmeister Symposium paper (2005) can be found under “Technical Papers.”

September, 2008
