Importance of Direct Sound
Sound engineers need (almost) no convincing about the
importance of direct sound.
The sound image in most popular recordings is built
from close-miked sources.
Reverberation is added later as an enhancement, the
sauce that holds the sound together.
There is a fiction among classical engineers that the
hauptmicrophone picks up the direct sound.
But in practice the image is created by accents, and
the main mike adds some early reflections.
My research into spatial acoustics started with sound
recording, and the hall research of Michael Barron.
Barron started with direct sound in front of the listener and added a single
reflection at 40 degrees from the front. The diagram plots the spatial
impression that resulted as a function of time delay and the ratio of the
reflected sound to the direct sound (R/D). The range is -25dB to +5dB.
My research
I used a similar setup, but employed six or more
reflections at various angles and delays.
I obtained similar results and found it was the total energy of
the delays that mattered, not the amplitude of the individual
reflections.
The theory of how the ear detects such reflections in the
presence of music followed, with many interesting results.
But the range of the energy of the reflections was the same as
Barron's: about -25dB to +5dB with respect to the direct.
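The statement that total energy matters, not individual amplitude, can be illustrated with a short sketch. It is only the standard energy (power) sum of levels expressed in dB relative to the direct sound; the function name and the example levels are hypothetical and are not taken from the experiments.

```python
import math

def total_reflection_level_db(levels_db):
    """Energy sum of reflection levels, each given in dB relative to the direct sound."""
    # Convert each level to a linear energy ratio, add the energies, convert back to dB.
    total_energy = sum(10.0 ** (level / 10.0) for level in levels_db)
    return 10.0 * math.log10(total_energy)

# Hypothetical example: six equal reflections, each 12.8 dB below the direct sound.
six_equal = [-12.8] * 6
print(round(total_reflection_level_db(six_equal), 1))
```

Six equal reflections add about 7.8dB to the level of one, so six weak reflections can together reach the same total as one much stronger reflection.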
Dry speech
Note the sound is uncomfortably close
Same but with the reflections delayed 20ms at -5dB. (+5dB D/R)
Note also that with the additional delay the reflections begin to be heard as discrete echoes.
But the apparent distance remains the same.
Same but with the reflections delayed 50ms at -3dB (+3dB D/R)
Now the sound is becoming garbled. These reflections are undesirable!
If the speech were faster it would be difficult to understand.
Note the Late reflections are at least 7dB more audible than the early ones!!!
And the sense of hall is all in the late reflections!!!
Concert Halls
Barron was interested in halls, not recordings!
The critical distance in Boston Symphony Hall (BSH) is ~
At this distance the D/R is 0dB. Almost all the listeners are
beyond this distance. The average D/R is below -8dB.
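A minimal sketch of the classical relation behind these numbers, assuming an idealized diffuse field: the direct sound falls 6dB per doubling of distance, the reverberant level is constant through the hall, and the two are equal at the critical distance. Distances are expressed in multiples of the critical distance, so no particular value for BSH is assumed. The function names are hypothetical.

```python
import math

def direct_to_reverberant_db(distance, critical_distance):
    """Steady-state D/R in dB at a given distance (same units as critical_distance)."""
    # Direct energy falls as 1/r^2; reverberant energy is taken as constant.
    return 20.0 * math.log10(critical_distance / distance)

def distance_for_dr(dr_db, critical_distance):
    """Distance at which the steady-state D/R equals dr_db."""
    return critical_distance * 10.0 ** (-dr_db / 20.0)

# With the critical distance as the unit of length:
print(direct_to_reverberant_db(1.0, 1.0))      # D/R at the critical distance
print(round(distance_for_dr(-8.0, 1.0), 2))    # distance where D/R is -8 dB
```

Under these assumptions a D/R of -8dB corresponds to a seat at roughly two and a half times the critical distance.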
This theater suffers greatly from having the old Bolshoi next door!
Main Points
The ability to hear the direct sound (the sound energy that travels
to the listener without reflecting) is a vital component of the sound
quality in a great hall.
The ability to separately perceive the direct sound when the D/R is less
than 0dB requires time. There must be sufficient time between the
arrival of the direct sound and the build-up of the reverberation.
Main Points 2
Current acoustic measures ignore both the D/R and the time gap
between the direct (the first wavefront) and the reverberation.
RT, C80, and EDT all ignore the strength of the direct sound, and the
effects of musical style on the audibility of the D/R.
The strength of the reverberation depends on the length of a note compared
to the reverberation time. Short sounds do not excite a large hall, and the
D/R in practice can be much higher than expected from conventional theory.
There need to be gaps between notes sufficiently long that the reverberance
decays below the level of the new direct sound.
The direct sound from notes that differ in pitch by at least a musical fifth is
easier to distinguish.
The upward dashed curve shows the theoretical exponential rise of reverberant
energy from a continuous source. The seat position in the model has been chosen
so that the D/R is -10dB for a continuous note.
The upward solid line shows the actual build-up, and the downward solid line shows
the decay from a shorter note, here a 100ms excitation. Note the actual D/R for the
short note is only about -6dB.
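The effect of note length can be sketched with the idealized exponential build-up alone. This is an assumption-laden simplification, not the hall model behind the figure: the reverberant energy is taken to rise as 1 - exp(-13.8 t / RT), the 1.9s reverberation time in the example is an assumed value, and all function names are hypothetical. The last function estimates how long a gap must be for the reverberation to decay by a given number of dB.

```python
import math

# Energy decays by 60 dB in one RT, so the exponent is 6*ln(10) ~ 13.8 per RT.
DECAY_CONSTANT = 6.0 * math.log(10.0)

def reverb_buildup_fraction(note_s, rt_s):
    """Fraction of the steady-state reverberant energy reached after note_s seconds."""
    return 1.0 - math.exp(-DECAY_CONSTANT * note_s / rt_s)

def short_note_dr_db(steady_dr_db, note_s, rt_s):
    """D/R at the end of a note of length note_s, given the D/R for a continuous note."""
    return steady_dr_db - 10.0 * math.log10(reverb_buildup_fraction(note_s, rt_s))

def gap_for_decay_s(drop_db, rt_s):
    """Time for the reverberant level to fall by drop_db (linear decay in dB)."""
    return rt_s * drop_db / 60.0

# Assumed values: continuous-note D/R of -10 dB, 100 ms note, RT of 1.9 s.
print(round(short_note_dr_db(-10.0, 0.100, 1.9), 1))
# Same seat and note in a hall with half the RT.
print(round(short_note_dr_db(-10.0, 0.100, 0.95), 1))
```

With these assumptions a 100ms note reaches only about half of the steady-state reverberant energy, so the D/R is several dB higher than the continuous-note value. Halving the RT lets the reverberation build up faster, which lowers the D/R for the same note.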
T10, the time for the reverberation to rise to 1/10 the final energy, is less in Boston
than in Amsterdam, but after about 50ms the curves are nearly identical. (Without
the direct sound they sound identical.)
Smaller halls
What if we build a hall with the shape of
BSH, but half the size?
The new hall will hold about 600 seats.
The RT will be half, or about 1 second.
We would expect the average D/R to be the
same. Is it? How does the new hall sound?
If the client specifies a 1.7s RT, will this make
the new hall better, or worse?
Half-Size Boston
The gap between the direct and the
reverberation and the RT have become
half as long.
Additionally, in spite of the shorter RT,
the D/R has decreased from about -6dB in
the large BSH model to about -8.5dB in
the half-size model.
This is because the reverberation
builds up more quickly and more strongly in the
smaller hall.
The direct sound, which was distinct in more than 50% of the seats in the large hall
will be audible in fewer than 30% of the seats in the small hall.
If the client insists on increasing the RT by reducing absorption, the D/R will be
further reduced, unless the hall shape is changed to increase the cubic volume.
The client and the architects expect the new hall to sound like BSH, but they, and
the audience, will be disappointed. As Leo Beranek said about the Berlin
Philharmonie: "They can always sell the bad seats to tourists."
Williams hall, in the same building, has ~350 seats in a square plan
with a high ceiling.
Once again the sound is clear and reverberant in most, if not all, of the seats.
The audience usually sits where the
orchestra is rehearsing in this
The square plan keeps the average
seating distance low.
The high ceiling and high single
balcony provide a long RT without
a high reverberant level.
The absorbent stage reduces the
reverberant level while keeping the
direct sound strong.
Note the coffered ceiling similar to
A few early lateral reflections can help blend together the orchestra
image, but they do not provide significant envelopment.
When the direct sound is adequate for localization, and there is lots of
late reverberation, the spatial perception of early reflections is inhibited.
You can often make the reflections in the time range of 20ms to 80ms
monaural with no change in sound.
This will provide loudness and clarity to the largest number of seats.
Threshold Data
Onset and azimuth thresholds allow hall sound to be
predicted from models!
1. Thresholds for azimuth detection.
Azimuth experiments are simple, and repeatable.
An important caveat!
The author has found that in a concert (with occasional visual input)
instruments (such as a string quartet) are perceived as clearly
localized and spread.
When I record the sound with probes at my own eardrums, and play it
back through calibrated earphones, the sound seems highly accurate,
but localization often disappears!
Without visual cues, when the d/r is below threshold, the individual
instruments are localized and spread when they play solo, but collapse to
the center when they play together.
My brain will not allow me to detect this collapse when I am in the concert
hall, even if I close my eyes most of the time!
With eyes closed it is more difficult to separate the sounds of the
individuals, such as the second violin and the viola. This difficulty persists
in the binaural recording.
Light models
I ran across these pictures while
cleaning out my office. The top
one is a too-simple model of the
Philadelphia Academy of Music.
The bottom is intended to be
BSH, but with a single balcony.
I abandoned light modeling
because it does NOT provide any
information about the time delay
gap nor information about the
effects of note length on D/R.
But it DOES provide information
about the D/R, the total
reverberant energy compared to
the direct. And very complex hall
shapes can be quickly modeled.
Modeling T10
Classical acoustics predicts a starting value for d/r. We
can make a chart of d/r values in all the seats of a
proposed hall.
T10 does not follow easily from classical acoustics, but
can be predicted with fair accuracy with a simple
computer model of the hall. Just the basic hall
dimensions are needed.
From this data we can predict the localizability of sound
in all the seats.
The results can be surprising!
Auralization from these models (given accurate HRTFs) can be
Onset Enhancement
When d/r is low, a small amount
of direct sound sharpens the
perceived onset of sounds, so
that a tone with a slow rise like
a cello is perceived more like a
The threshold for this effect is
lower than for azimuth detection,
and surprisingly, the highest
threshold is for the 1kHz band.
This result is mysterious.
Boston is blessed with two 1200-seat halls with the third shape, Jordan Hall at
New England Conservatory, and Sanders Theater at Harvard. The sound for
chamber music and small orchestras is fantastic. RT ~ 1.4 to 1.5 seconds.
Clarity is very high (you can hear every note) and envelopment is good.
Binaural Measures
The author has been recording
performances binaurally for years.
Current technology uses probe
microphones at the eardrums.
We can use these recordings to
make objective measurements of
halls and operas.
The methods use a hearing model where the binaural signal is first filtered into
1/3 octave bands, and then is rectified and filtered.
For measures of localization a running IACC is calculated in 10ms overlapping
windows. The maximum values of 1/(1-IACC) are then plotted as a surface
over time and frequency band.
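The running-IACC step can be sketched as follows for a single band. This is a simplified illustration, not the author's implementation: it omits the 1/3 octave filtering, rectification, and smoothing of the hearing model, it assumes the usual IACC definition (maximum of the normalized interaural cross-correlation over lags within ±1ms), it uses a 50% window overlap, and it caps IACC at 0.999 so that 1/(1-IACC) stays finite. All names and defaults are hypothetical.

```python
import math

def iacc(left, right, max_lag):
    """Maximum normalized cross-correlation of two equal-length windows over +/- max_lag samples."""
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if norm == 0.0:
        return 0.0  # a silent channel carries no correlation information
    n = len(left)
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        best = max(best, abs(acc) / norm)
    return best

def running_localization(left, right, fs, window_s=0.010, hop_s=0.005,
                         max_lag_s=0.001, cap=0.999):
    """Return 1/(1-IACC) for each overlapping window of a binaural signal pair."""
    win = int(round(window_s * fs))
    hop = int(round(hop_s * fs))
    max_lag = int(round(max_lag_s * fs))
    values = []
    for start in range(0, len(left) - win + 1, hop):
        c = iacc(left[start:start + win], right[start:start + win], max_lag)
        values.append(1.0 / (1.0 - min(c, cap)))  # cap is an assumption of this sketch
    return values

# Hypothetical test signal: the same 500 Hz tone at both ears (fully correlated).
fs = 8000
tone = [math.sin(2.0 * math.pi * 500.0 * n / fs) for n in range(400)]
surface_row = running_localization(tone, tone, fs)
print(len(surface_row), min(surface_row) > 900.0)
```

Running this per band and stacking the rows would give a surface over time and frequency band. High values mark windows in which the two ear signals are strongly correlated, as at a note onset dominated by direct sound.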
The figure shows the number of
times per second that a solo violin
can be localized from row 4 of a
small shoebox hall (~500 seats)
near Helsinki.
It also shows the perceived
azimuth of the violin.
As can be seen, the localization
achieved at the onsets of notes
is quite good, and the azimuth,
~10 degrees to the left of center,
is accurate.
Localization, Surface 1
Here we plot the same data
for the violin as a function
of (inverse) azimuth, and
the third octave frequency band.
As can be seen, for this
instrument the principal
localization components
come at about 1300Hz.
Interestingly, human ability
to detect azimuth, as
shown in the threshold
data, is maximum at this frequency.
Localization, Surface 2
Here we plot 1/(1-IACC) as
a function of time and third
octave band.
Note that the IACC peaks at
the onset of notes can have
quite high values for a brief time.
This happens when there is
sufficient delay between the
direct and the reverberation,
and sufficient D/R.
Another singer
The king (in Verdi's Don Carlos) on the
other hand, in his wonderful solo aria,
was not able to reach the third balcony
with the same strength.
Like the localization graph shown
previously, this graph seems to be
mostly noise.
The fundamental pitches are not well
defined. The singer seemed muddy
and far away.
His aria can be heart-rending but
here it was somewhat muted by the
acoustics. We were watching the king
feel powerless and forlorn. We were
not involved.
Current hall measurements ignore both the D/R and the time gap
between direct and reverberation.
Better measures exist. They must be used if the current practice of hall
design is to be improved.