Book Review: David HURON. Voice Leading: The Science Behind A Musical Art


See discussions, stats, and author profiles for this publication at: https://www.researchgate.net/publication/317035388

Book Review: David HURON. Voice Leading: The Science behind a Musical Art.

Article  in  Music Analysis · June 2017


1 author:

Fabian C. Moss
École Polytechnique Fédérale de Lausanne

All content following this page was uploaded by Fabian C. Moss on 26 November 2017.



David HURON. Voice Leading: The Science behind a Musical Art

Cambridge, MA: MIT Press, 2016. ISBN 9780262034852. 272 pp. $38.

___________________________________________________________________________

David Huron’s book on voice leading is the state-of-the-art account of the psychological

principles that govern the perception of individual voices in a piece of music. It is not yet

another manual for part writing in Bach chorale style. Rather, it is the culmination of

decades of scientific research, a great deal of it done by Huron himself,1 showing how much

of the traditional canon of voice-leading rules can indeed be explained by empirical research

on auditory perception. It bears mentioning that Huron understands voice leading as a rather

broad concept and loosely defines it as “the art of combining concurrent musical lines or

melodies” (1). The fundamental thesis of Huron’s book states that voice-leading rules are

consequences of psychological principles determining how we perceive auditory streams.

These principles may serve as guidelines for any composer pursuing the goal of maintaining a

coherent auditory scene—that is, one with clearly distinguishable individual voices at any

given moment. Of course, Huron is aware that composers may or may not choose to follow

the strategies that benefit auditory scene analysis, much as a painter might or might not

follow the rules of perspective. But taking into account the multitude of all potential

Financial support for the author has been provided by the Zukunftskonzept at TU Dresden

(ZUK 64), funded by the Exzellenzinitiative of the Deutsche Forschungsgemeinschaft (DFG).


1
E.g., David Huron, “Tone and Voice: A Derivation of the Rules of Voice-Leading from

Perceptual Principles,” Music Perception 19/1 (2001), 1–64,

https://doi.org/10.1525/mp.2001.19.1.1.
compositional goals (for example, as influenced by social, economic, or aesthetic factors)

would go beyond the confines of a single coherent theory. Consequently, Voice Leading

restricts itself to one of them: the goal of creating coherent auditory scenes. In doing so,

Huron grounds his approach in Albert Bregman’s psychoacoustical theory of auditory scene

analysis.2 He aims to “[p]rovide a scientific explanation for the core part-writing rules” as

well as “a science-based account of musical texture” (vii). Both objectives are

accomplished with detailed but very accessible explanations of a variety of intricate

psychological findings. This book thus bridges the still large gap between music theory and

music psychology by emphasizing that a better understanding of how we listen can lead to a

better understanding of why music is created in a particular way (i.e., why certain voice-

leading rules pertain to the canon, or why certain composers chose to write in a particular

style).3

Anticipating scholarly criticism regarding the apparent arbitrariness of canonical

voice-leading rules, Huron explicitly stresses that his book should by no means be understood

as a set of constraints imposed on composers. Hence, his approach is fundamentally not

prescriptive but is meant as a “roadmap that describes what happens when a musician

chooses one path rather than another” (2) in order to accomplish a specific musical goal. This

argument leaves the whole issue of the composer’s free will untouched, but imposes high

demands on his or her rationality. Importantly, this approach is not restricted to the canon of

the common-practice period but extends to other styles and genres. By showing how each of

2
Albert Bregman, Auditory Scene Analysis: The Perceptual Organization of Sound

(Cambridge, MA: MIT Press, 1990).


3
Eric Clarke, “Mind the Gap: Formal Structures and Psychological Processes in Music,”

Contemporary Music Review 3/1 (1989), 1–13, https://doi.org/10.1080/07494468900640021.


the voice-leading rules in Huron’s renewed canon derives from optimal ways of achieving

this musical goal, they become meaningful, reasonable, and open to scrutiny.

Voice Leading is divided into seventeen chapters. All of the core chapters conclude

with a reprise that summarizes the essential points and integrates the findings into the

overarching context.

Chapters 2 and 3 lay out the basic music-theoretical and psychological framework.

After pointing out the ubiquity of disagreement over which voice-leading rules the canon

should include, chapter 2 (“The Canon”) introduces a preliminary set of sixteen rules

describing Baroque chorale-style voice leading as it is probably taught in most introductory

music theory classes nowadays. Chapter 3, “Sources and Images,” introduces basic concepts

from auditory perception. Huron describes in a concise and very accessible manner the

phenomenon of auditory scene analysis: how sounds from acoustic sources, made up of

elementary partial tones, find their way through the inner ear into the brain to form auditory

images (e.g., the percept of a piano tone) or streams (e.g., the percept of an ascending major

scale played by a harp), drawing a sharp line between the acoustic (physical) and the auditory

(physiological and psychological) realms. The listener’s reward for successful auditory scene

analysis is a subjective feeling of pleasure. “Successful” in this sense does not necessarily

mean that the auditory scene is in one-to-one correspondence with the acoustic reality, but

rather that the evoked auditory images are consistent with one another (leaving no room for

ambiguities). Relating successful auditory scene analysis to the experience of pleasure lies at

the heart of Huron’s explanation of voice-leading rules: they increase pleasure by facilitating

auditory scene analysis in a variety of ways.

Chapters 4, 5, and 6 provide a more detailed account of how auditory scenes are

constructed from acoustic sources. Chapter 4, “Principles of Image Formation,” introduces

two principles. The first is harmonic fusion, which describes the mental phenomenon whereby a single
auditory image is formed from multiple sound sources. Harmonic fusion depends mostly on

the frequency ratio of the respective fundamentals of the sounds: tones in harmonic, or

consonant, distance are more prone to fusion. The second, the toneness principle, describes

the degree to which a sound sounds “like a tone” (as opposed to noise) and has a clear pitch.

It turns out that toneness for complex tones (e.g., as produced by musical instruments) is high

for pitches between E2 and G5 (roughly the maximal ambitus of a chorale) and optimal

around D4, located roughly at the “middle” of the staff. Chapter 5 covers the topic of

auditory masking, which describes the potential interference of several sound sources with

one another. Masking occurs when the partial frequencies of two or more tones are too close

to each other (less than a critical bandwidth) to be resolved into separate auditory images.

The effect is that one tone (partially) “masks” another. Hence, a composer who wants to

convey several independent voices must design them in a way that avoids auditory masking,

following the minimum masking principle. Artifacts of this strategy are the voicing

techniques in different registers (true to the motto “the lower you get, the more you spread”),

and the fact that most listeners can focus more easily on the melody when it appears in the

soprano because in general, for voices similar in loudness and timbre, higher-pitched voices

are less affected by masking from lower-pitched voices than vice versa (the high-voice

superiority effect). Since the degree of masking correlates with the difficulty of distinguishing

among several distinct voices, masking creates a “feeling of irritation or annoyance” (52),

which in turn might lead to some form of behavioral reaction (leaving the room, asking

somebody to speak louder, focusing attention on a particular instrument) to improve hearing.4

Huron favors the term “(sensory) irritation” over “sensory dissonance,” commonly used in

psychoacoustical research, to emphasize that, while irritation carries a negative connotation

4
Note that masking can occur not only in musical contexts but in all hearing circumstances.
and describes phenomena leading to avoidance behavior, dissonance in music is not a

criterion for quality: “The best music is not music that avoids dissonance” (56). In other

words, there is a lot of good music that is quite dissonant. Having considered so far two or

more tones sounding simultaneously at a given moment, Huron turns in chapter 6

(“Connecting the Dots”) to the succession of such musical events in time. Mentally

connecting multiple events to form a single stream is facilitated when the events are either

contiguous or separated by only a short duration of about 800 milliseconds (the continuity

principle). The continuity principle is at work, for instance, in latent polyphony: as is well

known, a single monophonic instrument (such as a human voice, a violin, or a cello) can

evoke the impression of several distinct lines. This effect depends on both the relative

duration and the pitch distance between the virtual voices. If both factors are below a certain

threshold (“trill boundary” [72]), the result is the perception of a single stream. If both are

above another threshold (“yodel boundary” [72]), the result is two or more separate streams.

This is reflected, for example, in the theoretical step/leap distinction and also in the fact that

in many of the world’s cultures, melodies are usually connected by steps5—that is, intervals

below the trill threshold. Huron calls this observation the pitch proximity principle. Similar

or parallel motion also contributes to grouping tones together. Importantly, this holds not

only for complex tones (e.g., as entities in contrapuntal contexts), but also for all the partials

of each of the moving complex tones. Tones group together most strongly when they move

in strict parallel and are harmonically related (e.g., at the unison, octave, or fifth). The

conjoint motion of two tones is called co-modulation and forms Huron’s sixth perceptual

principle. Co-modulation is strongest if the two voices move in parallel. Obviously, this goes

directly against the perception of independent voices. It is easy to see why this principle is at

5
Huron, “Tone and Voice,” 25.
the basis of the strong and long-standing advice to novice composers to avoid parallel octaves

and fifths.
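
As a rough illustration of how the advice against parallel octaves and fifths can be checked mechanically, the following Python sketch flags strictly parallel motion into perfect intervals between two voices. It is not taken from Huron's book: the function name `parallel_perfects`, the encoding of pitches as MIDI note numbers, and the sample voices are assumptions made here for illustration. Compound intervals are reduced to within an octave, and perfect fourths are deliberately left out.

```python
# Hypothetical helper (not from the book): flag parallel perfect intervals.
# Pitches are MIDI note numbers (60 = middle C), one entry per chord.

PERFECT_CLASSES = {0: "unison/octave", 7: "fifth"}  # interval size modulo 12


def parallel_perfects(upper, lower):
    """Return (index, label) for each move into a parallel perfect interval."""
    hits = []
    for i in range(1, min(len(upper), len(lower))):
        step_upper = upper[i] - upper[i - 1]
        step_lower = lower[i] - lower[i - 1]
        interval = (upper[i] - lower[i]) % 12
        # Strict parallel motion: both voices move by the same non-zero
        # amount, so the interval between them is carried over unchanged.
        if step_upper == step_lower != 0 and interval in PERFECT_CLASSES:
            hits.append((i, PERFECT_CLASSES[interval]))
    return hits


soprano = [67, 69, 71, 72]  # G4 A4 B4 C5 (made-up example)
bass = [60, 62, 55, 48]     # C4 D4 G3 C3 (made-up example)
print(parallel_perfects(soprano, bass))
```

Only the first move (G4 over C4 to A4 over D4) is reported, since both voices rise a whole tone while a fifth apart. Repeated notes are not counted as parallels because neither voice moves.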

The six fundamental perceptual principles shaping the formation of individual

auditory images introduced up to this point are thus harmonic fusion, toneness, minimum

masking, continuity, pitch proximity, and pitch co-modulation. Chapter 7 (“Preference

Rules”) builds on the established framework and re-evaluates the voice-leading rules as

preference rules to “clarify the logic, address the pertinent details, and make it easier to see

unanticipated repercussions” (87). The concept of preference rules follows the tradition of A

Generative Theory of Tonal Music and The Cognition of Basic Musical Structures,6 both of

which point out that these should be understood as guidelines rather than prescriptive

directives. The objective of Voice Leading is to show which auditory principles facilitate the

perception of individual voices in music, a central goal in many musical styles. Accordingly,

the preference rules that Huron introduces are empirically based means to achieve this goal

(G1), explicitly stated as follows:

G1. The goal of voice leading is to facilitate the listener’s mental construction of

coherent auditory scenes when listening to music. In practical terms, the goal of voice

leading is to create two or more concurrent yet perceptually distinct “parts,” “voices,”

or “textures.” (88)

Starting from this goal, Huron derives a set of twenty-three preference rules by successively

exploring the consequences of the six perceptual principles, reformulated as empirical

observations (E) and corollaries (C). The implications include well-known compositional

6
Fred Lerdahl and Ray Jackendoff, A Generative Theory of Tonal Music (Cambridge, MA:

MIT Press, 1983); David Temperley, The Cognition of Basic Musical Structures (Cambridge,

MA: MIT Press, 2001).


advice, such as the overall ambitus for part writing, directly derived from the toneness

principle mentioned above. They also contain less obvious preference rules such as the

“toneness rule,” which requires “[p]refer[ring] the use of harmonic complex tones—tones

that evoke clear pitch sensations” (89) (obviously, this rule is tacitly followed when writing

tonal music). Furthermore, they include new preference rules that, according to Huron,

go beyond the traditional canon, such as the “Oblique Preparation Rule,” meaning

that “[w]hen approaching unisons, octaves, fifteenths, twelfths, or fifths, it is preferable to

retain the same pitch in one of the voices (i.e. approach by oblique motion)” (94). Probably

the most famous voice-leading rule is the “Perfect Parallels Rule” (95), which Huron

elegantly deduces from the principles of harmonic fusion and pitch co-modulation. In each

case, Huron points out how the rule helps to achieve the above-stated goal. Again, he

emphasizes that one should be careful not to confuse this particular goal of voice leading with

the goals of a composer or a “universal musical imperative” (88). Obviously, composers

could seek to pursue very different, concurrent, or even contradictory aims.

Out of context, the enumeration of statements in this chapter might seem tedious and

mechanical, even arbitrary (the classic criticisms of all kinds of compositional rules). But

given the background of the empirical research provided in this book, the reader is easily

convinced that voice-leading rules were not randomly set up once upon a time by music

theorists but are indeed shaped by basic features of our auditory perception and hence

grounded in music cognition.

Chapter 8 (“Types of Part-Writing”) adds four more perceptual principles contributing

to the formation of separate auditory streams: onset asynchrony (as in polyphony), limited

density (the fact that it is hard even for professional musicians to identify more than three

concurrent voices), timbral differentiation (it is easier to disentangle a violin and a trumpet

than two flutes), and finally, source location (spatial cues for voice segregation). These
“auxiliary principles” partially account for stylistic differences (e.g., homophony vs.

polyphony) and individual choices by composers. Consequently, this approach offers a useful

framework for comparing different musical styles, including popular and non-Western styles.

Following the more general observations of the previous sections, chapters 9–12 go

into more detail and consider specific functions of single tones. Chapter 9 (“Embellishing

Tones”) is dedicated to the function of embellishing tones such as suspensions, passing tones,

neighbor notes, and pedal tones, and shows how they contribute to the individuality of

streams, for instance by drawing attention to a middle voice. This is summarized in the

attention principle and three more preference rules. Chapter 10 (“The Feeling of Leading”)

focuses on “tendency tones” (e.g., leading tones and chromatic alterations) and distinguishes

between perception-related bottom-up principles and expectation-based top-down processes,7

namely veridical expectations based on the knowledge of specific pieces, schematic

expectations resulting from the familiarity with particular musical styles, and dynamic-

adaptive expectations related to the immediate musical context as retained in short-term

memory (e.g., expecting that a repetition will be a copy of what we heard earlier, or

recognizing a recurring fugue subject). In Huron’s terminology, “part-writing” is mostly

determined by bottom-up perceptual principles, whereas “voice leading” also takes advantage

of top-down expectations, which enable listeners to anticipate, for instance, how a melodic

line might continue. In short, “predictability transforms good part-writing into good voice

leading” (144). Six preference rules are derived from this expectation principle. Chapter 11

(“Chordal-Tone Doubling”) comes to the conclusion that, despite disagreement among

theorists, there is not much to worry about from a perceptual perspective, since most

7
For more detail, see David Huron, Sweet Anticipation: Music and the Psychology of

Expectation (Cambridge, MA: MIT Press, 2006).


recommendations for pitch doubling lead to the same result: they reduce potential masking,

pitch co-modulation, and harmonic fusion (mostly by avoiding unwarranted parallels).

Chapter 12 (“Direct Intervals Revisited”) focuses on the recommendation to avoid movement

into perfect consonances by similar motion (hidden parallels), pointing out the underlying

principles (harmonic fusion, pitch co-modulation, and pitch proximity). Huron explains how

abiding by these principles ensures stream segregation and thus the individuation of voices.

In brief, rules about embellishing and leading tones, as well as chordal-tone doubling and

motion into perfect consonances, can be derived from combinations of only a few basic

perceptual principles.

Having considered the empirical findings on very specific aspects of the treatment of

individual tones in voice leading, chapters 13 and 14 zoom out again and reflect upon entire

streams and musical scenes. Chapter 13 (“Hierarchical Streams”) introduces a hierarchical,

dendrogram-like visualization, the so-called scene-analysis trees. These are meant to

depict how auditory images group together, from resolved partials to tones, chords, and entire

streams, to form coherent auditory scenes. The height of the respective branching refers to the

degree to which stream segregation is more synthetic (holistic) or analytic (individual).

Crucially, this representation only shows an auditory scene at a given moment in time and

does not capture the dynamic changes of stream formation and segregation over the course of

a piece. Chapter 14 (“Scene Setting”) gives examples of several listening situations and

illustrates them with associated scene-analysis trees (i.e., representations of the mental

hierarchical relationships among auditory images). More precisely, they allow for the

description of “how that [acoustic] scene might be parsed by listeners into a corresponding

auditory scene” (182). Parsing—inferring an underlying structure from a given input (in this

case the sequential presentation of acoustic scenes)—is defined by the principles of auditory

scene analysis, as introduced in the preceding chapters. Conversely, these principles should
be used generatively by composers who wish to create coherent musical scenes in virtually

any genre, since the design of musical scenes is identified as “[o]ne of the main

compositional tasks in any musical style” (178). Huron recapitulates that

voice leading provides the toolkit for designing auditory scenes. The rules of voice

leading aren’t simply tools for creating polyphonic music or Baroque-style chorales;

they are tools that allow composers to construct and control any kind of musical

texture. Voice leading truly is the art of combining concurrent musical lines. (179)

So why is it that Baroque style four-part writing is usually at the core of music theory

courses? Huron conjectures that this is not just a coincidence or a matter of arbitrary

historical traditions. He hypothesizes that it is a consequence of the fact that “[n]o other

historical practice conforms so closely to modern perceptual and cognitive research regarding

how independent sounds are heard in complex acoustic scenes” (181). With this chapter,

Huron concludes the core of the book, having built up a coherent theory of voice leading

starting from elementary observations about how pure and complex tones evoke auditory

images and how they can be combined to form a number of distinguishable auditory streams.

The principles and rules developed along the way specify how composers can achieve the

goal of maintaining coherent auditory scenes—or avoid it, if they so choose.

The discussion is put into a broader context in chapters 15 and 16. A short digression

to the topic of how learning and culture shape our auditory perception is made in chapter 15

(“The Cultural Connection”). By referring to the discussion about universals and innateness,

Huron advocates the view that “learning is the sole source for top-down auditory scene

analysis” (193) and also plays an important role for certain bottom-up processes, thus

emphasizing the importance of the cultural environment for the perception and appreciation

of musical hierarchies and other listening situations such as speech, or rhythmic grouping in

language. In the last chapter (“Ear Teasers”) Huron addresses the question of why the
perceptual individualization of multiple streams should be a musical goal in the first place—

in other words, why voice leading is a desirable musical feature leading to pleasure at all.

Huron argues that “the rules [of voice leading] help to reduce perceptual ambiguity and so

facilitate auditory scene analysis” (197) and “that the brain rewards itself for successfully

parsing auditory scenes and that the evoked pleasure is proportional to the scene’s

complexity” (204). Accordingly, an “ear teaser” is “any complex musical texture or acoustic

scene that nevertheless affords clear scene-analysis parsing, with a consequent pleasurable

effect for listeners who successfully resolve the sensory challenge” (198). While the theory

developed so far accounts for how musical scenes are set, Huron concedes that there is to

date almost no data on how these scenes change over time in a listener’s mind, so this

remains an important task for further research.

The conclusion (chapter 17) summarizes the twelve perceptual principles introduced

in the book (thus incidentally providing an excellent introduction) and revisits the two central

concepts of pleasure (as a reward for successful auditory scene analysis) and scene setting (as

a fundamental task for a composer in any style), both of which are modulated by the

workings of the principles that underlie voice leading. He restates the renewed canon of

thirty-seven voice-leading rules and concludes with some remarks on how performers can

take advantage of the research presented in Voice Leading. The book closes with an

afterword giving pedagogical advice on how its content might be integrated into the music

theory curriculum. Somewhat surprisingly, Huron discourages teachers from using his book

as an introduction to the topic of voice leading, arguing that the principles developed in his

book and the psychological underpinnings might be too complex for a novice who is still

struggling with the intricacies of basic part writing. However, to the more advanced student,

the empirical accounts of voice-leading rules might provide valuable insights about their

underlying causes.
The greatest strengths of Voice Leading are its extensiveness and its

comprehensibility. Covering the whole range from bottom-up sensory principles (such as the

perception of partials) to top-down cognitive processes (such as several kinds of

expectations), Huron explains complicated psychoacoustical processes in a very accessible

manner without delving into distracting particularities. The relevant phenomena are described

in an understandable language, often supported by illustrative diagrams and graphs, as well as

explanations in the form of metaphorical everyday situations. In the same spirit, Huron

refrains from stringing together lists of numbers or formulae. The reader interested in more

detailed accounts can always refer to the extensive references, which cover virtually all

relevant empirical literature, from late nineteenth-century to present-day publications. The

book’s index also allows one to quickly find relevant passages. Voice Leading is thus ideally

suited for a broad audience lacking prior knowledge of empirical research on music

perception, a convenient read for musicians, music theory scholars and teachers, and a

general audience interested in music’s psychological background. Readers are introduced to

some basic scientific terminology, such as the difference between causation and correlation,

parsing and generation, top-down and bottom-up approaches, and statistical techniques such

as multiple regression. It might, however, come as a surprise to readers with a background in

music theory that, compared with traditional counterpoint treatises, Voice Leading contains

almost no examples from the musical repertoire. The first score to appear is an excerpt from

Ravel’s Bolero, and a few others follow; but the literate reader will have no difficulty

transferring the findings to other examples. This choice serves Huron’s goal of showing that the

voice-leading principles he explains extend far beyond a narrow corpus of Renaissance polyphonic

writing, Baroque four-part chorales, and the like.

A few minor imperfections need mentioning. Although Huron’s description of the

rules of voice leading is based on general auditory principles, the plausibility of the goal of
voice leading (creating coherent auditory scenes) is largely accounted for by his own

extensive corpus studies of the music of Bach. Naturally, this calls for more extensive work

on Bach’s precursors and contemporaries, as well as later composers and different styles, to

trace diachronic changes in the use of voice leading and, from these, to infer

plausible compositional goals.

The climax of Huron’s endeavor is the hierarchical representation of auditory scenes

in scene-analysis trees (chapters 13 and 14). Scene-analysis trees are an instance of

hierarchical clustering, of how elements clump together to form larger units according to

certain principles (in this case the principles underlying the formation of auditory streams).

The distinction between analytic and synthetic hearing related to the branch length is

conceptually evident but somewhat imprecise, since this distance is not quantified as in a real

dendrogram plot. This approach deserves more attention in subsequent research and ought to

be formalized more accurately. Moreover, these representations are inherently unsuited to

modeling dynamic processes in time; they depict momentary analyses instead. Owing to an

unfortunate gap in the research literature, the temporal dimension is still opaque, but as

Huron writes, “[T]he study of voice leading remains a work in progress rather than a finished

opus. With future research, the interpretations I have offered in this book will be corrected,

augmented, or replaced” (214), and new forms of representation might turn out to be more

convenient. In summarizing the relevant empirical research and formulating testable

hypotheses, Huron explicitly encourages scholars to continue and extend the scientific

investigation of voice leading.
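
To make concrete what a quantified branch height could look like, here is a minimal Python sketch of single-linkage hierarchical clustering, in which every merge records the distance at which it happens. This is not Huron's procedure: using pitch distance in semitones as the only grouping criterion, the function name `single_linkage`, and the sample pitches are simplifying assumptions made here for illustration. A real scene-analysis tree would have to combine several perceptual principles.

```python
def single_linkage(pitches):
    """Cluster 1-D pitches bottom-up and return the merges with their heights.

    Each merge is (members_a, members_b, height), where height is the
    smallest pitch distance (in semitones) between the two groups.
    """
    clusters = [(p,) for p in sorted(pitches)]
    merges = []
    while len(clusters) > 1:
        # In sorted 1-D data the two closest clusters are always neighbours,
        # and their single-linkage distance is the gap between their edges.
        gaps = [clusters[i + 1][0] - clusters[i][-1]
                for i in range(len(clusters) - 1)]
        i = gaps.index(min(gaps))
        merges.append((clusters[i], clusters[i + 1], gaps[i]))
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return merges


# Hypothetical sonority: C4, D4, E4 in a low group and E5, F5 in a high group.
for left, right, height in single_linkage([60, 62, 64, 76, 77]):
    print(left, "+", right, "at height", height)
```

In this toy case the two registral groups form at small heights and are joined only at a height of twelve semitones, which is the kind of numerical distance that a conventional dendrogram plots on its vertical axis.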

It is worthwhile to come back to Huron’s swift rejection of chordal-tone doubling

rules in chapter 11. As already mentioned, he argues that avoiding chordal-tone doubling

(especially the doubling of the leading tone) mainly helps to prevent parallel octaves. His
argument, though, relies on corpus studies of works by Bach, Haydn, and Mozart,8 and,

implicitly, on the psychoacoustical definition of an octave based on fundamental frequency

ratios, rather than the broader theoretical conception of an octave as an interval spanning

seven scale steps, no matter its specific size and acoustical realization. Doubtless, traditional

voice-leading rules use the concept of the octave ambiguously and rely on both definitions,

which makes it plausible that they also were formulated to prevent other consequences of

tone doubling than octave parallels. In modal Renaissance music, in particular, doubling can

lead to cross-relations because of musica ficta, where the direction of the voice determines its

realization as natural or altered.9 As an example, take the extreme case of the end of Thomas

Tallis’s hymn O nata lux (Example 1), where the two tones of a cross-relation sound

simultaneously, producing an augmented octave when the sharpened F in the soprano

clausula coincides briefly with the F♮ in the Phrygian tenor clausula (a mi contra fa

situation). This sharply dissonant vertical major-minor sonority is inharmonic (in Huron’s

terminology, which is related to the configuration of the partial tones) and momentarily

suspends the harmonic fusion and toneness principles. Subsequently, instead of resolving the

dissonance into a consonance, the F in the tenor descends to E♭, creating a minor second

in the lower register. According to the minimum masking principle, they should

8
David Huron, “Chordal-Tone Doubling and the Enhancement of Key Perception,”

Psychomusicology 12 (1993), 73–83, https://doi.org/10.1037/h0094115; and Bret Aarden and

Paul T. von Hippel, “Rules of Chord-Tone Doubling (and Spacing): Which Ones Do We

Need?” Music Theory Online 10/2 (2004),

http://www.mtosmt.org/issues/mto.04.10.2/mto.04.10.2.aarden_hippel.html.
9
See David Trendell, “After Josquin,” Early Music 35/1 (2007), 139–41,

https://doi.org/10.1093/em/cal120.
approximately form at least a perfect fourth to exceed the critical bandwidth and to be clearly

separated by the cochlea into two distinct tones (43, Figure 5.2).

Example 1: Thomas Tallis, O nata lux, end of hymn.
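
The dependence of masking on register can be sketched numerically. The Python example below uses the Glasberg and Moore equivalent rectangular bandwidth (ERB) formula as a stand-in for the critical bandwidth; this choice is an assumption made here, not the measure behind Huron's Figure 5.2, so the exact thresholds will differ. The function names and test pitches are hypothetical, and only the fundamentals of two pure tones are compared, ignoring the partials of complex tones.

```python
def erb_hz(freq_hz):
    """Glasberg-Moore equivalent rectangular bandwidth (in Hz) at freq_hz."""
    return 24.7 * (4.37 * freq_hz / 1000.0 + 1.0)


def within_one_band(f1, f2):
    """True if two pure tones lie closer than one ERB at their mean frequency."""
    centre = (f1 + f2) / 2.0
    return abs(f1 - f2) < erb_hz(centre)


def transpose(freq_hz, semitones):
    """Equal-tempered transposition of a frequency by a number of semitones."""
    return freq_hz * 2.0 ** (semitones / 12.0)


# Compare the same intervals in a low and a middle register (made-up pitches).
for label, root in [("A2", 110.0), ("A4", 440.0)]:
    for name, size in [("minor second", 1), ("major third", 4)]:
        crowded = within_one_band(root, transpose(root, size))
        status = "within" if crowded else "outside"
        print(f"{name} above {label}: {status} one ERB")
```

Under these assumptions a major third above A2 still falls inside one band, whereas the same interval above A4 does not, which agrees in spirit with the motto quoted earlier: “the lower you get, the more you spread.”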

Admittedly, examples like this one are rare and really stand out, confirming that our

ears (more precisely, all components involved in the creation of auditory scenes) rely on

perceptual principles in the first place. Furthermore, the dissonant intervals are approached by

oblique motion, providing a listener with another cue for resolving the voices: dissonance

preparation, or onset asynchrony, Huron’s seventh principle. Presumably, the few examples

like this one have little impact on the overall statistics. But, more important, one

can clearly see how it is possible for a composer to abandon some principles for the sake of

others, if musical goals other than stream segregation are pursued. This is in total compliance

with Huron’s constant mantra that voice leading is but one of a virtually infinite array of

potential compositional goals, albeit an important one, since it neatly relates to biological and

psychological principles of how we make sense of the acoustic world surrounding us by

transforming it into an auditory scene.


The book is also a contribution to the more general discourse about the relationship

between music theory and music psychology. Owing to the growth of the discipline of music

cognition in the last decades, this vital issue has been addressed several times by such

researchers as Ray Jackendoff and Fred Lerdahl, Eric Clarke, Carol Krumhansl, David

Temperley, Geraint Wiggins, and Martin Rohrmeier.10 Oftentimes, scholars state that

music-theoretical descriptions and analyses rely in fact on implicit assumptions about the cognitive

states or capacities of a musical agent, be it a composer, a listener, or a performer, and

acknowledge the need for more interdisciplinary exchange, instead of accusing each other of

being reductionistic in one sense or another. With Voice Leading, Huron accepts this

challenge and shows that music theory and music psychology can indeed form a powerful

alliance. Furthermore, Huron supplements the existing literature on the matter, such as

pedagogical treatises11 on how to write good counterpoint, and mathematical/geometrical

10
Ray Jackendoff and Fred Lerdahl, “Generative Music Theory and Its Relation to

Psychology,” Journal of Music Theory 25/1 (1981), 45–90, https://doi.org/10.2307/843466;

Clarke, “Mind the Gap”; Carol L. Krumhansl, “Music Psychology and Music Theory:

Problems and Prospects,” Music Theory Spectrum 17/1 (1995), 53–80,

https://doi.org/10.2307/745764; David Temperley, “The Question of Purpose in Music

Theory: Description, Suggestion, and Explanation,” Current Musicology 66 (1999), 66–85;

Geraint A. Wiggins, “Music, Mind and Mathematics: Theory, Reality and Formality,”

Journal of Mathematics and Music 6/2 (2012), 111–23,

https://doi.org/10.1080/17459737.2012.694710; and Martin Rohrmeier, “Musical

Expectancy—Bridging Music Theory, Cognitive and Computational Approaches,” Zeitschrift

der Gesellschaft für Musiktheorie 10/2 (2013), 343–71.


approaches12 to the classification of possible kinds of voice leading with a data-based account

of what can actually be found in musical corpora.

The book focuses on how our perceptual predispositions shape the way we hear, and,

consequently, how music that aims to evoke pleasure by setting up a rich but not overly

complex auditory scene can achieve this goal by following the rules proposed by the author.

Therefore, Voice Leading does indeed succeed in explaining “the science behind a musical art.”

Reviewed by Fabian C. Moss

About the author

Fabian C. Moss studied music, mathematics, and educational studies at the University of

Cologne and the Hochschule für Musik und Tanz Köln, from which he also holds an MA in

musicology. Currently he is pursuing a Ph.D. in music cognition at the Technische

Universität Dresden. His research interests include the connection between music

theory and cognition, especially formal and computational approaches to chromatic harmony

and extended tonality.

11
E.g., Edward Aldwell and Carl Schachter, with Allen Cadwallader, Harmony and Voice

Leading, 4th ed. (N.p.: Cengage Learning, 2010).


12
E.g., Dmitri Tymoczko, A Geometry of Music: Harmony and Counterpoint in the Extended

Common Practice (Oxford and New York: Oxford University Press, 2011).
