
A periodicity-based theory for harmony perception and scales

2009

Empirical results demonstrate that human subjects rate harmonies, e.g. major and minor triads, differently with respect to their sonority. These judgements of listeners have a strong psychophysical basis. Therefore, harmony perception is often explained by the notions of dissonance and tension, computing the consonance of one or two intervals. In this paper, a theory of harmony perception based on the notion of periodicity is introduced. Mathematically, periodicity is derivable from the frequency ratios of the tones in the chord with respect to its lowest tone. The ratios used can be computed by continued fraction expansion and are psychophysically motivated by the just noticeable differences in pitch perception. The theoretical results presented here correlate well with experimental results and also explain the origin of complex chords and common musical scales.

10th International Society for Music Information Retrieval Conference (ISMIR 2009), Poster Session 1

Frieder Stolzenburg
Hochschule Harz, Automation & Computer Sciences Department, 38855 Wernigerode, GERMANY

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. © 2009 International Society for Music Information Retrieval.

1. INTRODUCTION

1.1 Motivation

Music perception and composition seem to be influenced not only by convention or culture, manifested by musical styles or composers, but also by the psychophysics of tone perception [1–3]. Thus, in order to better understand the process of musical creativity and information retrieval, the following questions should be addressed:

• What are the underlying (psychophysical) principles of music perception?
• How can the perceived sonority of chords and scales, in particular of western music, be explained?

Therefore, in the rest of this section (Sect. 1), we will introduce basic musical notions and results. After that, we will briefly review existing psychophysical theories on harmony perception (Sect. 2), which are often based on the notions of dissonance and tension, taking harmonic overtone spectra into account. In contrast to this, the approach presented here (Sect. 3) is simply based on the periodicity of chords. Applying this theory to common musical chords and also scales (Sect. 4) shows a very good correlation with empirical results, e.g. that most subjects prefer major to minor chords. Finally, we will highlight the psychophysical basis of the proposed approach by reviewing some recent results from neuroscience on periodicity detection in the brain, and end with conclusions (Sect. 5).

1.2 Basic Musical Notions

Before we are able to address the problem of harmony perception, we should clarify the terminology we use. For this, we follow the lines of [2]. The basic entity we have to deal with is a tone: A pure tone is a tone with a sinusoidal waveform. It has a specific pitch, corresponding to its perceived frequency $f$, usually measured in Hertz (Hz), i.e. periods per second. In practice, pure tones almost never appear. The tones produced by real instruments like strings, tubes, or the human voice have harmonic or other overtones. The frequencies of harmonic overtones are integer multiples of a fundamental frequency $f$. For the frequency of the $n$-th overtone ($n \geq 1$), it holds that $f_n = n \cdot f$, i.e. $f_1 = f$. The amplitudes of the overtones define the spectrum of a tone or sound and account for its loudness and specific timbre.

A harmony in an abstract sense can be identified by a set of tones forming an interval, chord, or scale. Two tones define an interval, which is the distance between two pitch categories. The most prominent interval is the octave, corresponding to a frequency ratio of 2/1. Since the same names are assigned to notes an octave apart, they are assumed to be octave equivalent. An octave is usually divided into 12 semitones in western music, each corresponding to a frequency ratio of $\sqrt[12]{2}$ in equal temperament (cf. Sect. 3.3). Thus, intervals may also be defined by the number of semitones between two tones.

A chord is a complex musical sound comprising three or more simultaneous tones, while a scale is a set of musical notes whose corresponding tones usually sound consecutively. Both can be identified by the numbers of semitones in the harmony. A triad is a chord consisting of three tones. Classical triads are built from major and minor thirds, i.e., the distances between successive pairs of tones are 3 or 4 semitones. For example, the major triad consists of the semitones {0, 4, 7}, which is the root position of this chord. An inversion of a chord is obtained by transposing the currently lowest tone by an octave. Fig. 1 (a) shows the three inversions of the E major chord, including the root position. Fig. 1 (b)–(e) shows all triads that can be built from thirds including their inversions, always with e′ as lowest tone. Fig. 1 (f) shows the suspended chord, built from perfect fourths (5 semitones). Its last inversion, consisting of the semitones {0, 5, 10}, reveals this.

Figure 1. Triads and their inversions. Panels: (a) triads, (b) major, (c) minor, (d) diminished, (e) augmented, (f) suspended.

Chord classes lead to different musical modes. The major chord is often associated with emotional terms like happy, strong, or bright, and, in contrast to this, the minor chord with terms like sad, weak, or dark. Empirical results (see e.g. [4]) reveal a preference ordering on the perceived sonority of the triads as follows: major ≺ minor ≺ diminished ≺ augmented. Since all these triads are built from thirds, thirds on their own do not provide an explanation of this preference ordering. Therefore, let us now review existing theories on harmony perception, discussing some of their merits and drawbacks.

2. THEORIES ON HARMONY PERCEPTION

2.1 Explanation by Overtones

Overtones can explain the origin of the major triad and hence its high perceived sonority. The major triad appears early in the sequence, namely as overtones 4, 5, 6 (root position) and, even earlier, as overtones 3, 4, 5 (second inversion). But it is well known that overtones fail to explain the origin of the minor chord.

2.2 Dissonance and Tension

Since the origin of harmony and scales cannot be explained well by overtones, newer explanations are based on the notions of dissonance [2,5] and tension [6]. In general, dissonance is the opposite of consonance, meaning how well tones sound together.

Although this approach correlates better with the empirical results on harmony perception, it does not explain the low perceived sonority of the diminished or the augmented triad, which are built from two minor or major thirds, respectively. Therefore, [6] adopts the argument from psychology that neighboring intervals of equivalent size are unstable and produce a sense of tonal tension that is resolved by pitch changes leading to unequal intervals. Since lowering any tone in an augmented triad by one semitone leads to a major triad and raising it leads to a minor triad, [6] assumes sound symbolism, where the major triad is associated with social strength and the minor triad with social weakness. But on the contrary, a minor triad becomes a major triad by raising the third. In addition, it is unclear whether suspended triads, built from two perfect fourths, also have a low perceived sonority. Finally, most of the empirical experiments on harmony perception present only single chords to the tested subjects. This means there is actually no pitch movement at all.

3. A PERIODICITY-BASED THEORY

The approaches discussed so far more or less take the frequency spectrum of a sound as their starting point. Obviously, analyzing the frequency spectrum is closely related to analyzing the time domain (periodicity). Fourier transformation allows us to translate between both mathematically. However, subjective pitch detection, i.e., the capability of our auditory system to identify the repetition rate (periodicity) of a complex tone sensation, only works for the lower but musically important frequency range up to about 1,500 Hz [3]. In consequence, a missing fundamental tone can be assigned to each interval. The tone with the respective frequency, called the virtual pitch of the interval, is not present as an original tone component. It has nothing to do with (first-order) beats and is perceived not directly in the ear, but in the brain.

3.1 Periodicity Pitch of Chords

For intervals, i.e. two tones, the concept of virtual pitch has been studied many times in the literature (see [3] and references therein). The idea in this paper is to transfer this concept to chords by considering relative periodicity, i.e. the period length of complex sinusoids relative to the period length of the lowest tone component (cf. [7, Sect. 7.1]). For example, the A major triad in just intonation consists of three tones with (absolute) frequencies $f_1 = 440$ Hz, $f_2 = 550$ Hz, and $f_3 = 660$ Hz. The respective frequency ratios wrt. the lowest tone (a′) are $F_1 = 1/1$, $F_2 = 5/4$ (third), and $F_3 = 3/2$ (fifth), corresponding to the semitones {0, 4, 7}. Fig. 2 (a)–(c) show the sinusoids for the three pure tone components and Fig. 2 (d) their superposition, i.e. the graph of the function $\sin(\omega_1 t) + \sin(\omega_2 t) + \sin(\omega_3 t)$, where $\omega_i = 2\pi f_i$ are the respective angular frequencies and $t$ is the time. As one can see, the period length of the chord is (only) four times the period length of the lowest tone for this example.

Figure 2. Sinusoids of the major triad, panels (a)–(d).

In the following, we call this ratio $h$. It depends on the frequency ratios $\{a_1/b_1, \ldots, a_k/b_k\}$ of the given chord. We assume that each frequency ratio $F_i$ is a fraction $a_i/b_i$ (in its lowest terms), because otherwise no finite period length can be found in general, and it holds that $F_i \approx f_i/f_1$ for $1 \leq i \leq k$. This means all frequencies are relativized to the lowest frequency $f_1$, and $F_1 = 1$. The value of $h$ can then be computed as $\mathrm{lcm}(b_1, \ldots, b_k)$, i.e., it is the least common multiple (lcm) of the denominators of the frequency ratios. This can be seen as follows: Since the relative period length of the lowest tone $T_1 = 1/F_1$ is 1, we have to find the smallest integer that is an integer multiple of all relative period lengths $T_i = 1/F_i = b_i/a_i$ for $1 < i \leq k$. Since after $a_i$ periods of the $i$-th tone we arrive at the integer $b_i$, $h$ can be computed as the least common multiple of all $b_i$.

3.2 A Hypothesis on Harmony Perception

We now set up the following hypothesis on harmony perception: The perceived sonority of a chord, called harmonicity in this context, decreases with the value of $h$. For the major triad in root position we have $h = 4$ (see above), which is quite low. Therefore, its predicted sonority is high. This correlates well with the empirical results, in general better than the approaches discussed in the previous section (Sect. 2), as we will see later on (in Sect. 4). In addition, the periodicity-based theory presented here is computationally simple, because it needs no assumptions on parameters such as harmonic overtone spectra. Neither complex summation nor computing local extrema is required. Only the frequency ratios of the tone components in the chord are needed as input parameters. But we still have to answer the question of which frequency ratios should be used in the computation of $h$. Since this is done in a special way here, we now present it in more detail.

3.3 Tuning and Frequency Ratios

The frequencies for the $k$-th semitone in equal temperament with twelve tones per octave can be computed as $f_k = \sqrt[12]{2}^{\,k} \cdot f_1$, where $f_1$ is the frequency of the lowest tone. The respective frequency ratios are shown in Tab. 1 (a). The values grow exponentially and not linearly, following the Weber-Fechner law in psychophysics, which says that, if the physical magnitude of stimuli grows exponentially, then the perceived intensity grows only linearly. In equal temperament, all keys sound equal. This is essential for playing in different keys on one instrument and for modulation, i.e. changing from one key to another within one piece of music. Since this seems to be universal, at least in western music, we will adopt equal temperament as the reference system for other tunings.

The frequency ratios in equal temperament are irrational numbers (except for the ground tone and its octaves), but for periodicity detection they must be fractions, as mentioned above. Let us thus consider other tunings with rational frequency ratios. The oldest tuning with this property is probably the Pythagorean tuning, shown in Tab. 1 (b). Here, the frequency relationships of all intervals have the form $3^m/2^n$ for some integers $m$ and $n$, i.e., they are based on fifths, strictly speaking on a stack of perfect fifths (frequency ratio 3/2), applying octave equivalence. However, although huge numbers appear in the numerators and denominators of the fractions in Pythagorean tuning, the relative errors compared to equal temperament (shown in brackets in Tab. 1) grow up to more than 1%. In fact, the Pythagorean tuning does not follow results of psychophysics, namely that human subjects can distinguish frequency differences for pure tone components only up to a certain resolution, namely 0.5% under optimal conditions. For the musically important low frequency range, especially the tones in (accompanying) chords, this so-called just noticeable difference is coarser, namely only about 1% [3].

Therefore, we should look for tunings where the relative error is approximately 1%. In addition, the frequency ratios should be simple integer ratios, i.e. fractions with small numerators and denominators. In order to achieve the latter, we can look in the harmonic overtone sequence for the point where a tone of the chromatic scale appears for the first time, applying again octave equivalence. The result of this procedure, which we will call overtonal tuning, leads to frequency ratios of the form $m/2^n$ for some integers $m$ and $n$, as shown in Tab. 1 (c). However, as one can see, the relative error compared to equal temperament is again sometimes high.

In the literature (see e.g. [5] and references therein), other historical and modern tunings are listed, e.g. Kirnberger III, see Tab. 1 (d). However, they are also only partially useful in this context, because they do not take the just noticeable differences into account explicitly. In principle, this also holds for the adaptive tunings in [5], where simple integer ratios are used and scales are allowed to vary. An adaptive tuning can be viewed as a generalized dynamic just intonation, which fits well with musical practice, because the frequencies for one and the same pitch category may vary significantly during the performance of a piece of music. Trained musicians try to intonate e.g. a perfect fifth with the frequency ratio 3/2, and listeners are hardly able to distinguish this frequency ratio from others that are close to the value in equal temperament, namely $\sqrt[12]{2}^{\,7} \approx 1.498$.

In consequence, the rational tuning, which we introduce now, should also not primarily be considered as a tuning, but rather as the basis for intonation and perception of intervals. We will use the frequency ratios of the rational tuning, shown in Tab. 1 (e), in our analyses of harmonicity. They are fractions with the smallest possible denominator such that the relative error wrt. equal temperament is just below 1%. They can be computed by means of Farey sequences, i.e. ordered sequences of completely reduced fractions between 0 and 1 which have denominators less than or equal to some (small) $n$, or by continued fraction expansion.

interval         k    (a) equal temperament   (b) Pythagorean       (c) overtonal     (d) Kirnberger III   (e) rational
prime, unison    0    1.000                   1/1 (0.00%)           1/1 (0.00%)       1/1 (0.00%)          1/1 (0.00%)
minor second     1    1.059                   3^7/2^11 (0.79%)      17/16 (0.29%)     25/24 (-1.68%)       16/15 (0.68%)
major second     2    1.122                   9/8 (0.23%)           9/8 (0.23%)       9/8 (0.23%)          9/8 (0.23%)
minor third      3    1.189                   3^9/2^14 (1.02%)      19/16 (-0.14%)    6/5 (0.91%)          6/5 (0.91%)
major third      4    1.260                   81/64 (0.45%)         5/4 (-0.79%)      5/4 (-0.79%)         5/4 (-0.79%)
perfect fourth   5    1.335                   3^11/2^17 (1.25%)     21/16 (-1.67%)    4/3 (-0.11%)         4/3 (-0.11%)
tritone          6    1.414                   3^6/2^9 (0.68%)       23/16 (1.65%)     45/32 (-0.56%)       17/12 (0.17%)
perfect fifth    7    1.498                   3/2 (0.11%)           3/2 (0.11%)       3/2 (0.11%)          3/2 (0.11%)
minor sixth      8    1.587                   3^8/2^12 (0.91%)      25/16 (-1.57%)    25/16 (-1.57%)       8/5 (0.79%)
major sixth      9    1.682                   27/16 (0.34%)         27/16 (0.34%)     5/3 (-0.90%)         5/3 (-0.90%)
minor seventh    10   1.782                   3^10/2^15 (1.14%)     7/4 (-1.78%)      16/9 (-0.23%)        16/9 (-0.23%)
major seventh    11   1.888                   243/128 (0.57%)       15/8 (-0.68%)     15/8 (-0.68%)        15/8 (-0.68%)
octave           12   2.000                   2/1 (0.00%)           2/1 (0.00%)       2/1 (0.00%)          2/1 (0.00%)

Table 1. Table of relative frequencies for different tunings.

3.4 Continued Fraction Expansion

In mathematics, a (regular) continued fraction is an expression as shown in Fig. 3 (a), where the $c_i$ are integer numbers that must be positive for $i > 0$. For a given rational or real number $x$, the values $c_i$ can be computed recursively by the (extended) Euclidean algorithm, stated in Fig. 3 (b), where the floor function $\lfloor x \rfloor$ is used, which yields the largest integer less than or equal to $x$. The sequence of the $c_i$ induces a sequence of fractions $a_i/b_i$, called convergents or fraction expansion of $x$, which can be computed by the equations in Fig. 3 (c).

(a) $x \approx c_0 + \cfrac{1}{c_1 + \cfrac{1}{c_2 + \cfrac{1}{c_3 + \cdots}}}$

(b) $c_0 = \lfloor x \rfloor$, $x_0 = x - c_0$; $c_n = \lfloor 1/x_{n-1} \rfloor$, $x_n = 1/x_{n-1} - c_n$

(c) $a_{-1} = 1$, $b_{-1} = 0$; $a_0 = c_0$, $b_0 = 1$; $a_{n+1} = a_{n-1} + c_{n+1} a_n$, $b_{n+1} = b_{n-1} + c_{n+1} b_n$

Figure 3. Continued fractions and Euclidean algorithm.

Continued fractions obey many interesting properties (see [8]), for instance:

• Any finite continued fraction represents a rational number.
• Every convergent $a_i/b_i$ of a continued fraction is in its lowest terms, i.e., $a_i$ and $b_i$ have no common divisors.
• Each convergent is nearer to $x$ than the preceding convergent and also than any other fraction whose denominator is less than that of the convergent.

The most important property in this context is the last one, because it provides a procedure for computing the frequency ratios of the rational tuning as follows. For the $k$-th semitone, we consider the fraction expansion of $x = \sqrt[12]{2}^{\,k}$, i.e. the frequency ratio in equal temperament, until the relative error of the convergent $y = a_n/b_n$ wrt. $x$, i.e. the term $|y/x - 1|$, is less than 1%.

Continued fractions may help us explain the origin of the chromatic twelve-tone scale. For this, we look for a tuning in equal temperament with $n$ tones per octave such that the perfect fifth in just intonation (frequency ratio 3/2) is approximated as well as possible. Thus, we develop a fraction $m/n$ with $2^{m/n} \approx 3/2$, where $m$ is the number of the semitone representing the fifth. Hence, we have to approximate $x = \log_2(3/2) \approx 0.585$. In this case, the sequence of convergents is 0/1, 1/1, 1/2, 3/5, 7/12, 24/41, 31/53, . . . , showing $m/n = 7/12$ as desired, because semitone $m = 7$ gives the perfect fifth in the chromatic scale with $n = 12$ tones per octave.

4. APPLICATION OF THE THEORY

4.1 Comparison of Different Approaches

Let us now apply the periodicity-based theory to common musical chords and correlate the obtained results with empirical results. Tab. 2 shows the perceived and computed relative sonority of basic chord classes (cf. Fig. 1). Tab. 2 (a) shows the ranking for the perceived sonority according to empirical experiments reported in [4], which have been repeated by many others with similar results. Unfortunately, [4] does not consider the suspended triad. Therefore, it is not ranked in the table. Tab. 2 (b) provides the ranking for complex tonalness [2], whose numerical values are shown in brackets. The model according to [2] builds on earlier work [9]. However, the dissonance of the augmented triad in particular is not reflected in this model by its calculated tonalness: It appears on rank 2, right after the major triad in root position. Therefore, [2] argues that this has a cultural rather than a sensory origin. Tab. 2 (c) shows the ranking wrt. instability [6]. The notion of tension used in this model produces the desired low sonority of the diminished and the augmented triad (cf. Sect. 2.2). The correlation with the empirical results is good, but can still be improved: e.g., the minor triad in root position (rank 2) scores better than the inversions of the major triad (ranks 4 and 5), which is not as desired. Tab. 2 (d)–(e) shows the ranking wrt. the harmonicity values $h$. As one can see, there is almost a one-to-
one correspondence with the empirical results. The numbers in brackets are the respective harmonicity values $h$ and $h^*$, where the latter are averaged over all inversions. For this, we compute the harmonicity of the given chord (cf. Sect. 3), e.g. the first inversion of the diminished triad {0, 3, 9}, that is $h_0 = \mathrm{lcm}(1, 5, 3) = 15$. In addition, we adopt each tone as reference tone, not only the lowest tone. Thus, we also consider the chords with the semitones {−3, 0, 6} and {−9, −6, 0}. For semitones associated with a negative number $n$, we take the frequency ratio of semitone $12 + n$ according to Tab. 1 (e) and halve it, i.e., we do not apply octave equivalence here. Therefore, we get the frequency ratios {5/6, 1/1, 17/12} and {3/5, 17/24, 1/1} with harmonicity values $h_1 = 12$ and $h_2 = 120$, respectively. Since the periodicity of chords is related to the lowest tone, we multiply the $h$ values by the lowest frequency ratio in the chord, obtaining $h'_0 = 15$, $h'_1 = 5/6 \cdot 12 = 10$, and $h'_2 = 3/5 \cdot 120 = 72$. We then average the virtual chord frequencies $f_1/h$, where $h$ appears in the denominator. Hence, we calculate the harmonic average of all harmonicity values $h'_0$, $h'_1$, and $h'_2$, which yields $h^* \approx 16.6$.

chord class   semitones     (a) empirical [4]   (b) tonality [2]   (c) instability [6]   (d) harmonicity   (e) harmonicity*
major         {0, 4, 7}     1                   1 (0.48)           1 (0.624)             2 (4)             2 (4.0)
              {0, 3, 8}     2                   6 (0.38)           5 (0.814)             3 (5)             3 (5.0)
              {0, 5, 9}     3                   3 (0.43)           4 (0.780)             1 (3)             1 (3.0)
suspended     {0, 5, 7}     -                   -                  8 (1.175)             4 (6)             4 (6.0)
              {0, 2, 7}     -                   -                  11 (1.219)            5 (8)             5 (8.0)
              {0, 5, 10}    -                   -                  9 (1.191)             6 (9)             6 (9.0)
minor         {0, 3, 7}     4                   4 (0.42)           2 (0.744)             7 (10)            7 (10.0)
              {0, 4, 9}     5                   7 (0.38)           3 (0.756)             8-9 (12)          8 (12.0)
              {0, 5, 8}     6                   10 (0.32)          6 (0.838)             10-11 (15)        9 (15.0)
diminished    {0, 3, 6}     7                   9 (0.35)           12 (1.431)            13 (60)           13 (26.0)
              {0, 3, 9}     8                   5 (0.40)           7 (1.114)             10-11 (15)        10 (16.6)
              {0, 6, 9}     9                   8 (0.37)           10 (1.196)            8-9 (12)          12 (19.9)
augmented     {0, 4, 8}     10                  2 (0.44)           13 (1.998)            12 (20)           11 (19.7)

Table 2. Ranking relative sonorities of common triads.

Tab. 2 (a) and (e) differ only in two respects: First, the most consonant chord according to harmonicity (rank 1) is the second inversion of the major triad with semitones {0, 5, 9} and not the root position. Its calculated harmonicity is $h = 3$, which however coincides with the fact that the second inversion appears before the root position in the harmonic overtone sequence (cf. Sect. 2.1). Second, the augmented triad appears late as expected (rank 11 of 13), but the root position and the second inversion of the diminished triad appear still later. However, the continued fraction expansion for the tritone (semitone 6, frequency ratio $\sqrt{2}$), occurring in both triads, first yields 7/5, which is only slightly mistuned. This would lead to a significantly lower $h$ value for the two chords, as desired. Thus, in summary, the periodicity-based approach to harmony perception fits the empirical results best.

4.2 Overtones and Periodicity

Harmonic overtone spectra are irrelevant for determining relative periodicities. The period length of such complex waveforms is identical with that of its fundamental tone. We obtain $h = 1$, since the frequencies of harmonic overtones are integer multiples of the fundamental frequency, hence all frequency ratios {1/1, 2/1, 3/1, . . . } have 1 as denominator. Therefore, harmonicity is independent of the concrete amplitudes and phase shifts of the sinusoids of the pure tone components. This seems plausible, because harmony perception only partially depends on loudness and timbre of the sound. It should not matter much whether a chord is played e.g. on guitar, piano, or pipe organ. Of course, this argument only holds for tones with harmonic overtone spectra. If we have inharmonic overtones in a complex tone such as in gamelan music (cf. [5]), then it holds that $h > 1$ for the harmonicity value of a single tone, i.e., we have an inherently increased harmonic complexity (cf. [2]).

4.3 From Chords to Scales

The harmonicity value $h$ can be determined for harmonies consisting of far more than three tones without any computational problems. Thus, let us apply the formulae from Sect. 3 to general chords and scales. Fig. 4 (a)–(b) shows harmonies with 5 tones that have low $h$ values. The pentachord Emaj7/9 with $h = 8$, classically built from a stack of thirds, is standard in jazz music. Alternatively, it may be understood as a superposition of the major triads E and B, which are in a tonic-dominant relationship according to classical harmony theory. Fig. 4 (b) shows the pentatonic scale ($h = 24$), which could alternatively be viewed as the standard jazz chord E6/9. All harmonies shown in Fig. 4 have low, i.e. good harmonicity values $h$, ranking among the top 5% in their tone multiplicity category. This also holds for the diatonic scale (7 tones, $h = 24$) and the blues scale (8 tones, $h = 24$) in Fig. 4 (c)–(d).

Figure 4. Harmonies (scales) with more than three tones. Panels: (a) pentachord, (b) pentatonics, (c) diatonic scale, (d) blues scale.

Furthermore, according to their $h^*$ value, all church modes, i.e. the diatonic scale and its inversions, rank among the top 11 of 462 possible scales with 7 tones. Therefore, the periodicity-based theory can contribute significantly to the discussion about the origin of the scales of western music. There are other mathematical explanations for the origin of scales, e.g. by group theory [10], which however ignore the sensory psychophysical basis for the musical importance of the perfect fifth.
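The formulae from Sect. 3 that are applied to chords and scales here can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the author's implementation: the function names `rational_ratio`, `harmonicity`, and `averaged_harmonicity` are hypothetical; the rational tuning is obtained by searching for the smallest denominator with a relative error below 1% (the characterization given in Sect. 3.3) rather than by an explicit continued fraction expansion; tones more than an octave away from the reference tone are assumed to reuse the in-octave ratio, doubled or halved per octave; and the semitone sets used for the pentatonic scale, the diatonic scale, and Emaj7/9 are assumed voicings.

```python
from fractions import Fraction
from math import lcm  # requires Python 3.9 or newer


def rational_ratio(semitone, tolerance=0.01):
    """Fraction with the smallest denominator whose relative error with
    respect to the equal-tempered ratio stays below `tolerance`."""
    # Assumption: tones outside one octave reuse the in-octave ratio,
    # doubled or halved once per octave (no octave equivalence).
    octaves, step = divmod(semitone, 12)
    x = 2 ** (step / 12)
    denominator = 1
    while True:
        # Closest fraction to x among all denominators up to the bound.
        y = Fraction(x).limit_denominator(denominator)
        if abs(float(y) / x - 1) < tolerance:
            return y * Fraction(2) ** octaves
        denominator += 1


def harmonicity(chord):
    """h: least common multiple of the denominators of the frequency
    ratios, taken relative to semitone 0 of the chord."""
    return lcm(*(rational_ratio(s).denominator for s in chord))


def averaged_harmonicity(chord):
    """h*: harmonic mean of the values h' obtained with every tone of the
    chord as reference tone, each scaled by the lowest frequency ratio."""
    scaled = []
    for reference in chord:
        ratios = [rational_ratio(s - reference) for s in chord]
        h = lcm(*(r.denominator for r in ratios))
        scaled.append(min(ratios) * h)
    return float(len(scaled) / sum(1 / value for value in scaled))


print(rational_ratio(6))                          # tritone: 17/12
print(harmonicity([0, 4, 7]))                     # major triad, root position: 4
print(harmonicity([0, 2, 4, 7, 9]))               # pentatonic scale: 24
print(round(averaged_harmonicity([0, 3, 9]), 1))  # diminished triad, first inversion: 16.6
```

Under these assumptions the sketch reproduces the rational column of Tab. 1 as well as the $h$ and $h^*$ values quoted for the triads, the pentatonic scale, the diatonic scale, and the pentachord. Exact `Fraction` arithmetic is used for the ratios so that the denominators entering the least common multiple are not affected by floating-point rounding.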
5. CONCLUSIONS

As we have seen in this paper, harmony perception can be explained well by considering relative periodicities of chords, which can be computed from the frequency ratios of the intervals in the chord. The approach shows a good correlation with empirical studies on perceived sonority. Even the origin of scales can be described with this approach. It is mathematically simple, employing Farey sequences or the Euclidean algorithm for computing continued fractions. The approach has a strong psychophysical basis. It takes into account that human pitch perception is limited by a just noticeable difference of about 1% and assumes that the virtual pitch of chords (chord periodicity) can be detected. The latter is indeed possible, as results from neuroscience prove, which we briefly review now.

5.1 Periodicity and Neuro-Science

From a spectral point of view, sounds are combinations of a fundamental frequency and certain overtones. Spectral analysis is performed in the cochlea. When a pure tone is detected, waves travel along the basilar membrane, which the cochlea houses, reaching a maximum amplitude at a point depending on the frequency of the tone [1–3]. Thus, the ear works as a spectral analyzer. This function of the ear is used in the explanations of harmony perception based on overtones or dissonance (Sect. 2). Periodicity-based explanations use missing fundamental tones, i.e. tones that are physically not present and hence cannot be perceived by the ear directly.

It has been well known for years that periodicity can be detected in the brain. For example, two pure tones forming a mistuned octave cause so-called second-order beats, although no exact octave is present [3]. Recently, neuroscience found the mechanism that makes it possible to perceive periodicity. As a result of a combined frequency-time analysis, i.e. some kind of auto-correlation by comb-filtering, pitch and timbre are mapped temporally and also spatially and orthogonally to each other in the auditory midbrain and auditory cortex [1] (see also [11]). [12] reviews neuro-physiological evidence for interspike interval-based representations for pitch and timbre in the auditory nerve and cochlear nucleus. Timings of discharges in auditory nerve fibers reflect the time structure of acoustic waveforms, such that the interspike intervals (i.e. the period lengths) that are produced convey information concerning stimulus periodicities that are still present in short-term memory [1].

5.2 Summary and Open Questions

From the good correlation of the periodicity-based theory with the empirical results presented here, one may conclude that there is a strong psychophysical basis for harmony perception and the origin of musical scales. As the underlying principle for this, periodicity detection turns out to be more important than spectral analysis, although cultural and other aspects certainly must not be neglected. The question of how different harmonies cause different emotions or subjective effects like happiness or sadness is of course not yet answered by this.

6. REFERENCES

[1] Gerald Langner. Die zeitliche Verarbeitung periodischer Signale im Hörsystem: Neuronale Präsentation von Tonhöhe, Klang und Harmonizität. Zeitschrift für Audiologie, 46(1):8–21, 2007.
[2] Richard Parncutt. Harmony: A Psychoacoustical Approach. Springer, Berlin, Heidelberg, New York, 1989.
[3] Juan G. Roederer. The Physics and Psychophysics of Music: An Introduction. Springer, Berlin, Heidelberg, New York, 4th edition, 2008.
[4] L. A. Roberts. Consonant judgments of musical chords by musicians and untrained listeners. Acustica, 62:163–171, 1986.
[5] William A. Sethares. Tuning, Timbre, Spectrum, Scale. Springer, London, 2nd edition, 2005.
[6] Norman D. Cook and Takashi X. Fujisawa. The psychophysics of harmony perception: Harmony is a three-tone phenomenon. Empirical Musicology Review, 1(2), 2006.
[7] James Beament. How we hear music: The relationship between music and the hearing mechanism. The Boydell Press, Woodbridge, UK, 2001.
[8] Carl D. Olds. Continued Fractions, volume 9 of New Mathematical Library. L.W. Singer Company, 1963.
[9] Ernst Terhardt, Gerhard Stoll, and Manfred Seewann. Algorithm for extraction of pitch and pitch salience from complex tonal signals. Journal of the Acoustical Society of America, 71(3):679–688, 1982.
[10] Gerald J. Balzano. The group-theoretic description of twelvefold and microtonal pitch systems. Computer Music Journal, 4(4):66–84, 1980.
[11] Ray Meddis and Michael J. Hewitt. Virtual pitch and phase sensitivity of a computer model of the auditory periphery: I. Pitch identification, II. Phase sensitivity. Journal of the Acoustical Society of America, 89(6):2866–2894, 1991.
[12] Peter A. Cariani. Temporal coding of periodicity pitch in the auditory system: An overview. Neural Plasticity, 6(4):147–172, 1999.