Melody Generation Using An Interactive Evolutionary Algorithm

See discussions, stats, and author profiles for this publication at: https://www.researchgate.
net/publication/334285639
Melody Generation using an Interactive Evolutionary Algorithm
Preprint · July 2019

DOI: 10.13140/RG.2.2.36070.37442
CITATIONS READS
0 73
2 authors:
Majid Farzaneh Rahil Mahdian

Iran Broadcasting University Iran Broadcasting University-IRIBU
8 PUBLICATIONS 3 CITATIONS 16 PUBLICATIONS 7 CITATIONS
SEE PROFILE SEE PROFILE
Some of the authors of this publication are also working on these related projects:
Automatic Music Generation View project
Indoor Camera Positioning based-on Deep neural network View project
All content following this page was uploaded by Majid Farzaneh on 07 July 2019.
The user has requested enhancement of the downloaded file.

Melody Generation using an Interactive
Evolutionary Algorithm
1st Majid Farzaneh 2nd Rahil Mahdian Toroghi
Faculty of Media Technology and Engineering Faculty of Media Technology and Engineering
Iran Broadcasting University Iran Broadcasting University
Tehran, Iran Tehran, Iran
[email protected] [email protected]
Abstract— Music generation with the aid of computers has music from the music sheets, while in latter the musical audio
been recently grabbed the attention of many scientists in the signals are learned. Music generation can also be discussed
area of artificial intelligence. Deep learning techniques have in terms of the difficulty of performing [16, 24], and the
evolved sequence production methods for this purpose. Yet, a narrative [12].
challenging problem is how to evaluate generated music by a
The major questions in this regard are; what kind of music
machine. In this paper, a methodology has been developed based
upon an interactive evolutionary optimization method, with do we prefer to be generated by a machine? Do we need new
which the scoring of the generated melodies is primarily styles to be created, or we want the machine to resort and
performed by human expertise, during the training. This music imitate the existed music? How the quality of the generated
quality scoring is modeled using a Bi-LSTM recurrent neural music could be evaluated by machine? Are we able to
network. Moreover, the innovative generated melody through a generate new music analogous to the manuscripts of a famous
Genetic algorithm, will then be evaluated using this Bi-LSTM musician by machine, and how this similarity could be
network. The results of this mechanism clearly show that the certified?
proposed method is able to create pleasurable melodies with In this study, our assumption is that human will judge the
desired styles and pieces. This method is also quite fast,
quality of the generated melody, and will give them scores.
compared to the state-of-the-art data-oriented evolutionary
systems. Thereafter, the human scoring will be modeled using a Bi-
directional Long-Short-Term Memory (Bi-LSTM) neural
Keywords— Melody generation, Evolutionary algorithm, Bi- network. This neural network-based system will supersede
LSTM neural network. the human scoring system and will perform as a standard
evaluator of the melody generated by optimization-based
I. INTRODUCTION
models. The proposed music generation system of our paper
Music is a ubiquitous, undeniable, and perhaps the most is based on Genetic algorithm, which performs as a note-
influential part of media content. It can easily facilitate, based method to create melodies.
transferring of the emotion and concepts in an artistic, and The novelty of this paper is two-fold. First, the interaction
delicate way. This motivates the music accompaniment with between human and machine in order to generate new
all media types and human-oriented places, such as movies, meaningful and pleasurable melodies, and second is
theaters, games, shops, and so on in order to bring human introducing a new scoring model in order to evaluate the
pleasure. generated melodies.
For a machine to create automatic enjoyable music it The rest of the paper is arranged as follows. In section II,
should imitate the rules, know-how, and subtleties embedded the proposed method has been represented in three phases. In
in a pleasurable melody or a famous masterpiece of an artist. section III, the experimental results have been provided and
This could be extracted from a rich database of artistic music then analyzed. The paper is then terminated by our conclusion
being played by famous musicians and then being learned to in chapter IV, followed by the cited references.
the machine.
II. THE PROPOSED MELODY GENERATION SYSTEM
There are several methods introduced so far in order to
generate musical melodies automatically, such as Hidden The proposed method consists of three major phases.
Markov Models [21, 5, 19, 7, 16, 20, 2], models based on First, a Genetic algorithm (GA) is used to generate a vast
artificial neural networks [23, 18, 8, 3, 4, 10, 14, 26], models spectrum of melodies, from bad ones up to pleasurable ones.
based on the evolutionary and population-based optimization The bad melodies are the ones that are made randomly when
algorithms [13, 25, 11, 22], and models based on local search GA starts. In the second phase, the outputs of GA are given
algorithms [6, 9]. Recently, the sequential deep neural to humans to be scored from zero to 100. Then, these
networks especially Long Short-Term Memory (LSTM) generated and scored melodies are trained by a Bi-LSTM
neural networks have become prevalently used and achieved neural network. This network, when trained by a sufficient
successful results generating time series sequences [15, 1, amount of musical data and associated scores, would perform
17]. like a performance evaluator of the GA music generator
Music generation can be viewed from different aspects. system. In the end, the GA music which can maximize the Bi-
One can focus on melody generation, while others can work LSTM output as the objective function is played, as the
specifically on harmony and rhythm. In terms of data, music generated pleasurable melody.
generation methods could also be divided into note-based and
signal-based methods. In former, the machine should learn
A. Phase I: Generating Training Melodies
As already mentioned, this interval involves providing the The mutation is applied, as follows,
necessary training data (i.e. melodies) for the next phase. for i=1 to D do
Genetic algorithm has been chosen for this task. The reason if rand>0.1 then
is that GA can generate a population of melodies child(i)=best1(i);
(chromosomes) randomly, with a vast spectrum of qualities. else
These melodies are generated in the form of ABC notations 1, child(i)=random_melody(i);
end if
and then at each iteration, the best melodies among the end for
population are selected based on their similarity to the human
made melodies, which are taken from the manuscripts of the During GA generation, we save all the generated melodies
most famous musicians. Therefore, a database of melodies for the scoring purposes and making the training data for the
has to be used as part of the fitness function. Then, the second phase.
number of 2-gram, 3-gram, and 4-gram structures in The melody generation system tries to produce similar to
generated melodies are computed, which also exist in the the melodies. The execution time of melody generation is
database. Then, the fitness function is calculated as, high since it should compare every generated melody with a
large dataset note-by-note, and three times. What we desire is
𝐹𝑖𝑡𝑛𝑒𝑠𝑠(𝐶𝑖 ) = 𝑁2 + 10𝑁3 + 100𝑁4 (1) to generate new melodies which are pleasurable for the
audience, furthermore to be similar to the existing melodies.
Where Ci is ith chromosome and N2, N3 and N4 are the To achieve this, we need another phase to be implemented.
numbers of 2-grams, 3-grams, and 4-grams respectively. B. Phase II: LSTM-based Evaluating Model
If we set the database for a specific genre, the generated
music will be very much similar to that specific genre, as To simulate the human scoring mechanism, a Bi-LSTM
well. To make this stronger, we calculate the probability of neural network has been used. The reason is that the melodies
each character (i.e., the characters in ABC notation regime are somewhat a reasonable and sensible sequence of the
which are going to be played) in the dataset as in the musical notes being chained to represent a meaningful and
following equation, pleasurable sensing atmosphere to the audience. This
sequence justifies using recurrent neural networks. LSTM
𝑁𝑖 neural network is a solution to that. Bidirectional LSTM (aka
𝑃(𝑐ℎ𝑎𝑟𝑎𝑐𝑡𝑒𝑟𝑖 ) = (2) Bi-LSTM) is an LSTM whose parameters represent the
𝑁𝑡𝑜𝑡𝑎𝑙
forward and backward correlations of the adjacent notes or
Where Ni is the number of occurrence of characteri in the frames of the musical signal, both.
entire database and Ntotal is the total number of characters in For a generated melody to be qualified as pleasurable, a high
the database. Hereafter, whenever GA generates new score should be assigned to it by human assessment. This
melodies, it makes characters occur according to these score could be within 0 to 100 boundary values. Therefore,
probabilities. For example, if note G repeated 100 times in we have used the melodies as input to the Bi-LSTM network
database, and there are 5000 characters overall in the and the average of the scores as the output of it. This GA
database, the probability of character G will be 0.02 and if based generated melodies would never be obtainable again
GA goes to generate 100 characters for a random melody, 2 since they are directly used as the training data for the
characters among those 100 should be G. Therefore, we lead network. The Bi-LSTM, after being sufficiently trained
GA makes melodies more like the database. would be a qualified evaluation system for the quality and
The crossover and mutation are applied, then on selected pleasurability of the generated musical melodies.
chromosomes (melodies) at each iteration. The crossover is C. Phase III: The Proposed Music Generation System
applied as follows,
This phase is the same as the first one, except that the
for i=1 to D do objective function being used is different. The flowchart of
if rand>0.5 then the proposed music generation system is depicted in figure 1.
child(i)=best1(i);
else
child(i)=best2(i);
end if
end for
Where D is the chromosome length, rand is a uniform
random number between 0 to 1, best1 and best2 are the
selected chromosomes and child is a new chromosome after
crossover.
1
For more information please refer to
www.abcnotation.com
2
settings of the genetic algorithms in these two phases are
depicted in table 3.
Table 1: Initial settings
Parameter Phase I Phase III
Maximum iteration 2000 500
Population size 20 20
Crossover rate 0.5 0.5
Mutation rate 0.1 0.1
Similarity to Campin Bi-LSTM Neural

Fitness function
dataset Network
The experimental settings associated with the Bi-LSTM

neural network, are as follows:
 The Bi-LSTM hidden layer size is 50
 Number of epochs for the training phase is 5000
 The cost function being used in Bi-LSTM
training is the mean square of error (mse)
 Melody size for both phases (I and III) contain
30 notes
During the human evaluation, we generated 6 melodies
through phase I, and 6 melodies through phase III, and then
we asked 20 audiences to give these melodies their scores
within 0 to 100. These 12 melodies are available on YouTube
Figure 1: The entire proposed music generation system. The GA
to listen using the link below:
algorithm performs as the generator system, both for training the https://www.youtube.com/watch?v=Ci6DHEwAYcQ&f
evaluator system and for the final generation system. Using the eature=youtu.be
automatic evaluator, the produced melodies are chosen among the Figure 3 shows the mean of the scores taken from these
most pleasurable ones created from the GA population.
20 audiences and the standard deviation of the scores for each
III. EXPERIMENTAL RESULTS melody.
The proposed architecture, shown in figure 1, has been
implemented in MATLAB framework. For phase I, the Subjective Evaluation
Campin dataset 2 which contains 200 different European
melodies in ABC notation has been used. Figure 2 shows the 100
note probabilities in this dataset.
80
60
Score
0.18
0.16 40
NOTE PROBABILITY
0.14
20
0.12
0.1
0
0.08
0.06
Melody
0.04
0.02
mean std
0
Figure 3: Subjective evaluation of the pleasurability of the melodies,
A B C D E F G a b c d e f g z scored by 20 audiences
NOTE
Even though the genetic algorithms of phase I, and phase III
Figure 2: Notes vs. their probability of occurrence in Campin are pretty similar, the execution time of these two phases
(European melodies) dataset. Notes are represented in ABC notation differ drastically. For the training phase, the execution time
standard.
is quite dependent on the number of notes it is going to
As already mentioned, the GA algorithm in phase I and generate, whereas in generation phase (Phase III) the
III are almost the same, except for the performance function execution time is almost independent of the number of notes
they are going to optimize which is different. The initial it is going to generate and it remains constant. The figure
2
http://abcnotation.com/tunes
3
clearly shows this property. This could be explained due to
the heavy search the algorithm is going to do note-by-note [5] Brooks, F.P., Hopkins, A., Neumann, P.G., Wright,
during the training phase, as explained before. W.V.: An experiment in musical composition. IRE
Transactions on Electronic Computers (3), 175{182
(1957)
[6] Browne, T.M., Fox, C.: Global expectation-
violation as fitness function in evolutionary
composition. In: Workshops on Applications of
Evolutionary Computation. pp. 538{546. Springer
(2009)
[7] Davismoon, S., Eccles, J.: Combining musical
constraints with markov transition probabilities to
improve the generation of creative musical
structures. In: European Conference on the
Applications of Evolutionary Computation. pp.
361{370. Springer (2010)
[8] Eck, D., Schmidhuber, J.: A first look at music
composition using lstm recurrent neural networks.
Istituto Dalle Molle Di Studi Sull Intelligenza
Artificiale 103 (2002)
Figure 4: The execution time at phase I, and phase III (Bi-LSTM) vs. the [9] Herremans, D.: Morpheus: automatic music
number of notes
generation with recurrent pattern constraints and
tension profiles (2016)
IV. CONCLUSION [10] Herremans, D., Chuan, C.H.: Modeling musical
In this paper, the problem of automatic music generation context with word2vec. arXiv preprint
based on interactive (human and machine) evolutionary arXiv:1706.09088 (2017)
optimization method (here Genetic Algorithm) has been [11] Herremans, D., Sorensen, K.: Composing first
proposed. The goal of the architecture is to enable a machine species counterpoint with a variable neighborhood
to generate melodies which are pleasurable, and meaningful. search algorithm. Journal of Mathematics and the
Initially, the evaluation of what machine generates is Arts 6(4), 169{189 (2012)
performed by the listeners, however, when a Bi-LSTM neural [12] Herremans, D., Sorensen, K.: Composing fifth
network learns ow the human evaluates the pleasurable species counterpoint music with a variable
property of the melodies, it supersedes the human and does neighborhood search algorithm. Expert systems
the task independently. Finally, the GA algorithm generates with applications 40(16), 6427{6437 (2013)
a population of melodies, among which those with highest [13] Horner, A., Goldberg, D.E.: Genetic algorithms and
scores (which are evaluated by the neural network) are computer-assisted music composition. In: ICMC.
chosen to be released. The experiments have been performed vol. 91, pp. 479{482 (1991)
on Campin dataset, and the results have been satisfying [14] Makris, D., Kaliakatsos-Papakostas, M., Karydis, I.,
(pleasurable and consistent) for the audiences in a subjective Kermanidis, K.L.: Combining lstm and feed forward
evaluation study. The generation process has been fairly fast, neural networks for conditional rhythm
which makes this method outperform the data-based composition. In: International Conference on
evolutionary systems. Engineering Applications of Neural Networks. pp.
570{582. Springer (2017)
REFERENCES [15] Manzelli, R., Thakkar, V., Siahkamari, A., Kulis, B.:
[1] Agarwal, S., Saxena, V., Singal, V., Aggarwal, S.: An end to end model for automatic music
Lstm based music generation with dataset generation: Combining deep raw and symbolic
preprocessing and reconstruction techniques. In: audio networks. In: Proceedings of the Musical
2018 IEEE Symposium Series on Computational Metacreation Workshop at 9th International
Intelligence (SSCI). pp. 455{462. IEEE (2018) Conference on Computational Creativity,
[2] Agres, K., Herremans, D., Bigo, L., Conklin, D.: Salamanca, Spain (2018)
Harmonic structure predicts the enjoyment of [16] McVicar, M., Fukayama, S., Goto, M.:
uplifting trance music. Frontiers in psychology 7, Autoleadguitar: Automatic generation of guitar solo
1999 (2017) phrases in the tablature space. In: 2014 12th
[3] Agres, K.R., DeLong, J.E., Spivey, M.: The sparsity International Conference on Signal Processing
of simple recurrent networks in musical structure (ICSP). pp. 599{604. IEEE (2014)
learning. In: Proceedings of the Annual Meeting of [17] Mishra, A., Tripathi, K., Gupta, L., Singh, K.P.:
the Cognitive Science Society. vol. 31 (2009) Long short-term memory recurrent neural network
[4] Boulanger-Lewandowski, N., Bengio, Y., Vincent, architectures for melody generation. In: Soft
P.: Modeling temporal dependencies in high- Computing for Problem Solving, pp. 41{55.
dimensional sequences: Application to polyphonic Springer (2019)
music generation and transcription. arXiv preprint [18] Music, A.: Creation by refinement and the problem
arXiv:1206.6392 (2012) of algorithmic music composition. Music and
4
connectionism p. 212 (1991) [23] Todd, P.M.: A connectionist approach to
[19] Pachet, F., Roy, P., Barbieri, G.: Finite-length algorithmic composition. Computer Music Journal
markov processes with constraints. In: Twenty- 13(4), 27{43 (1989)
Second International Joint Conference on Artificial [24] Tuohy, D.R., Potter, W.D.: A genetic algorithm for
Intelligence (2011) the automatic generation of playable guitar
[20] Papadopoulos, A., Roy, P., Pachet, F.: Avoiding tablature. In: ICMC. pp. 499{502 (2005)
plagiarism in Markov sequence generation. In: [25] WASCHKA II, R.: Composing with genetic
Twenty-Eighth AAAI Conference on Artificial algorithms: Gendash. In: Evolutionary Computer
Intelligence (2014) Music, pp. 117{136. Springer (2007)
[21] Pinkerton, R.C.: Information theory and melody. [26] Wu, J., Hu, C., Wang, Y., Hu, X., Zhu, J.: A
Scientific American 194(2), 77{87 (1956) hierarchical recurrent neural network for symbolic
[22] Scirea, M., Togelius, J., Eklund, P., Risi, S.: melody generation. arXiv preprint
Affective evolutionary music composition with arXiv:1712.05274 (2017)
meta-compose. Genetic Programming and
Evolvable Machines 18(4), 433{465 (2017)
View publication stats

Melody Generation Using An Interactive Evolutionary Algorithm

Uploaded by

Copyright:

Available Formats

Melody Generation Using An Interactive Evolutionary Algorithm

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Melody Generation Using An Interactive Evolutionary Algorithm

Uploaded by

Copyright:

Available Formats

See discussions, stats, and author profiles for this publication at: https://www.researchgate.

Melody Generation using an Interactive Evolutionary Algorithm

Preprint · July 2019

Majid Farzaneh Rahil Mahdian

SEE PROFILE SEE PROFILE

Automatic Music Generation View project

Indoor Camera Positioning based-on Deep neural network View project

The user has requested enhancement of the downloaded file.

Maximum iteration 2000 500

Crossover rate 0.5 0.5

Mutation rate 0.1 0.1

Similarity to Campin Bi-LSTM Neural

The experimental settings associated with the Bi-LSTM

View publication stats

You might also like