Ijcnn2011_Andres Rev 07.07.11



Generation of composed musical structures through recurrent neural networks based on chaotic inspiration

Andrés E. Coca, Roseli A. F. Romero, and Liang Zhao

Abstract— In this work, an Elman recurrent neural network is used for automatic composition of musical structures based on the style of a melody previously learned during the training phase. Furthermore, a small fragment of a chaotic melody is added to the input layer of the neural network as an inspiration source, in order to attain greater variability of the composed melodies. The network is trained with the BPTT (backpropagation through time) algorithm. Several melodic measures are also presented for characterizing the melodies produced by the network and for analyzing the effect of the insertion of chaotic inspiration with respect to the characteristics of the original melody. Specifically, a melodic similarity measure is used to contrast the variability between the learned melody and each of the melodies composed with different numbers of inspiration notes.

I. INTRODUCTION

Recurrent neural networks (RNNs) contain feedback connections and delay operators, which allow them to model the nonlinear and dynamical components of a system. They are very useful for the analysis and modeling of time series, such as music. Accordingly, many works apply recurrent neural networks to automatic music composition. In 1989, Todd [1] used a neural network trained with the BPTT algorithm to compose monophonic melodies, in which one note occurs after another as a sequential phenomenon. In [2], a paradigm was developed that uses a neural network for the generation of melodies by refinement (CBR, creation by refinement), in which the network is trained to act as a musical critic, judging musical examples according to specified criteria.

Another interesting system for musical composition is the CONCERT system, proposed by Mozer [3]. In that work, a recurrent neural network was trained based on psychological information, that is, similar notes receive similar representations in the network. A peculiar characteristic of the system is that the generated notes are candidates to belong to the melody according to a probability distribution.

In recent years, the methods for musical composition have evolved into mixtures of computational methods, as can be found in [4]. In that work, a simple recurrent network (SRN) was used, in which the weights are adapted by a genetic algorithm.

Andrés E. Coca, Roseli A. F. Romero, and Liang Zhao are with the Institute of Mathematics and Computer Science, University of São Paulo, São Carlos - SP, Brazil, email: {aecocas, rafrance, zhao}@icmc.usp.br.

The authors would like to thank Capes, CNPq, and FAPESP for the financial support of this research.

Recently, hybrid methods have been proposed, as in [5], in which neural networks and Markov chains are used for musical composition by example. In [6], a time-delay neural network and probabilistic finite-state machines were used to acquire knowledge by inductive learning for multiple musical instruments. In [7], a recurrent neural network was proposed to model the multiple threads of a song that are functionally related, such as the functional relationship between the instruments and the drums. Finally, in [8], a probabilistic neural network based on Boltzmann machines was proposed for melodic jazz improvisation over chord sequences.

It is possible to obtain a similar melody from an original training melody by using a neural network to learn the characteristics or style of an input melody. It is also interesting to control the similarity between the original melody and the generated melody (the output melody of the neural network). In this case, it is necessary to consider an additional independent melody. In [10], an inspiration source based on the texture of geographic landscapes is used. However, this method has some disadvantages. The first is the need to create and store an image data set, which occupies a large amount of memory; the second is the need to preprocess each image, which is time consuming.

One of the main characteristics of chaotic dynamical systems is their high sensitivity to initial conditions: the capacity of a dynamical system to evolve in a different or even totally unexpected way due to small changes in its initial condition. Over the last few decades, this property has aroused great interest in several research areas, including musical composition.

In this work, we propose to use melodies generated by a chaotic dynamical system as the inspiration source. With this technique, a large dataset is not necessary to obtain a large number of inspiration melodies. Due to the characteristics of chaotic systems, it is possible to obtain infinitely many variations through arbitrarily small changes in the parameters of the system. Furthermore, rich musical material can be obtained very quickly. For this purpose, we use the algorithm proposed by Coca et al. [11] for chaotic musical composition, as will be shown later.

The rest of this paper is organized as follows. In Section II, the theoretical foundations of Elman neural networks and the chaotic inspiration are presented. In Section III, we describe the algorithm for the generation of the chaotic musical inspiration. In Section IV, we present the melodic measures used in this work to analyze and compare melodic similarity and complexity. Some simulation results are shown in Section V. Finally, Section VI presents the conclusions and future works.

II. RECURRENT NEURAL NETWORK WITH CHAOTIC INSPIRATION

Elman networks are recurrent neural networks in which feedback occurs from the output of each neuron of the hidden layer to all the neurons of the same layer. Another layer, called the context layer, simulates the memory of the network. Figure 1 shows the modified Elman network, which contains an additional input generated by an algorithm of chaotic musical composition.

Fig. 1. Recurrent Neural network with chaotic inspiration

The processing of the network consists of the following events. At iteration 0 (initial), the signal is propagated through the network. The context units are initialized with value 0, so as not to influence the output of the network; that is, in the first iteration the network behaves like a feed-forward network [12]. At iteration t, the hidden neurons activate the neurons of the context layer, which store the output of this iteration to be used in the next cycle. Then the backpropagation algorithm is applied for the correction of the synaptic weights W, with the exception of the recurrent synapses, which are fixed at the value 1. At the next iteration, t + 1, this process is repeated, but with a difference: the hidden neurons are activated by both the input units and the context units, which hold the values of the hidden neuron outputs at instant t. The output of the network is given by Equation (1):

x(t) = Wxx x(t−1) + Wxu u(t−1) + Wxz z(t−1)
y(t) = Wyx x(t)    (1)

where x(t−1) are the outputs of the hidden neurons, Wxx are the associated synaptic weights, y(t) are the network outputs, u(t−1) are the network inputs, and z(t−1) are the chaotic inspiration inputs generated by the composition algorithm described in Section III.
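As a minimal sketch, the forward pass of Eq. (1) can be written as follows. The tanh activation, the weight scales, and all layer sizes except the 20 hidden neurons of Section V are illustrative assumptions; the paper does not fix them here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Dimensions (illustrative): 4 input notes, 20 hidden neurons (Section V),
# a chaotic-inspiration input of the same width, and 4 output notes.
n_u, n_x, n_z, n_y = 4, 20, 4, 4

# Synaptic weight matrices of Eq. (1); Wxz weights the chaotic input z.
W_xx = rng.standard_normal((n_x, n_x)) * 0.1  # context (recurrent) weights
W_xu = rng.standard_normal((n_x, n_u)) * 0.1  # input weights
W_xz = rng.standard_normal((n_x, n_z)) * 0.1  # chaotic-inspiration weights
W_yx = rng.standard_normal((n_y, n_x)) * 0.1  # output weights

def elman_step(x_prev, u_prev, z_prev):
    """One iteration of the modified Elman network of Eq. (1).

    The paper does not state the activation function; tanh is assumed.
    """
    x = np.tanh(W_xx @ x_prev + W_xu @ u_prev + W_xz @ z_prev)
    y = W_yx @ x
    return x, y

# Context units start at zero, so the first step behaves like a
# feed-forward network, as described above.
x = np.zeros(n_x)
for t in range(3):
    u = rng.standard_normal(n_u)   # placeholder input notes
    z = rng.standard_normal(n_z)   # placeholder chaotic-inspiration notes
    x, y = elman_step(x, u, z)
print(x.shape, y.shape)
```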

The representation by cycles of thirds is used to represent the input data in the form of bits [9]. This representation uses seven bits: the first four bits indicate in which of the four major-third cycles the note is located, and the last three bits indicate in which of the three minor-third cycles it is located. The octave information is given separately, with two additional neurons: one to indicate whether the octave lies in the range C1 to B2, the other whether it lies in the range C4 to B5. If both neurons are equal to zero, the octave is from C3 to B3. Figure 2 shows the four major-third cycles and the three minor-third cycles used. Each cycle is read in the counterclockwise direction.

Fig. 2. Major and minor thirds circles [9].
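A possible encoding routine is sketched below. The assignment of pitch classes to cycles (pc mod 4 for the major-third cycles, pc mod 3 for the minor-third cycles) follows from the cycle structure itself but is not taken verbatim from [9], and the octave thresholds are an assumption; both are hypothetical details.

```python
def thirds_encoding(pitch_class, octave):
    """Encode a note with the cycles-of-thirds representation (7 + 2 bits).

    pitch_class: 0-11 (C = 0); octave: 1-5. Cycle indices pc % 4 and
    pc % 3 are an assumption consistent with the cycle structure.
    """
    major = [0] * 4
    major[pitch_class % 4] = 1        # which of the 4 major-third cycles
    minor = [0] * 3
    minor[pitch_class % 3] = 1        # which of the 3 minor-third cycles
    low = 1 if octave < 3 else 0      # octave below the C3-B3 register
    high = 1 if octave > 3 else 0     # octave above the C3-B3 register
    return major + minor + [low, high]

print(thirds_encoding(4, 3))  # E3 -> [1, 0, 0, 0, 0, 1, 0, 0, 0]
```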

III. CHAOTIC INSPIRATION ALGORITHM

The algorithm developed in [11] uses the numerical solution of a nonlinear dynamical system with three variables, x(t), y(t), and z(t). The first variable, x(t), is associated with the extraction of the musical pitches of the input inspiration. The data transformations applied to this variable are described in the following subsections.

A. Musical Scale Specifications

Next, the initial inputs of the chaotic musical composition algorithm are described.

1) Number of Octaves k: indicates the extent of the scale in octaves, where k ∈ N and 0 < k ≤ 7.

2) Tonic Υτ,o: is the initial tone where a scale starts. It is defined as a pair of variables (τ, o), where τ ∈ N, 1 ≤ τ ≤ 12, is the tone and o ∈ N, o ≤ k, is the number of the octave of the user-defined scale.

3) Mode m0: is a value within the range 0 < m0 ≤ m, where m ∈ [0, 11] is the maximum number of possible modes for a given scale, and m0 indicates the number of shifts required for the scale to start with the tone given by Υτ,o.

4) Structure of the Scale ψξ: a set of interval generators that form the architecture of the musical scale ξ. It is represented by the set ψξ = (s, t, tm), where s ∈ [0, 12] is the number of semitones, t ∈ [0, 6] is the number of tones, and tm ∈ [0, 4] is the number of one-and-a-half tones that constitute the structure of the musical scale ξ with n notes.


5) Tone Division ∆: represents the number of divisions into which the diatonic tone of the scale is divided. For instance, 2 divisions (∆ = 2) are used by the 12-note tempered system, 3 divisions by the third-tone scale, 4 divisions by the quarter-tone scale, and so on.
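The five inputs above can be bundled into a single configuration object. The field names and the example values (a two-octave major scale) are illustrative assumptions, since the paper only fixes the symbols and their ranges.

```python
from dataclasses import dataclass

@dataclass
class ScaleSpec:
    """Inputs of the chaotic composition algorithm (Section III-A).

    Field names are hypothetical; the paper only fixes the symbols.
    """
    k: int        # number of octaves, 0 < k <= 7
    tau: int      # tonic tone, 1 <= tau <= 12
    o: int        # tonic octave, o <= k
    m0: int       # mode shift, 0 < m0 <= m
    s: int        # semitones in the scale structure, 0 <= s <= 12
    t: int        # whole tones, 0 <= t <= 6
    tm: int       # one-and-a-half tones, 0 <= tm <= 4
    delta: int    # tone division (2 = 12-note tempered system)

    def __post_init__(self):
        # Enforce the ranges stated in Section III-A.
        assert 0 < self.k <= 7 and 1 <= self.tau <= 12 and self.o <= self.k
        assert self.delta >= 2

# Example: a two-octave major scale (5 tones, 2 semitones per octave).
major = ScaleSpec(k=2, tau=1, o=1, m0=1, s=2, t=5, tm=0, delta=2)
print(major)
```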

B. Extraction of Frequencies and Musical Notes

The extraction of frequencies and musical notes is divided into three steps. In the first step, a binary membership vector is generated, indicating whether each musical note belongs to the scale ξ chosen by the user. In the second step, a normalization of the variable x(t) is performed, and in the final step, the normalized data are mapped onto the scale intervals.

Step 1: Scale Generation

With the tuning factor λ (the inverse of the number of tone divisions, λ = 1/∆) and the number of octaves p, a vector r of dimension (p + 1) can be constructed. It contains the frequency ratios of an equal temperament scale and is generated by the following geometric sequence:

ri = 2^((i−1)λ/6), 0 < i ≤ p + 1    (2)

With the structure ψξ = (s, t, tm), the components of the binary membership vector v of the desired scale ξ (the scale ξ is previously defined and stored in a file) can be determined as follows:

vi = 1, if ξi ∈ Ξ;  vi = 0, if ξi ∉ Ξ    (3)

The components of the binary membership vector v indicate whether a desired scale ξ is contained in the chromatic scale Ξ. In other words, v acts as a filter on the vector r, keeping only those intervals that belong to the user-defined scale ξ and eliminating the others. The result of this operation is the vector e:

ei = vi · ri, 0 < i ≤ p + 1    (4)

Finally, the vector g of dimension n is obtained (n ≤ p + 1 is the number of musical notes of the scale ξ), which is constructed by eliminating all zero entries of the vector e:

g = {ei | ei ≠ 0, ∀i}    (5)

It can be seen that g contains only the frequency ratios of the specific musical scale ξ.
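Eqs. (2)-(5) can be sketched as follows, assuming a 12-note tempered chromatic scale filtered by a major-scale membership vector; the membership vector shown is an illustrative example, not one read from the file mentioned above.

```python
import numpy as np

def scale_ratios(delta, p, member):
    """Build the frequency-ratio vector g of Eqs. (2)-(5).

    delta: tone division; p: number of chromatic steps spanned;
    member: binary membership vector v of length p + 1 marking which
    chromatic degrees belong to the user-defined scale xi.
    """
    lam = 1.0 / delta                                # tuning factor
    i = np.arange(1, p + 2)
    r = 2.0 ** ((i - 1) * lam / 6.0)                 # Eq. (2)
    e = np.asarray(member, dtype=float) * r          # Eq. (4): v filters r
    g = e[e != 0]                                    # Eq. (5): drop zeros
    return g

# One octave of the 12-note tempered system, keeping the major-scale
# degrees C D E F G A B C (membership vector is illustrative).
v_major = [1, 0, 1, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1]
g = scale_ratios(delta=2, p=12, member=v_major)
print(len(g), round(g[-1], 3))  # 8 ratios, top ratio 2.0 (one octave)
```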

Step 2: Variable Normalization

The variable x(t) is selected to generate the frequencies of the musical notes. All values of x(t) in the chosen interval must be normalized with respect to the vector g, because the range of the variable x(t) differs from the range of the vector g. This normalization process, defined as xn(t) = γ(x(t), g), consists of a scaling and a translation of the numerical solution of the variable x(t) adapted to the vector g, yielding the normalized variable xn(t).

This normalization process is responsible for the adjustment of the maximum and minimum values between the two groups of data, i.e., max(xn(t)) = max(g) and min(xn(t)) = min(g), while keeping the proportions of the intermediate data. The normalization is defined as:

xn (t) = αx(t) + β (6)

where α is a scaling factor, calculated by the following formula (note that max(g) = 2^k and min(g) = 1):

α = (2^k − 1) / (max(x(t)) − min(x(t)))    (7)

Similarly, the variable β is a translation factor, determined by the following equation:

β = −α min(x(t)) + min(g) = −α min(x(t)) + 1    (8)
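Eqs. (6)-(8) amount to an affine rescaling of x(t), which can be sketched as:

```python
import numpy as np

def normalize(x, k):
    """Normalization xn(t) = gamma(x(t), g) of Eqs. (6)-(8).

    Maps the chaotic variable x(t) onto [min(g), max(g)] = [1, 2**k]
    while preserving the proportions of the intermediate values.
    """
    alpha = (2.0 ** k - 1.0) / (x.max() - x.min())   # Eq. (7)
    beta = -alpha * x.min() + 1.0                    # Eq. (8)
    return alpha * x + beta                          # Eq. (6)

# An arbitrary illustrative slice of a chaotic trajectory.
x = np.array([0.2, 0.5, 0.9, 0.37])
xn = normalize(x, k=1)
print(round(xn.min(), 6), round(xn.max(), 6))  # 1.0 2.0
```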

In this way, we get a variable in the range 1 ≤ xn(t) ≤ 2^k.

Step 3: Mapping to the Closest Value

Once the normalized variable xn(t) is obtained, each value of xn(t) is mapped to the closest value in the vector g, producing a match between xn(t) and the notes of the given musical scale ξ. Next, a matrix D of dimension cxn × n is built, where cxn is the number of elements of the piece of the numerical solution of xn(t) and n is the number of musical notes. This matrix is constructed according to Eq. (9), with the indices i and j in the ranges 0 < i ≤ cxn and 0 < j ≤ n, respectively.

Di,j = 0,       if |xn_i(t) − gj| ≤ µ
Di,j = xn_i(t), if |xn_i(t) − gj| > µ    (9)

The threshold value µ depends on the tuning factor λ and is calculated by

µ = 2^(λ/6) − 1    (10)

Then we generate a new vector h of size cxn, which holds the position of the minimum value of each row of the matrix D:

hi = col(min Di), 0 < i ≤ cxn    (11)

where Di is the ith row of the matrix D and col(min Di) returns the column index of the minimum value of the ith row of D. In other words, the vector h contains column indices of D rather than values of D.
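Eqs. (9)-(11) select, for each normalized value, the index of its nearest scale ratio. The sketch below uses a direct argmin over the distances |xn_i(t) − gj|, which yields the same column index as the thresholded matrix D when a ratio falls within µ; the example ratios are those of a tempered major scale.

```python
import numpy as np

def nearest_notes(xn, g, lam):
    """Map each normalized value to its closest scale ratio (Eqs. (9)-(11)).

    Returns the vector h of column indices into g. An argmin over the
    distance matrix replaces the explicit construction of D here.
    """
    mu = 2.0 ** (lam / 6.0) - 1.0                # Eq. (10), threshold
    dist = np.abs(xn[:, None] - g[None, :])      # c_xn x n distances
    h = dist.argmin(axis=1)                      # Eq. (11): nearest column
    return h, mu

# Frequency ratios of one octave of a tempered major scale (illustrative).
g = np.array([1.0, 1.122, 1.26, 1.335, 1.498, 1.682, 1.888, 2.0])
xn = np.array([1.0, 1.5, 2.0])
h, mu = nearest_notes(xn, g, lam=0.5)
print(h.tolist())  # [0, 4, 7]
```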

The knowledge of the tonic is required for the conversion of the variable xn(t) to the musical space. The tonic frequency δτ,o with the musical tone τ in the octave o can be obtained as follows:

δτ,o = 55 · 2^((τ + 12o − 10)/12)    (12)

With the indices of h and the tonic frequency δτ,o, we can calculate the frequencies of the musical notes corresponding to the variable xn(t) by using

fi = δτ,o · g_hi, 0 < i ≤ cxn    (13)
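Eqs. (12)-(13) can be sketched as below. The identification of τ = 10, o = 3 with 440 Hz follows from Eq. (12) itself, while the example ratio vector is illustrative.

```python
def tonic_frequency(tau, o):
    """Tonic frequency of Eq. (12): 55 Hz times 2^((tau + 12*o - 10)/12)."""
    return 55.0 * 2.0 ** ((tau + 12 * o - 10) / 12.0)

def note_frequencies(h, g, tau, o):
    """Eq. (13): scale the ratios selected by h with the tonic frequency."""
    delta_f = tonic_frequency(tau, o)
    return [delta_f * g[j] for j in h]

# tau = 10, o = 3 gives 55 * 2^3 = 440 Hz by Eq. (12).
g = [1.0, 1.5, 2.0]                      # illustrative ratio vector
freqs = note_frequencies([0, 2], g, tau=10, o=3)
print([round(f, 1) for f in freqs])      # [440.0, 880.0]
```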


In order to view the score of the melody, these frequencies must be converted to musical note values of the MIDI (Musical Instrument Digital Interface) standard. For this purpose, we use Eq. (14), where fi is a frequency in Hz and xi is a MIDI value.

xi = 69 + 12 · log(fi/440) / log 2    (14)

By converting to the MIDI standard using the frequency values f, a vector x of dimension cxn is obtained, which contains the pitch numbers generated by x(t). The whole melody-generation process using a chaotic system is summarized in Fig. 3. It should be noted that only part of the diagram is used in this paper: all the steps on the left side of the diagram, the step "Generate Matrix of Notes", and the step "Write to the MIDI file".

Fig. 3. Diagram of chaotic algorithm
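The conversion of Eq. (14), rounded to the nearest integer MIDI note number, can be sketched as:

```python
import math

def freq_to_midi(f):
    """Eq. (14): convert a frequency in Hz to a (rounded) MIDI note number."""
    return round(69 + 12 * math.log(f / 440.0) / math.log(2))

print(freq_to_midi(440.0), freq_to_midi(261.63))  # 69 (A4) and 60 (middle C)
```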

IV. MELODIC MEASURES

In this section, the measures used to quantify the characteristics of the melodies composed by the network are described. They are useful for finding the relation between the musical characteristics of the melody generated by the network and the quantity of inspiration notes.

A. Expectancy-based model of melodic complexity

This model of melodic complexity is based on expectancy: its components are derived from melodic expectancy theory [14]. The measure takes as reference the Essen collection, which has a complexity mean of 5 with a standard deviation of 1 [15]. The melodic complexity can be tonal (CBMP), rhythmic (CBMR), or joint (pitch and rhythm, CBMO). These values have been fitted to the predictability ratings given by listeners in experiments [16].

B. Melodic originality

In studies performed between 1984 and 1994, after analyzing a great number of classical themes, Dean Keith Simonton concluded that the originality of a theme is connected to its complexity. The relationship between originality and complexity has an inverted-U shape, i.e., the simplest and the most complex melodies are the least original, and melodies of average complexity have the highest originality. In addition, originality is related to popularity: the most popular themes have medium originality; consequently, the simplest and the most complex themes, whose originality is lower, are not considered popular.

The output of the model is the inverse of the averaged probability, scaled to the interval from 0 to 10, where a higher value indicates higher melodic originality (MOM) [16].

C. Melodic similarity

This measure quantifies the similarity between melodic motives, phrases, and musical segments [17]. The distance can be calculated from a melodic representation as a distribution together with a proximity measure, for example the distribution of pitch classes (the set of notes used in the melody) and the taxicab distance [16].

The similarity measure is scaled in the range from 0 to 1, where 1 indicates the highest similarity. It can be determined with respect to a property of the melody such as pitch, rhythm, or dynamics. For this reason, a melody can have high rhythmic similarity but low pitch similarity with respect to other melodies [14]. The similarity measures used in this work and their meanings are described as follows:

1) Melodic similarity according to the distribution of pitch classes: measures the similarity between the notes that form the pitch-class set and the distribution of the number of times each note appears in the set.

2) Melodic similarity according to the distribution of durations: measures the similarity of the frequency of appearance of the durations that compose the musical phrase.
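The pitch-class similarity of item 1 can be sketched with the taxicab (L1) distance between normalized pitch-class distributions; the mapping of the distance to the interval [0, 1] used here is an assumption, since the exact scaling of [16] is not given in this paper.

```python
from collections import Counter

def pitch_class_similarity(melody_a, melody_b):
    """Similarity via pitch-class distributions and the taxicab distance.

    Each melody is a list of MIDI note numbers. The distributions are
    compared with the L1 (taxicab) distance and mapped to [0, 1],
    where 1 means identical distributions.
    """
    def dist(melody):
        counts = Counter(n % 12 for n in melody)
        total = len(melody)
        return [counts.get(pc, 0) / total for pc in range(12)]

    da, db = dist(melody_a), dist(melody_b)
    taxicab = sum(abs(a - b) for a, b in zip(da, db))
    # The L1 distance between two probability distributions lies in [0, 2].
    return 1.0 - taxicab / 2.0

print(pitch_class_similarity([60, 62, 64], [60, 62, 64]))  # 1.0
```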

V. EXPERIMENTAL RESULTS

In this section, the experiments and the obtained results are described. The objective is to analyze the effect of the inclusion of chaotic inspiration notes in the training phase on the generated melodies. A recurrent neural network with a hidden layer of 20 neurons is used. The BPTT (backpropagation through time) algorithm with a momentum term is applied to train the network. The learning rate is 0.1 and the momentum parameter is 0.01. The network is trained using four notes of the original melody. In the simulation phase, we also use four input notes, and the network must then compose the next four notes. For the chaotic inspiration, we apply the logistic map in the chaotic regime (r = 3.9), varying the number of inspiration notes from 0 to 20 in the training phase.
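The chaotic inspiration source of the experiments is the logistic map in the chaotic regime; a minimal generator is sketched below, with the initial condition x0 as an arbitrary assumption.

```python
def logistic_inspiration(n_notes, r=3.9, x0=0.5):
    """Generate a chaotic inspiration sequence with the logistic map.

    r = 3.9 places the map in the chaotic regime used in Section V;
    x0 is an arbitrary initial condition (tiny changes in it yield
    entirely different sequences, by sensitivity to initial conditions).
    """
    x, seq = x0, []
    for _ in range(n_notes):
        x = r * x * (1.0 - x)     # logistic map iteration
        seq.append(x)
    return seq

notes = logistic_inspiration(4)
print(len(notes))
```

The resulting values in (0, 1) would then be normalized and mapped onto a scale with the algorithm of Section III.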

Figure 4 shows the first measures of the second movement of Vivaldi's Four Seasons. This melody is used as the input to train the recurrent neural network.

Fig. 4. First measures of the second movement of Vivaldi's Four Seasons

The melody shown in Fig. 5 is composed by the BPTT algorithm without chaotic inspiration notes, using the cycles-of-thirds representation.

Fig. 5. Melody composed by the BPTT algorithm without chaotic inspiration notes

The melody shown in Fig. 6 is composed by the BPTT algorithm with four notes of chaotic inspiration.

Fig. 6. Melody composed by the BPTT algorithm with 4 notes of chaotic inspiration

In Table I, one can see the obtained measures of pitch complexity (CBMP) and melodic originality (MOM) for the training melody (Melody T.) and for the melodies composed with zero (Melody (0)) up to ten (Melody (10)) notes of chaotic inspiration. The means and standard deviations are also shown in the same table.

In Table II, one can see the melodic similarity between the training melody and each melody composed with zero (Melody (0)) up to five (Melody (5)) notes of chaotic inspiration. The means and standard deviations are also shown in the same table.

Figure 7 shows the melodic complexity of the melodies composed with different numbers of chaotic inspiration notes.

Melodies      CBMP    MOM
Melody T.     5.53    9.17
Melody (0)    3.873   7.895
Melody (1)    4.098   9.772
Melody (2)    4.354   8.947
Melody (3)    4.577   9.384
Melody (4)    3.953   10.00
Melody (5)    4.489   9.355
Melody (6)    4.431   9.235
Melody (7)    4.496   9.374
Melody (8)    4.473   9.126
Melody (9)    4.625   9.255
Melody (10)   5.154   9.633
Mean          4.59    9.278
Std. dev.     0.387   0.480

TABLE I
COMPLEXITY OF PITCH (CBMP) AND MELODIC ORIGINALITY (MOM) OF THE MELODIES COMPOSED WITH ZERO TO TEN INSPIRATION NOTES

Melodies      Distribution of pitch    Distribution of durations
              (vs. Melody T.)          (vs. Melody T.)
Melody T.     1                        1
Melody (0)    0.74                     0.92
Melody (1)    0.75                     0.78
Melody (2)    0.60                     0.49
Melody (3)    0.52                     0.78
Melody (4)    0.75                     0.80
Melody (5)    0.63                     0.78
Mean          0.598                    0.696
Std. dev.     0.09                     0.179

TABLE II
MELODIC SIMILARITY BETWEEN THE TRAINING MELODY AND THE MELODIES COMPOSED WITH ZERO TO FIVE INSPIRATION NOTES

The dotted line is the complexity of the melody without inspiration notes and the solid line is the complexity of the original melody. It can be noted that the complexity of the melody composed by the network increases with the number of inspiration notes, approaching the complexity value of the original melody. Hence, the melodic complexity of the output melody is proportional to the number of inspiration notes. Nevertheless, the training time increases with the number of inspiration notes, which is an important point to consider when choosing the proper number of notes to train the network.

For the case of melodic originality, we have found that the originality of the melody composed by the network with chaotic inspiration notes lies close to the originality value of the training melody, which means that the number of chaotic inspiration notes does not affect the originality of the output melody, as shown in Fig. 8.

Varying the number of chaotic inspiration notes, it is possible to see that the melodic complexity becomes higher when the number of inspiration notes is high and, additionally, that the composed melody becomes more different from the training melody. That is, the melody composed by the network resembles the original one less when more notes of chaotic inspiration are used.

Fig. 7. Melodic complexity of pitch class

Fig. 8. Melodic originality with and without inspiration notes

In Fig. 9, the graph of the melodic similarity according to the distribution of pitch between the original melody and each of the melodies composed by the network is presented. In this figure, the number of chaotic inspiration notes is varied from 0 to 20. The dotted line is the similarity measure between the original melody and the melody without chaotic inspiration. The solid line is the evolution of the melodic similarity between the training melody and each of the melodies composed using different numbers of chaotic inspiration notes.

Fig. 9. Melodic similarity of pitch class

Figure 10 shows that the rhythmic similarity of the melody without chaotic inspiration is high. Moreover, the similarity distance between the melodies without and with chaotic inspiration grows larger as the number of inspiration notes is increased.

Fig. 10. Melodic similarity of durations

A. Melodic Originality and Learning Rate

Figure 11 shows the melodic originality obtained by varying the learning rate. We see that the melodic originality changes little for different values of the learning rate.

Fig. 11. Melodic originality vs. learning rate

VI. CONCLUSIONS

In this paper, a recurrent neural network trained by BPTT is used for generating melodies. A small fragment of a chaotic melody is added to the input layer as an inspiration source, in order to attain a large variability of melodies. Several measures are used to quantify the characteristics of the melodies composed by the neural network. Various tests were conducted in order to find the relations between the musical characteristics of the melody generated by the neural network and the quantity of inspiration notes. The first finding concerns the inclusion of chaotic inspiration notes in the training phase: it changes the melodic complexity of the melody composed by the neural network in the application phase. This is because the network learns the characteristics of the training melody together with the chaotic inspiration. The more chaotic inspiration notes we have, the stronger the influence of chaos, and the melody obtained has higher variation while still preserving the characteristics of the original melody. Thus, to increase the difference between the training melody and the composed melody, it suffices to have a distribution of notes that is more complex but less similar to the original melody. The second finding is related to the use of neural networks, which are able to learn the characteristics and the inherent patterns of the training melody, besides representing the statistical information of the training set adequately. It is also shown that variations in the learning rate do not significantly affect the melodic originality of the composed melody. As future work, we intend to test the network behavior with different types of chaotic dynamical systems, continuous and discrete, with different numbers of variables and parameters, aiming to determine which system is the most suitable for composing more "creative" melodies without losing the essence of the main training melody. In addition, we would like to determine the number of inspiration notes needed to adjust the complexity of the melody of the network within a high range of variability, and then to keep the number of inspiration notes fixed and control the complexity by varying a parameter of the system.

REFERENCES

[1] P. Todd, A Connectionist Approach to Algorithmic Composition. Computer Music Journal, Vol. 13, No. 4, 1989.

[2] J. P. Lewis, Creation by Refinement and the Problem of Algorithmic Music Composition. Music and Connectionism, MIT Press, 1991.

[3] M. C. Mozer, Neural network music composition by prediction: Exploring the benefits of psychoacoustic constraints and multiscale processing. Connection Science, 1994.

[4] C. Chen and R. Miikkulainen, Creating Melodies with Evolving Recurrent Neural Networks. In Proceedings of the International Joint Conference on Neural Networks, IJCNN 01, pp. 2241-2246, Washington DC, 2001.

[5] K. Verbeurgt, M. Fayer, and M. Dinolfo, A Hybrid Neural-Markov Approach for Learning to Compose Music by Example. Canadian AI 2004, LNAI 3060, pp. 480-484, 2004.

[6] T. Oliwa and M. Wagner, Composing Music with Neural Networks and Probabilistic Finite-State Machines. Lecture Notes in Computer Science, Vol. 4974, pp. 503-508, 2008.

[7] A. Hoover and K. Stanley, Exploiting functional relationships in musical composition. Connection Science, Special Issue on Music, Brain and Cognition, Abingdon, UK: Taylor and Francis, Vol. 21, No. 2, pp. 227-251, June 2009.

[8] G. Bickerman, S. Bosley, P. Swire, and R. Keller, Learning to Create Jazz Melodies Using Deep Belief Nets. Proceedings of the International Conference on Computational Creativity, Lisbon, Portugal, January 2010.

[9] J. A. Franklin, Recurrent Neural Networks for Music Computation. INFORMS Journal on Computing, Vol. 18, No. 3, Summer 2006, pp. 321-338.

[10] D. Correa, Sistema baseado em redes neurais para composição musical assistida por computador. Master's dissertation, UFSCar, 2008.

[11] A. Coca, G. Olivar, and L. Zhao, Characterizing Chaotic Melodies in Automatic Music Composition. Chaos: An Interdisciplinary Journal of Nonlinear Science, Vol. 20, Issue 3, p. 033125, 2010.

[12] S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed., Prentice Hall, 1999.

[13] I. M. Galván, Redes de Neuronas Artificiales: Un Enfoque Práctico. Pearson Educación, Madrid, 2004.

[14] T. Eerola and A. C. North, Expectancy-based model of melodic complexity. Proceedings of the Sixth International Conference on Music Perception and Cognition (ICMPC), 2000. CD-ROM.

[15] H. Schaffrath, The Essen Folksong Collection in Kern Format [computer database], D. Huron (ed.), Menlo Park, CA: Center for Computer Assisted Research in the Humanities, 1995.

[16] T. Eerola and P. Toiviainen, MIDI Toolbox: MATLAB Tools for Music Research. University of Jyväskylä, Jyväskylä, Finland, 2004.

[17] L. Hofmann-Engl, An evaluation of melodic similarity models. Chameleon Group online publication, 2005.