HEXAPHONIC GUITAR TRANSCRIPTION AND
VISUALISATION
Iñigo Angulo
Music Technology Group
Universitat Pompeu Fabra
[email protected]
Sergio Giraldo
Music Technology Group
Universitat Pompeu Fabra
[email protected]
ABSTRACT
Music representation has been widely researched for centuries, and transcription through the conventional notation system has dominated the field for most of that time. However, this notational system often falls short of communicating the essence of music to a general audience, especially to people with no musical training. Advances in signal processing and computer science over the last few decades have bridged this gap to an extent, but conveying the meaning of music remains a challenging research field. Music visualisation is one such bridge, which we explore in this paper. We present an approach to visualise guitar performances by transcribing musical events into visual forms. To achieve this, hexaphonic guitar processing is carried out (i.e. each of the six strings is processed as an independent monophonic sound source) to obtain music descriptors, which reflect the most relevant features that define or characterise a sound. Once this information is obtained, our goal is to analyse how different mappings to the visual domain can represent music meaningfully and intuitively. As a final result, a system is proposed to enrich the music listening experience by extending the perceived auditory sensations with visual stimuli.
1. INTRODUCTION
Music is one of the most powerful art-expressions.
Through history, humans have shared the musical realm
as part of their culture, with different instruments and
compositional approaches. Often, music can express what
words and images cannot and thus remains a vital part of
our daily life. Advances in technology over the last decades have given us the opportunity to deepen our understanding of music in the context of other senses, particularly sight, which dominates the other senses when it comes to representing information. In this work we propose a system that extends music by developing a visualisation/notation approach which maps the most important features characterising musical events into the visual domain.
Copyright: © 2016 I. Angulo, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License
3.0 Unported, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are
credited.
The idea of providing mechanisms for understanding music using our eyes is not new: traditional music notation (i.e. scores) can give us an idea of the acoustic content of a piece without our having to listen to it first. However, our approach is not intended as a performance instructor, but as a visual extension of the musical events that compose a performance/piece. Our goal is to develop a system able to visually represent the musical features that best characterise the music produced by a guitar, that is, a real-time visual representation system for guitar performances. One challenge in the development of such a
system is the polyphonic nature of the guitar. The complexity of polyphonic sound transcription is well known,
so to solve this issue we opted to use a hexaphonic guitar,
in which each of the strings is processed as an independent monophonic sound source, simplifying the transcription of the sounds. We propose a way of transforming a
conventional classical guitar into a hexaphonic one. Once
the desired musical features are obtained, different ways
to represent them are studied, analysing the mappings
between sound and visual dimensions.
The aim of this work is to offer a tool in which information about the musical events (e.g. pitch, loudness,
harmony) of a guitar performance is visualized through a
graphical user interface. Additionally, these visualisations
could enrich the experience of listening to music by accompanying musical events with visual stimuli.
2. STATE OF THE ART
A music notation system can be any symbolic notation
that transmits sufficient information for interpreting the
piece [1]. In our case, we wanted to explore the interconnection of music and visual fields, following the concept
of bimodal art expression, the result of joining music and
visuals, in which both dimensions are equally relevant
since they are not perceived separately, but as a whole
[2]. Throughout history, works of this kind have often been discussed in relation to “synaesthesia”, a neurological condition in which a stimulus in one sense causes an involuntary response in another. The disadvantage of synaesthesia, from a scientific point of view, is its idiosyncrasy: two synaesthetes with the same type of synaesthesia will most likely not share the same synaesthetic experiences [3]. This characteristic means we cannot develop an objective basis for the interconnection of the visual and musical domains based on synaesthetic experiences.
2.1 Music Visualisation
Music visualisation is a field that has attracted the attention of researchers, scientists and artists for centuries. Many attempts have been made to create machines that join music and visuals, for example the “Color Organ” (or “Light Organ”) created by Louis Bertrand Castel in the 1730s. Further examples of early inventions, and the history of this field, can be found in [4][2].
As mentioned previously, synaesthesia seemed to play a very important role in the early works in the field. Despite the subjectivity of synaesthetic cases, many people have proposed theories that attempt to accurately relate the musical and visual domains. An example of this is the mapping of note pitch to colour, a topic that has been addressed by many researchers in the past; see Figure 1 [4].
Figure 1. Pitch to colour mappings through history.
Nowadays, the music visualisation field is very active. The idea of joining sound and visuals to enrich the experience of music continues to attract the attention of many researchers, and there are different theories that support it. An example is Michel Chion’s widely established concept of synchresis: “(...) the spontaneous and irresistible weld produced between a particular auditory phenomenon and visual phenomenon when they occur at the same time”. This theory, which asserts that synchronised auditory and visual events provide “added value”, is fundamental in sound design for cinema. These theories, and especially synaesthesia, support the argument that multimodal perceptions are not processed as separate streams of information, but are fused in the brain into a single percept [2].
Music visualisation comprises many different approaches with different purposes. These include the simple representation of a signal as a waveform or a spectrum; the transcription of sound as accurately as possible, using scores or another notational system (the “objective approach”); and the artistic visualisation of sound, which aims to create beautiful imagery to accompany music and evoke a sensory response in the listener/viewer (the “subjective approach”). The following sections present examples of these approaches.
2.2 Music Visualisation Systems
At present, many systems exist whose aim is to combine the sound and visual dimensions of music, generally with an artistic (“subjective”) approach, as their output often consists of abstract imagery that accompanies the music. As technology has progressed, so have the tools that allow the relation between these art modalities to be explored. Artists from many disparate fields experiment in this area, and this activity has led to the development of different practices over time. One example is Video Jockeys (VJs), who perform by mixing videos and applying effects to them, normally over background music. Concepts such as Visual Music or Audiovisual Composition have appeared alongside these practices [2].
One example of a music visualisation system is Soma [2], created in 2011 by Ilias Bergstrom for the live performance of procedural visual music/audiovisual art. The author wanted to improve the established practice of this art form and overcome its main limitations, which included constrained static mappings between visuals and music, the lack of a user interface for controlling the performance, the limited scope for collaborative performances between artists, and the complexity of preparing and improvising performances. With those ideas in mind, he proposed a system, comprising both hardware and software, to address these limitations. Figure 2 reflects the architecture of this system.
Figure 2. Illustration of the signal flow in Soma system.
Another interesting example of a music visualisation system is Magic [5], an application for creating dynamic visuals that evolve from audio inputs. It is conceived as a tool for VJing, music visualisation, live video mixing, music video creation, and similar tasks. It allows one to work with simultaneous audio and MIDI inputs, both pre-recorded tracks and live signals, computing different features in each case: for example, the overall amplitude of the signal, the amplitude per band, pitch and brightness for audio, or MIDI features such as velocity, pitch bend and channel pressure. Figure 3 shows the graphical user interface of the software, in which boxes are modules with different functionalities that are interconnected to create scenes.
Figure 3. Magic Music Visuals graphical user interface.
The systems presented above (Soma and Magic) are examples of systems that permit the visualisation of music by creating mappings between sound and visual features. The resulting visualisations are directly controlled by the time-varying musical features extracted from the audio signals; these features, such as pitch, loudness or rhythm, are normally computed using signal processing techniques. However, many systems of this kind are rather generic, as they are designed to control the visuals from a group of DMI (Digital Musical Instrument) controllers, or directly from a stereo audio signal representing the mix of all the instruments in a piece. In contrast, we propose a system specifically designed for the guitar.
2.3 Music Visualisation Systems based on Guitar
The systems analysed in this section belong to the “objective approach” domain, as their aim is to transcribe music so that it can be reinterpreted. In other words, they visualise music from a transcription/notation perspective and act as performance instructions rather than as a “creative visualisation” of the music being played.
One possible reason for the popularity of these systems was the release of Guitar Hero [6], a video game that appeared in 2005 and aimed to recreate the experience of playing music, making it available to everyone as a game. It consisted of a guitar-shaped DMI controller through which music was “interpreted” by the player, guided by the instructions that appeared on the screen. These instructions consisted of the notes of a particular song, presented over time: with the original song’s backing track playing, the aim of the player was to press buttons on the guitar controller in time with the musical notes scrolling on the game screen.
Another game called Rocksmith [7] was released in
2011. This game followed the main idea of music performance instruction (as Guitar Hero), but with an essential difference: a real guitar was used instead of a DMI
controller. The idea behind the game was to be able to
use any electric guitar, so it was approached as a method
of learning guitar playing. The game offered a set of
songs, for each of which a performance instruction was
presented, based on the notes that had to be played over time. Then, feedback about the performance quality
was given to the user.
There are other systems that follow the same approach
of guitar music visualisation as notation, to give the user
the necessary instructions to reinterpret a particular piece.
Some examples of this are GuitarBots and Yousician [8].
These systems provide an easier way of learning to play
guitar by helping the user with instructions about what to
play.
The system we propose in this paper lies in this last category of “objective” music visualisation systems, which aim to transcribe music accurately. Our goal is to provide a guitar performance representation tool that transcribes sounds into visual forms instead of traditional notational systems (i.e. scores), which could prove more meaningful for people with no musical training. The system is able to reproduce musical events in real time, thus creating visual stimuli for both the listener/viewer and the musician. However, although our first intention is to accurately reproduce guitar performances, we also want to explore the artistic representation domain (once we have enough sound and visual features to be mapped) by creating more abstract/impressive visualisations, aiming to analyse whether this leads to a stronger sensory perception of music by the user of the system.
3. MATERIALS
3.1 Essentia
Essentia [9] is an open-source C++ library for audio
analysis and audio-based music information retrieval.
The library focuses on the robustness of the music descriptors it offers, as well as on the computational efficiency of its algorithms. Essentia offers a broad collection of algorithms that compute a variety of low-level, mid-level and high-level descriptors useful in the music information retrieval (MIR) field. The library is cross-platform and gathers many state-of-the-art techniques for the extraction of music descriptors, optimised for fast computation on large collections.
3.2 Processing
Processing [10] is a programming language and environment targeted at artists and designers. It has many different uses, from scientific data visualisation to artistic explorations, and is based on the Java programming language. In the context of music visualisation, several audio libraries are available for Processing, such as Minim, which allows one to work with different audio formats and apply many signal-processing techniques to obtain musical features. The visualisations can be driven by the results of computing these features, and range from precise representations of the raw frame data (for example, drawing the waveform of a frame) to generative visualisations that focus on producing beautiful images and impressive effects.
3.3 Hardware
This research project also covers the hardware implementation of the hexaphonic guitar. Our approach is based on transduction using piezoelectric sensors. These sensors work on the piezoelectric principle, which states that electric charge accumulates in certain solid materials in response to applied mechanical stress. Thus, each sensor transforms the mechanical force exerted on it by the vibration of a string into an electric current representing that vibration. In addition to these sensors, other materials are needed, such as wires to build the circuits and 1/4" TS (jack) connectors to be plugged into the computer's audio input device.
4. METHODS: DEVELOPMENT
4.1 Hexaphonic Guitar Construction
As explained earlier, some hardware is needed to transform a traditional guitar into a hexaphonic one. It consists
of six piezoelectric sensors, six 1⁄4” TS jack connectors
and twelve wires to interconnect them. With this material
a circuit was built to capture the signal from each string
independently and send it through a cable to the computer's audio input device, via a common audio interface
with six channels.
Figure 4. Hexaphonic guitar construction scheme using
piezoelectric sensors.
The construction is shown in Figure 4. Each piezoelectric sensor is soldered to a jack connector as shown in the scheme. The tip of the jack (the shortest part at the end) is connected to the white inner disc of the piezo element and carries the signal; the sleeve of the jack is connected to the golden outer surface of the piezo element. Each sensor is cut and placed between the string and the wood of the bridge of the guitar, which is where we found the vibration of the string was best captured by the sensor. Once this is done for each of the strings, the output jack connectors are plugged into different channels of the audio interface, so that the signals can be processed independently. These sensors act like small microphones, capturing the sound produced by each of the strings.
4.2 Audio Signal Processing
Once we had our six separate audio signals, corresponding to each of the strings, we processed them using the
Essentia library to obtain meaningful musical features.
For our purpose, several descriptors were used to extract
the desired features from the sound, such as
PitchYinFFT, Loudness and HPCP, which extract information about the pitch, energy and chroma of the notes.
These were computed in real-time and sent to Processing,
where they were used to control the visualisations.
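For illustration, the following Python sketch outlines how these per-string descriptors could be computed with Essentia's Python bindings (standard mode). It is a minimal sketch rather than the exact code of our system: the file name, the frame and hop sizes, and the transport of the values to Processing (e.g. via OSC) are assumptions.

# Illustrative sketch: per-string feature extraction with Essentia.
# File name, frame/hop sizes and the transport to Processing are assumptions.
import essentia.standard as es

SR, FRAME, HOP = 44100, 2048, 512

audio = es.MonoLoader(filename='string_4.wav', sampleRate=SR)()  # one channel per string

window = es.Windowing(type='hann')
spectrum = es.Spectrum()                            # magnitude spectrum of a frame
pitcher = es.PitchYinFFT(frameSize=FRAME, sampleRate=SR)
loudness = es.Loudness()
peaks = es.SpectralPeaks(minFrequency=20, sampleRate=SR)
hpcp = es.HPCP(sampleRate=SR)                       # 12-bin chroma vector

for frame in es.FrameGenerator(audio, frameSize=FRAME, hopSize=HOP):
    spec = spectrum(window(frame))
    f0, confidence = pitcher(spec)                  # fundamental frequency in Hz
    energy = loudness(frame)                        # loudness of the raw frame
    freqs, mags = peaks(spec)
    chroma = hpcp(freqs, mags)
    # In the real-time system these values would be streamed to Processing
    # (for example over OSC); here we simply print them.
    print(f0, confidence, energy, chroma.argmax())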
In addition to this, a map of frequencies was created,
corresponding to the frequencies of the notes along the
fretboard of the guitar in standard tuning. Hence, obtaining the fundamental frequency of the note played on each string (which is straightforward, as the signal of each string arrives on a separate input channel) tells us exactly which frets, and on which strings, are being played at a given moment.
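A minimal sketch of such a fretboard frequency map is given below, assuming standard tuning, equal-tempered frets and a matching tolerance of roughly half a semitone; the number of frets and the tolerance value are illustrative assumptions.

# Hypothetical fretboard frequency map: standard-tuning open strings and
# equal-tempered frets; the tolerance (~half a semitone) is an assumption.
OPEN_STRINGS_HZ = [82.41, 110.00, 146.83, 196.00, 246.94, 329.63]  # E2 A2 D3 G3 B3 E4

def build_fret_map(num_frets=19):
    """fret_map[string][fret] -> expected fundamental frequency in Hz."""
    return [[f0 * 2.0 ** (fret / 12.0) for fret in range(num_frets + 1)]
            for f0 in OPEN_STRINGS_HZ]

def closest_fret(string_index, detected_f0, fret_map, tolerance=0.03):
    """Return the fret whose frequency best matches detected_f0, or None."""
    candidates = fret_map[string_index]
    fret = min(range(len(candidates)), key=lambda i: abs(candidates[i] - detected_f0))
    if abs(candidates[fret] - detected_f0) / candidates[fret] <= tolerance:
        return fret
    return None

fret_map = build_fret_map()
print(closest_fret(4, 261.6, fret_map))  # B string, ~C4 -> fret 1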
4.3 Visualisation
To perform the visualisation of the musical features previously extracted, we used Processing. Using this software, a graphical interface was created to visualize the
musical features using different visual forms (Figure 5).
At present, we have developed a simple visualisation to test the system. The number of musical features involved, as well as the quality of the mappings and the visualisation approach, will be revised and enhanced in the future.
We presented the information on a 2D plane, in which the X-axis represented the six different strings and the Y-axis the pitch height. The notes were represented using circles. From left to right along the X-axis, the strings were visualised as vertical lines from the 6th to the 1st. For example, any note played on the 4th string will always be represented on the same “invisible” vertical line, which crosses the X-axis at the point corresponding to that string.
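A rough sketch of this layout is given below; the canvas size, the pitch range and the linear pitch-to-height scaling are assumptions rather than the exact values used in the interface.

# Hypothetical 2D layout: one vertical lane per string (6th string leftmost),
# note height proportional to pitch. Canvas size and pitch range are assumed.
WIDTH, HEIGHT = 800, 600
F_MIN, F_MAX = 82.0, 660.0        # roughly open low E to the 12th fret of the high E

def note_position(string_lane, f0):
    """string_lane: 0 for the 6th (lowest) string .. 5 for the 1st string."""
    x = WIDTH * (string_lane + 0.5) / 6.0                  # fixed vertical line per string
    y = HEIGHT * (1.0 - (f0 - F_MIN) / (F_MAX - F_MIN))    # higher pitch -> higher on screen
    return x, y

print(note_position(3, 196.0))    # open 3rd (G) string, fourth lane from the left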
The loudness of each note was mapped to the size of the circle, which becomes smaller as the note decays. Also, having built the frequency map, it was easy to know which particular note was being played, so the name of the corresponding note is plotted in the centre of the circle. To easily distinguish notes from one another, they were mapped to the range of visible colours: the lowest frequency on the guitar (E in standard tuning) is mapped to the lowest frequency in the visible range (red). The reason for this was more aesthetic than scientific. Figure 6 shows an approximation to this mapping.
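For illustration, the sketch below maps a detected fundamental frequency to a colour and a loudness value to a circle size. Only the mapping of the low E to red is fixed above; the hue range, the per-octave wrapping of the colours and the size limits are assumptions.

# Hypothetical note-to-colour and loudness-to-size mappings: low E (82.41 Hz)
# maps to red, and the hue wraps once per octave; the limits below are assumed.
import colorsys
import math

E2 = 82.41   # lowest note in standard tuning, mapped to red (hue = 0)

def note_colour(f0):
    semitones = 12.0 * math.log2(f0 / E2)       # distance from E2 in semitones
    hue = (semitones % 12) / 12.0 * 0.83        # 0.0 (red) .. ~0.83 (violet)
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return int(r * 255), int(g * 255), int(b * 255)

def circle_size(loudness, max_loudness, min_px=10, max_px=120):
    """Circle diameter grows with loudness and shrinks as the note decays."""
    level = max(0.0, min(1.0, loudness / max_loudness))
    return min_px + level * (max_px - min_px)

print(note_colour(E2), note_colour(440.0), circle_size(0.4, 1.0))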
Figure 5. Example of visualisation interface.
Figure 6. Note to colour mapping.
5. EVALUATION
5.1 Experiments
As this research project is still a work in progress, we prepared a simple evaluation based on visualisations of some basic guitar riffs/phrases. We focused on a set of guitar phrases: two different chord progressions, a melody, an arpeggio, and a solo. The phrases were played in the same key, in order to produce similar visualisations (same colours, location of notes, etc.).
Figure 7. Chord progression 1 score.
Figure 8. Chord progression 2 score.
Figure 9. Melody score.
Figure 10. Arpeggio score.
We proposed three different experiments to the users.
In the first one, one of the two different chord progression recordings (Figure 7 and Figure 8) was presented to
the user, and then the visualisations of the two chord
progressions were shown in silence. The user had to
choose the visualisation that matched the audio recording.
The second experiment was the opposite: given one visualisation (presented in silence), the user had to select, from two recordings, the one that matched that visualisation, in this case using fragments of the solo and melody (Figure 9) phrases. In addition, the user was asked to indicate the complexity he or she found when doing the first two experiments.
The last test consisted of listening to all the phrases
(sequentially, presented as a song) together with their
corresponding visualisations, and afterwards, answering
some questions to rate the system.
The questions evaluated the system in terms of:
• mapping quality and meaningfulness,
• expressiveness, subjectively evaluated by the
user considering if the visualisations led to a
stronger experience of music (multimodal perception),
• interest, i.e. whether the user considered the system interesting/promising,
• utility, i.e. in which contexts the user would use a system like this one.
The answers consisted of a score from 1 to 5 to express
agreement, disagreement or neutrality, in addition to a
text box in which the users could write their opinion,
suggestions, or ideas for improvements.
5.2 Results
The experiment was conducted with 20 participants
whose ages ranged from 21 to 55. They had different
backgrounds and levels of musical training. In addition, their musical tastes varied, as did the frequency with which they attended concerts and listened to music. Table 1 summarises the results of the experiments.
                     Test 1    Test 2
Correct answers      80%       75%
Difficulty (1-5)     2.9       3.2

Table 1. Experiment results.
80% of the users identified the correct answer in the first experiment, with a mean perceived difficulty of 2.9 (on the 1 to 5 scale, where 1 was easy and 5 difficult), and 75% of the users answered the second experiment correctly. In particular, participants with musical education and/or guitar players found the tasks easy, were able to distinguish between the three visualisations, and could even imagine how the music would sound before listening to it.
                                   Score (1-5)
Mapping quality/meaningfulness     4.3
Expressiveness                     4.2
Interest                           4.8

Table 2. System ratings.
Table 2 shows the users' ratings of the system in terms of mapping quality and meaningfulness, expressiveness and interest. The score ranges from 1 to 5, where 1 means disagreement and 5 means strong agreement. Several comments were made about the mappings: most users found the proposed connections between the sound and visual domains intuitive, but many of them questioned the use of colour to identify notes. Also, most of the users liked the experience of simultaneous music and visuals, but some of them said the visualisations were very basic and suggested that developing more “artistic” visualisations would work better and transmit more sensations.
All the users found the system very interesting, and
suggested different contexts in which it could be used.
Most of them proposed using the system in live music
performances and concerts, to reinforce the emotions a
particular music piece tries to evoke in the listeners; some
participants suggested that the system could be used as a didactic tool to help people learn to play the guitar, and musical concepts in general. Moreover, some participants proposed using it “as a tool to emphasize sensorial experiences for infants in primary education and give support in art classes”, or even “for helping disabled people (i.e. people with hearing problems) to perceive and experience music”.
6. CONCLUSIONS
Music is one of the most powerful forms of artistic expression, and advances in technology have opened new paths for its exploration. Nowadays, many researchers focus their work on accurately representing music, without forgetting its most emotive dimension: evoking sensations in us. Our interest lies in the representation of guitar music, offering, through the system described in this paper, a way to experience it visually. Throughout the experiments that were carried out, we noticed that people found the system interesting and promising in many different contexts, and that they liked the experience of simultaneous musical and visual stimuli. This research project is a work in progress; however, the evaluation and the feedback obtained from users were very useful for gathering ideas to improve the system and for demonstrating its appeal to the public.
The next step will be to study how additional musical features could contribute to a more useful system for the musician (as more detailed and complete information about musical events would be included), as well as to make it more attractive and captivating for the audience by designing new approaches to visualising the artistic dimensions of music.
Acknowledgements. This work has been partly sponsored by the Spanish TIMUL project (TIN2013-48152-C2-2-R) and the H2020-ICT-688268 TELMI project.
7. REFERENCES
[1] A. P. Klapuri, “Automatic Music Transcription as We
Know it Today,” J. New Music Res., vol. 33, no. 3,
pp. 269–282, 2004.
[2] I. Bergstrom, “Soma: live performance where
congruent musical, visual, and proprioceptive stimuli
fuse to form a combined aesthetic narrative,” 2011.
[3] R. Ivry, “A Review of Synesthesia,” Univ. California,
Berkeley, 2000.
[4] M. N. Bain, “Real Time Music Visualization: A Study
In The Visual Extension Of Music,” The Ohio State
University, 2008.
[5] “Magic Music Visuals: VJ Software, Music
Visualizer & Beyond.” [Online]. Available:
https://magicmusicvisuals.com/. [Accessed: 14-Jun-2015].
[6] “Guitar Hero Live Home | Official Site of Guitar Hero.” [Online]. Available: https://www.guitarhero.com/. [Accessed: 01-Nov-2015].
[7] “Rocksmith® 2014 | Official Site for Spain | Ubisoft®.” [Online]. Available: http://rocksmith.ubi.com/rocksmith/es-es/home/. [Accessed: 01-Nov-2015].
[8] “Yousician.”
[Online].
Available:
https://get.yousician.com/. [Accessed: 01-Nov-2015].
[9] D. Bogdanov, N. Wack, E. Gómez, S. Gulati, P.
Herrera, O. Mayor, G. Roma, J. Salamon, J. Zapata,
and X. Serra, “ESSENTIA: an open-source library for
sound and music analysis,” Proc. ACM SIGMM Int.
Conf. Multimed., 2013.
[10] C. Pramerdorfer, “An Introduction to Processing and
Music Visualization.”