Composing Music On Paper and Computers: Musical Gesture Recognition


Bobby Owolabi
Department of Computer Science
Human Computer Interaction Lab
University of Maryland
College Park, MD 20742
[email protected]

ABSTRACT
Paper is preferred and utilized far more than computers in the
composer's music creation cycle because it is the natural medium in
which music notation convention is learned. Current music notation
software utilizes only WIMP interfaces (Windows, Icon, Menus and point
and click). Our system enables users to create musical compositions by
utilizing digital pen technology and having their work captured and
recognized into music notation in the digital world. This recognized
pen gesture data has the potential of being imported into a popular
musical notation composition program for editing purposes.

Figure 1: Music created on Anoto paper with a digital pen.

Figure 2: Written music as it would appear entered in Finale
software [2].

INTRODUCTION
Despite technological breakthroughs in computing over the last few
decades, paper is being used increasingly [8]. This is especially true
in artistic and creative disciplines such as music, where paper tends
to be utilized a great deal. This state of affairs exists because of
the complementary sets of affordances that paper and digital documents
provide [4]. Paper is portable, flexible and inexpensive [8], while
digital documents are easy to transmit, store and modify. For
instance, when a composer is in the early stages of creating a musical
piece, it is far easier to take out a sheet of paper on the spark of
inspiration than a technological device, because one can express
oneself on paper more quickly than waiting for a device to load. At
the same time, there are reasons a composer would want to use a
computer, such as instant playback or the ability to quickly transpose
a piece to a new key. Using the traditional written approach in
conjunction with computer software often presents challenges.

Software applications such as Finale and Sibelius [2][9] enable users
to create in the digital realm the same musical notation they would on
paper; however, WIMP (Windows, Icon, Menus and point and click)
interfaces are not serving composers well in their composition
activities because they bear little resemblance to the handwritten
techniques learned by many musicians [3]. This shortcoming can hinder
the music creation cycle.

Software such as The Music Notepad [3], designed for the tablet PC, is
a move towards a more natural interface for composers because of the
familiarity of its pen-based input; however, affordances offered by
paper, such as flexibility and portability, are often lost.

Our system attempts to take the affordances offered by both paper and
digital documents and unify them into one system. Users will be able
to create traditional handwritten music and have their creations
captured and recognized into music notation in the digital world. Our
system will better support the musician's music creation cycle by
reducing the time composers spend synchronizing their works in both
the physical and digital realms.

RELATED WORK
Investigations have been performed to address many of the difficulties
that composers experience when utilizing today's musical notation
software and to understand the interactions composers engage in with
paper and computers. Three categories of work are relevant to our
system: The Role of Paper and Computers, Tablet PC Based Interfaces
and Digital Pen Technology.
The Role of Paper and Computers
In trying to develop an interface that will support
composers in their activities, it is important to
understand how composers create music and their
interactions with computers and paper.
In the Paperoles Project [6], the music creation
cycle of composers was investigated. The description
provided by the Paperoles Project leans more towards
classical composers and culture; however, it provides
an overview of the interactions with paper and
computers by people in the musical domain and can be
summarized into three basic parts.
In the beginning, composers prefer to work on paper because of the
freedom of expression that it offers. They are not bound by the
limitations of a software program, such as slow input, hardware
loading time and portability issues such as the weight and size of
hardware.

In the middle of the cycle, there is a mixture of paper and computer
use. Computers provide composers with ease of modification. Often,
composers experience difficulties during the middle stage because they
are slowed down by slow user input speed.

In the end, most composers in the classical genre prefer paper for
archival purposes.

The Paperoles Project offers a design scenario that proposes taking
the handwritten document of a user, utilizing digital pen technology
to recognize the written gestures, and providing the ability for the
musical notation to be opened in an application in digital form for
further editing. Our system aims to demonstrate the recognition
framework of this interface.

Tablet PC Based Interfaces
A major trend currently being investigated is pen-based interfaces
over traditional mouse interfaces. Examples include Presto [5] and The
Music Notepad [3].

There are many complexities in detecting and recognizing musical
gestures because of the vast array of possible handwriting styles of
users and interpretations of music notation convention. Proposed
systems have offered simplified gestures that correspond to music
notation objects. With these mappings, designers wish to offer a
quickly learnable interface to users while being able to detect the
true intention of a user's gestures more accurately. The Presto and
Music Notepad systems [5][3] have taken this route.

Figure 3: Main gestures in the Presto system, reproduced from original
text [5].

The researchers of the Presto system investigated users' handwritten
music composition habits and styles and developed a proposal to
replace standard music notation with simplified versions to make
gesture recognition easier and more accurate. The collection of
gestures is presented in a way that encourages building a note rather
than providing gestures that are directly mapped to specific music
gestures (Figure 3).

The Music Notepad is an application that supports music notation entry
and also replaces the standard music notation with simplified
versions, which differ from those of the Presto system. The Music
Notepad system offers a larger set of gestures that map to each music
notation object (Figure 4). The Music Notepad also utilizes a special
stylus pen that has buttons that correspond to different classes of
gestures.

The Presto system offers a smaller table of gestures for the user to
learn while allowing multiple ways to draw gestures. This
characteristic would most likely enable the user to adapt to the
system more quickly than to the Music Notepad. However, some of
Presto's gestures do not resemble the notation they are mapped to; for
instance, the half note (Figure 3). Unfamiliar gestures steepen the
learning curve of the system. The Music Notepad's gestures more
closely resemble the notation they are mapped to and can, for the most
part, be drawn in one stroke, whereas Presto's gestures may take
multiple strokes to complete, resulting in slower gesture creation.
Figure 4: Gestures from the Music Notepad, reproduced from original
text [3]. Gestures on the left are drawn by the user. Gestures on the
right are resulting gestures drawn by the computer.

The system we are proposing will directly recognize standard music
notation rather than providing a mapping of simplified gestures to
music notation objects.

Digital Pen Technology
In moving towards pen-based interfaces for music notation software,
the emergence of digital pen technology has opened up exciting
possible applications for software interfaces. Such work includes
Paper Augmented Digital Documents (PADD) [4]. A PADD is a digital
document that can exist and be manipulated in both the digital and
physical realms. The framework of our system is built on this concept.

MUSIC BACKGROUND
Written music notation has gestures that convey both pitch and rhythm,
called notes. There are gestures that convey silence over time, called
rests. Lastly, there are gestures that provide information about how a
note should be played, called articulations. During this stage of
system development, our main focus will be on the detection of notes.

Staff
Music is written on a staff. A staff (Figure 5) consists of a series
of horizontal lines, most commonly five, that are stacked, equally
spaced, on top of each other and run across a page.

Figure 5: A standard 5 line staff with treble clef symbol for pitch
orientation purposes.

Notes
Notes consist of four fundamental parts (Figure 6, Appendix A for
examples): (1) Head, which is an empty or filled circle; its position
on the staff specifies pitch. (2) Vertical beam, which is connected to
the head. (3) Stem, which intersects vertical beams and adds meaning
to the duration of a note. (4) Horizontal beams, which are extended
stems that connect several notes.

Figure 6: Parts of notes (labeled: head, vertical beam, stem,
horizontal beam).
[Figure 7 diagram. Paper Domain: User Gesture. Digital Domain:
Recognizer, MusicSheet, MusicNotation.]

Figure 7: Overview of detection process. The user makes a gesture with
the digital pen. Once the pen is lifted off the paper, the gesture is
sent to the Recognizer to get its fundamental shape (e.g., line or
circle). It is then sent to the MusicSheet object to get its location
on the sheet as well as its spatial relationship with the other
gestures on the page. Based on that information, it is recognized as a
MusicNotation object and stored.

Articulations
Articulations are symbols drawn around notes that provide information
about how the note should be played. For instance, one articulation
may express that a series of notes should be played smoothly and
connected, with minimal rest between them, whereas another may express
that a series of notes should be played very short and fast, with more
rest between them (for example, see accent in Appendix A).

SYSTEM
Our system enables users to write musical compositions utilizing
digital pen and paper technology and have their work captured and
recognized in the digital world, with the potential of importing the
data into a popular music notation composition program.

There are three main components of our system: (1) Preprinted Paper,
(2) Digital Pen and (3) a Recognizer.

Preprinted Paper
In our system, we preprinted staves on paper and stored the location
of the staves in our digital model of the page. The staves were
printed on Anoto [1] paper. This special paper, as described below,
enables the digital pen to record the user's strokes.

Digital Pen
The digital pen has the properties of a regular pen; however, it has a
small camera at the bottom that records the coordinates of the user's
gesture. The coordinates can either be stored in the pen's memory and
sent to the computer at a later time or streamed in real time over
Bluetooth. The digital pen is able to know its location on the sheet
of paper and record the coordinate points of user-drawn gestures
because of special properties of the paper. Every sheet of Anoto paper
has a unique id number and a pattern of small dots that enables the
pen to identify the exact sheet and its location on it, thus enabling
it to record user gestures.

System Components
Once a user draws a gesture, it goes through two stages: the initial
recognition and spatial analysis on the MusicSheet object (Figure 7).
The main elements of the system are:
• MusicSheet - A MusicSheet object is a digital representation of a
  physical sheet of music. It is created based on key parameters of
  the physical sheet music, such as the y-coordinate values of the
  staff lines and the distance between staff lines. It consists of a
  series of Staff objects. Staff objects consist of a group of five
  horizontal lines. Each Staff object contains the gestures that the
  user has drawn on it. A MusicSheet object makes common queries easy,
  such as finding which staff a user's gesture falls on. To find the
  pitch of a given note, the MusicSheet object can query its Staff
  objects and discover the pitch of the note based on where it falls
  on the lines.
• MusicNotation - Represents one of the standard musical symbols
  (Appendix A). Recognition is determined by the spatial relationships
  of the fundamental shapes that the user draws to build a musical
  gesture.
• Recognizer - Detects fundamental shapes that the user draws in one
  pen stroke. Its main component is an implementation of the $1
  Recognizer [10]. The $1 Recognizer consists of a four-step algorithm
  in which a candidate user gesture is compared with pre-defined
  templates of various expected gestures.
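The MusicSheet queries described under System Components (which staff a gesture falls on, and which pitch a note head has) could be sketched as follows. This is a minimal illustration, not the system's actual code: the names `Staff`, `MusicSheet`, `staff_at`, `pitch_at` and `pitch_name` are hypothetical, and it assumes page coordinates in which y grows downward, a treble clef on every staff, and a tolerance of two line gaps above and below each staff for ledger-line notes.

```python
from dataclasses import dataclass, field
from typing import List, Optional


def pitch_name(step: int) -> str:
    """Map a staff step to a pitch name, assuming a treble clef.

    Step 0 is the bottom staff line (E4); each step is one line or space.
    """
    letters = "CDEFGAB"
    index = step + 2  # E is two letters above C
    return f"{letters[index % 7]}{4 + index // 7}"


@dataclass
class Staff:
    top_y: float      # y-coordinate of the top staff line (y grows downward)
    line_gap: float   # distance between adjacent staff lines
    gestures: list = field(default_factory=list)

    @property
    def bottom_y(self) -> float:
        return self.top_y + 4 * self.line_gap

    def contains(self, y: float, margin_lines: float = 2.0) -> bool:
        """True if y lies on the staff or within the assumed ledger margin."""
        margin = margin_lines * self.line_gap
        return self.top_y - margin <= y <= self.bottom_y + margin

    def step_index(self, y: float) -> int:
        """Half-gap steps above the bottom line (negative means below it)."""
        return round((self.bottom_y - y) / (self.line_gap / 2))


class MusicSheet:
    """Digital model of a preprinted page, built from the staff positions."""

    def __init__(self, staff_tops: List[float], line_gap: float):
        self.staves = [Staff(top, line_gap) for top in staff_tops]

    def staff_at(self, y: float) -> Optional[Staff]:
        """Return the staff a gesture at height y falls on, if any."""
        candidates = [s for s in self.staves if s.contains(y)]
        if not candidates:
            return None
        return min(candidates, key=lambda s: abs((s.top_y + s.bottom_y) / 2 - y))

    def pitch_at(self, y: float) -> Optional[str]:
        """Return the pitch of a note head centred at height y, if on a staff."""
        staff = self.staff_at(y)
        return None if staff is None else pitch_name(staff.step_index(y))
```

For example, with `sheet = MusicSheet([100, 300], 10)`, a head centred at y = 140 sits on the bottom line of the first staff, so `sheet.pitch_at(140)` gives "E4", while y = 220 lies between the two staves and yields `None`.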
[Figures 8 and 9 diagram labels: User Drawn Gesture, Bounding Box.]

Figure 8: Example of how gestures are linked together to form one
object. If a user draws a gesture that intersects or is reasonably
close to a bounding box, then the gestures are merged together to form
a new object.

Figure 9: Example of a situation where a user draws a gesture that
intersects more than one previously drawn gesture, resulting in the
formation of a beamed note.
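The bounding-box linking shown in Figures 8 and 9 can be illustrated with a small sketch. It is an assumption-laden example rather than the system's implementation: `Box`, `NotationObject`, `add_stroke` and the tolerance `tol` (standing in for "reasonably close") are hypothetical names and values. A new stroke is merged with every existing object whose box it touches; touching more than one object is the cue, per Figure 9, that a beamed note may be forming.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Box:
    x0: float
    y0: float
    x1: float
    y1: float

    @classmethod
    def of(cls, points: List[Point]) -> "Box":
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        return cls(min(xs), min(ys), max(xs), max(ys))

    def near(self, other: "Box", tol: float) -> bool:
        """True if the boxes overlap or lie within tol of each other."""
        return (self.x0 - tol <= other.x1 and other.x0 <= self.x1 + tol
                and self.y0 - tol <= other.y1 and other.y0 <= self.y1 + tol)

    def union(self, other: "Box") -> "Box":
        return Box(min(self.x0, other.x0), min(self.y0, other.y0),
                   max(self.x1, other.x1), max(self.y1, other.y1))


@dataclass
class NotationObject:
    box: Box
    strokes: List[List[Point]]


def add_stroke(objects: List[NotationObject], stroke: List[Point],
               tol: float = 3.0) -> Tuple[List[NotationObject], int]:
    """Merge a new stroke into the objects it touches.

    Returns the updated object list and the number of objects touched;
    a count above one hints at a beamed note.
    """
    box = Box.of(stroke)
    touched = [o for o in objects if o.box.near(box, tol)]
    remaining = [o for o in objects if not o.box.near(box, tol)]
    merged = NotationObject(box, [stroke])
    for obj in touched:
        merged = NotationObject(merged.box.union(obj.box),
                                obj.strokes + merged.strokes)
    return remaining + [merged], len(touched)
```

Drawing two separate heads with their vertical lines produces two objects; a final stroke spanning both line tops touches both and collapses them into one object.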

Recognizing fundamental shapes
Written music notation has gestures that convey both pitch and rhythm,
called notes. It was found that notes could be represented with three
fundamental gestures: filled circles, empty circles and line segments
with various orientations.

Our system requires that the user draw the fundamental gestures of
music notation, circles and lines, one at a time. This means that
after each fundamental gesture, the user must lift the pen from the
page. Given this constraint, we are able to detect single notes by
parsing the sequence of fundamental gestures that intersect other
previously drawn gestures.

There are situations where the result of the Recognizer is not
sufficient and semantic information about the gesture, such as its
relation to other gestures, must be taken into account. For instance,
for detecting filled circles, it is not practical to create templates
and expect that the points of a template will match up with a user's
candidate filled circle. Therefore, it is necessary to do secondary
detection checks to see if the candidate gesture is a filled circle.
The number of points that are on the border of a gesture versus the
number of points that are contained within the gesture can be
investigated. The ratio of the length and height of the gesture can
also be taken into account to determine if it is a filled circle.

Detecting Single Notes
For our purposes, single notes are defined as notes that have only one
head; or more specifically, notes that are not beamed together with
other notes. They include whole, half, quarter, eighth, sixteenth and
thirty-second notes (see Appendix A).

Each object that is created by the user has a bounding box. The
recognition state of the bounding box is governed by an automaton
shown in Figure 10.

As a user draws gestures that intersect previously drawn gestures,
which is determined by querying the MusicSheet object, the group of
gestures becomes one object (Figure 8). A new object and automaton are
started when the user draws a gesture that does not intersect another
gesture.

The MusicSheet can also be queried to find information such as the
pitch of the note.

Detecting Beamed Notes
Our approach for detecting beamed notes was to identify key scenarios
where a beamed note is likely to occur. For example, if a user draws a
gesture that intersects more than one previously drawn gesture, then
there is a high chance that the user drew or is going to draw a beamed
note (Figure 9). Further analysis, such as looking at the recognition
status of the intersected gestures, can verify the recognition.

Detecting Articulations
Detecting articulations currently relies primarily on creating
articulation templates for the $1 Recognizer [10] that is utilized in
the system. The system detects two articulations: ties and accents
(Appendix A). Currently, detected articulations are not linked to
notes; however, as the system is developed, spatial information about
the articulations will be used to connect articulations with their
corresponding notes.

EDITING
If a user wishes to delete a note, they can draw a "scribble" on top
of the note and it will be deleted from memory.
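The secondary check for filled circles described under "Recognizing fundamental shapes" can be made concrete with a rough sketch. The text names two cues, the share of points inside the gesture versus on its border and the ratio of its length to its height; the function below combines them. The function name and all three thresholds are illustrative assumptions, not values from the system.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def is_filled_circle(points: List[Point],
                     max_aspect: float = 2.0,
                     min_interior: float = 0.3,
                     inner_radius: float = 0.6) -> bool:
    """Heuristic check that a stroke is a scribbled-in (filled) note head.

    A traced outline keeps almost all points near the border, while a
    filled head leaves many points well inside the bounding box.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    if width == 0 or height == 0:
        return False  # a straight horizontal or vertical line
    if max(width / height, height / width) > max_aspect:
        return False  # too elongated to be a note head
    cx = (max(xs) + min(xs)) / 2
    cy = (max(ys) + min(ys)) / 2
    # Count points whose normalised distance from the centre is small.
    interior = sum(
        1 for x, y in points
        if math.hypot((x - cx) / (width / 2), (y - cy) / (height / 2)) < inner_radius
    )
    return interior / len(points) >= min_interior
```

An outline sampled around a circle has no interior points and is rejected, whereas a spiral scribble that covers the disc passes. In practice the thresholds would need tuning against real pen data.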
Figure 10: Parser for Single Notes. When a user draws a gesture that
does not intersect another gesture, the parser for that object starts
at the "START" state. As the user draws lines and circles that
intersect the given gesture, the state for the object is updated
according to this diagram.
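A parser of the kind Figure 10 describes can be written as a table-driven state machine. The diagram itself is not reproduced in this text, so the transition table below is an illustrative guess at how circles and lines might build up single notes, not a transcription of Figure 10; the state names, shape labels and the `parse_single_note` function are all hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical transitions: (current state, fundamental shape) -> next state.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("START", "empty_circle"): "WHOLE",
    ("START", "filled_circle"): "FILLED_HEAD",
    ("WHOLE", "line"): "HALF",             # empty head plus a vertical line
    ("FILLED_HEAD", "line"): "QUARTER",    # filled head plus a vertical line
    ("QUARTER", "line"): "EIGHTH",         # each further line shortens the note
    ("EIGHTH", "line"): "SIXTEENTH",
    ("SIXTEENTH", "line"): "THIRTY_SECOND",
}


def parse_single_note(shapes: Iterable[str]) -> str:
    """Run the sequence of recognized shapes for one object through the table.

    Returns the final state, or "UNKNOWN" if a shape has no transition.
    """
    state = "START"
    for shape in shapes:
        state = TRANSITIONS.get((state, shape), "UNKNOWN")
        if state == "UNKNOWN":
            break
    return state
```

Because each object keeps its own state, a new stroke that intersects an existing object only needs one table lookup to update that object's recognition status.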

FUTURE WORK
The system is still in its early stages of development. We would like
to study how various composers actually write compositions and
investigate ways to enable our system for more "natural" usage.
Currently, the system puts the constraint on the user that they must
lift the pen after creating key gestures such as lines and circles. We
would like to look at methods that focus more on the end result of a
user's gesture rather than recognizing the steps a user takes to
create the gesture. This idea is demonstrated in Simu-Sketch [7].

With regard to modification of notes, we would like to investigate
with users what works well and what does not. What is practical, and
does it work well with the music creation cycle?

We would also like to investigate the performance of this system
compared with traditional methods and systems such as Presto [5]. What
is the speed improvement of this system over the traditional pen and
paper method, which is to compose music on paper and then manually
transfer it to software? The Presto system [5] offers some figures on
time improvements over traditional methods; it would be interesting to
see how ours compares.

Lastly, we would like to create a plug-in for an application such as
Finale and import recognized pen data into the program.

CONCLUSION
The system presented in this paper demonstrates the recognition
portion of the design scenario mentioned in the Paperoles Project [6].
It attempts to demonstrate an interface that supports a more natural
music creation cycle for composers. We described previous related
work, an overview of the system and future work.

The interface of our system better supports composers in their
activities because it provides a more natural interface to music entry
and moves away from mouse-point-click interfaces.
ACKNOWLEDGMENTS
The author would like to thank Dr. François Guimbretière for his
support throughout the project; Hyunyoung Song, Nick Chen and Chunyuan
Liao for their generosity in time and help; and Dave Levin for his
advice, time and offered experience. This work was supported in part
by a REU for grant IIS #0447703 and by the Louis Stokes Alliance for
Minority Participation Undergraduate Research Program.

REFERENCES
1. Anoto AB, Anoto Technology. http://www.anoto.com

2. Finale, Coda Music Technology, http://www.codamusic.com/.

3. Forsberg, Andrew, Mark Dieterich, and Robert Zeleznik. "The Music
   Notepad", Proceedings of UIST '98, ACM SIGGRAPH.

4. Guimbretière, François, Paper augmented digital documents,
   Proceedings of the 16th annual ACM symposium on User interface
   software and technology, p.51-60, November 02-05, 2003, Vancouver,
   Canada.

5. J Anstice, T Bell, A Cockburn and M Setchell. The Design of a
   Pen-Based Musical Input System. OzCHI'96: The Sixth Australian
   Conference on Computer-Human Interaction. Hamilton, New Zealand.
   24-27 November, 1996. pages 260-267. IEEE Press.

6. Letondal, C. and Mackay, W. (2007) The Paperoles Project: An
   analysis of paper use by music composers. In Proceedings of CoPADD,
   Collaborating over Paper and Digital Documents, London, U.K.

7. Levent Burak Kara, Thomas F. Stahovich, Hierarchical parsing and
   recognition of hand-sketched diagrams, Proceedings of the 17th
   annual ACM symposium on User interface software and technology,
   October 24-27, 2004, Santa Fe, NM, USA.

8. Sellen, A.J. and R.H.R. Harper, The Myth of the Paperless Office.
   1st ed. 2001: MIT Press.

9. Sibelius, http://www.sibelius.com/.

10. Wobbrock, J.O., Wilson, A.D. and Li, Y. (2007) Gestures without
    libraries, toolkits or training: A $1 recognizer for user
    interface prototypes. Proceedings of the ACM Symposium on User
    Interface Software and Technology (UIST '07). Newport, Rhode
    Island (October 7-10, 2007). New York: ACM Press, pp. 159-168.

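Appendix A: Recognized Musical Notes

[Table of notation images; only the labels survive.]
Single notes: Whole Note, Half Note, Quarter Note, Eighth Note,
Sixteenth Note, Thirty-Second Note.
Beamed notes: Beamed Eighth Note, Beamed Sixteenth Note, Beamed
Thirty-Second Note, Mixed Beamed Note.
Rests: Quarter Rest.
Articulations: Tie, Accent.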
