Automatic Annotation of Musical Audio for Interactive Applications

Thesis document

This page links to the online version of the PhD dissertation:

Automatic Annotation of Musical Audio for Interactive Applications, Paul M. Brossier
Centre for Digital Music, Queen Mary University of London
under the direction of Dr. Mark Plumbley and Prof. Mark Sandler
External Examiners: Prof. Eduardo R. Miranda and Dr. Michael Casey
Submitted September 2006, accepted March 2007

download pdf (3.4M)

The final version of the document is available in the Portable Document Format (PDF). The document contains internal and external hyperlinks, and colour graphics prepared to print correctly on a black-and-white printer.

The contents of the CD-ROM accompanying the final PhD document are described below, with links to online copies of the files.

Abstract

As machines become more and more portable, and part of our everyday life, it becomes apparent that developing interactive and ubiquitous systems is an important aspect of new music applications created by the research community. We are interested in developing a robust layer for the automatic annotation of audio signals, to be used in various applications, from music search engines to interactive installations, and in various contexts, from embedded devices to audio content servers. We propose adaptations of existing signal processing techniques to a real time context. Amongst these annotation techniques, we concentrate on low and mid-level tasks such as onset detection, pitch tracking, tempo extraction and note modelling. We present a framework to extract these annotations and evaluate the performances of different algorithms.

The first task is to detect onsets and offsets in audio streams within short latencies. The segmentation of audio streams into temporal objects enables various manipulation and analysis of metrical structure. Evaluation of different algorithms and their adaptation to real time are described. We then tackle the problem of fundamental frequency estimation, again trying to reduce both the delay and the computational cost. Different algorithms are implemented for real time and experimented on monophonic recordings and complex signals. Spectral analysis can be used to label the temporal segments; the estimation of higher level descriptions is approached. Techniques for modelling of note objects and localisation of beats are implemented and discussed.

Applications of our framework include live and interactive music installations, and more generally tools for the composers and sound engineers. Speed optimisations may bring a significant improvement to various automated tasks, such as automatic classification and recommendation systems. We describe the design of our software solution, for our research purposes and in view of its integration within other systems.

Sound examples

The examples/ directory contains audio examples illustrating different results obtained with aubio. They are organised as follows:

onset/
Examples of click tracks obtained in real-time using our implementation are available in this directory. Examples obtained with bonk~ [Puckette et al., 1998] are also provided for comparison. The sounds used in Chapter 2 are also included.
tempo/
Beat tracking examples obtained using aubiotrack on different recordings.
notes/
Examples of MIDI notes output obtained using aubionotes on different sounds.

Selected publications

Pages 175-178 and 179-183 include reprints of the following articles:

Paul M. Brossier, Juan-Pablo Bello, and Mark D. Plumbley. Real-time temporal segmentation of note objects in music signals. In Proceedings of the International Computer Music Conference (ICMC), pages 458-461, Miami, Florida, USA, November 2004.

Paul M. Brossier, Juan-Pablo Bello, and Mark D. Plumbley. Fast labelling of notes in music signals. In Proceedings of the International Symposium on Music Information Retrieval (ISMIR), pages 331-336, Barcelona, Spain, October 2004.

Aubio source code and documentation

The aubio/ directory on the CD-ROM contains version 0.3.1 of the source code for aubio, and the source code documentation, generated using Doxygen.

For other revisions of the aubio library, please see http://aubio.org.

Contact

Questions and feedback are welcome at piem@piem.org.

A copy of this page is available at http://aubio.org/phd/.
