Breathy, Resonant, Pressed – Automatic Detection of Phonation Mode from Audio Recordings of Singing

Venue

Publication Year

Identifiers

DOI: http://dx.doi.org/10.1080/09298215.2013.821496

Authors

Polina Proutskova
Christophe Rhodes
Tim Crawford
Geraint Wiggins

Abstract

In this paper we present an experiment on the automatic detection of phonation modes from recordings of sustained sung vowels. We created an open dataset specifically for this experiment, containing recordings of nine vowels from multiple languages, sung by a female singer on all pitches in her vocal range in four phonation modes: breathy, neutral, flow (resonant) and pressed. The dataset is available under a Creative Commons license at http://www.proutskova.de/phonation-modes. First, the glottal flow waveform is estimated from the audio recordings via iterative adaptive inverse filtering (IAIF). Then six parameters of the glottal flow waveform are calculated. A 4-class Support Vector Machine (SVM) classifier is constructed to separate these features into phonation mode classes. We automated the IAIF approach by computing the values of its input arguments – lip radiation and formant count – that lead to the best-performing SVM classifiers (average classification accuracy over 60%), yielding a physical model for the articulation of the vowels. We examine the steps needed to generalize and extend the experimental work presented in this paper in order to apply this method in ethnomusicological investigations.
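
The parameter automation described in the abstract can be read as a search over the two IAIF input arguments, keeping the pair that gives the most accurate classifier. The sketch below is only an illustration of that idea, not the authors' implementation: the function names, the grid values, and the toy scoring function are all assumptions. In a real experiment the `score` callable would run inverse filtering, extract the glottal flow parameters, train the 4-class SVM, and return its accuracy.

```python
from itertools import product
from typing import Callable, Iterable, Tuple


def select_iaif_parameters(
    lip_radiation_values: Iterable[float],
    formant_counts: Iterable[int],
    score: Callable[[float, int], float],
) -> Tuple[float, int, float]:
    """Return the (lip_radiation, formant_count, accuracy) triple with the highest score.

    `score` is a hypothetical callable standing in for the whole pipeline:
    inverse-filter the recordings with the given settings, extract the
    glottal flow parameters, train the classifier, and return its accuracy.
    """
    best = None
    # Exhaustively try every combination of the two IAIF input arguments.
    for rho, n_formants in product(lip_radiation_values, formant_counts):
        accuracy = score(rho, n_formants)
        if best is None or accuracy > best[2]:
            best = (rho, n_formants, accuracy)
    if best is None:
        raise ValueError("the parameter grid is empty")
    return best


def toy_score(rho: float, n_formants: int) -> float:
    """Made-up accuracy surface peaking at rho=0.98, n_formants=5 (illustration only)."""
    return 0.6 - abs(rho - 0.98) - 0.01 * abs(n_formants - 5)


if __name__ == "__main__":
    # Grid values are arbitrary placeholders, not taken from the paper.
    rho, n_formants, accuracy = select_iaif_parameters(
        [0.95, 0.97, 0.98, 0.99], range(3, 8), toy_score
    )
    print(rho, n_formants, round(accuracy, 3))
```

An exhaustive grid is the simplest choice when only two arguments are tuned; whether the search is done per vowel, per recording, or globally is left open here because the abstract does not specify it.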
