Impact of Frame Size and Instrumentation on Chroma-Based Automatic Chord Recognition


Data Science, Learning by Latent Structures, and Knowledge Discovery, pp. 411–421


  • ISBN: 978-3-662-44983-7


  • Daniel Stoller
  • Matthias Mauch
  • Igor Vatolkin
  • Claus Weihs


This paper presents a comparative study of classification performance in automatic audio chord recognition based on three chroma feature implementations, with the aim of distinguishing the effects of frame size, instrumentation, and choice of chroma feature. Until recently, research in automatic chord recognition has focused on the development of complete systems. While results have improved remarkably, the sources of error remain poorly understood. To isolate sources of chord recognition error, we create a corpus of artificial instrument mixtures and investigate (a) the influence of different chroma frame sizes and (b) the impact of instrumentation and pitch height. We show that recognition performance is significantly affected not only by the method used, but also by the nature of the audio input. We compare these results to those obtained from a corpus of more than 200 real-world pop songs by The Beatles and other artists for the case in which chord boundaries are known in advance.
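To make the "chord boundaries are known in advance" setting concrete, the following is a minimal Python sketch of segment-level chord classification from chroma frames. It is not the method used in the paper: the binary major/minor triad templates, the cosine-similarity matching, and the names `triad_templates` and `classify_segments` are all assumptions chosen for illustration, and the chroma matrix is assumed to have already been computed by some chroma feature implementation.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def triad_templates():
    """Build unit-norm binary templates for the 24 major and minor triads.

    Hypothetical helper for illustration; the paper's classifiers may differ.
    """
    labels, rows = [], []
    for root in range(12):
        # Major triads use a major third (4 semitones), minor triads a minor third (3).
        for suffix, third in (("", 4), ("m", 3)):
            template = np.zeros(12)
            template[[root, (root + third) % 12, (root + 7) % 12]] = 1.0
            labels.append(NOTE_NAMES[root] + suffix)
            rows.append(template / np.linalg.norm(template))
    return labels, np.array(rows)


def classify_segments(chroma, boundaries):
    """Assign one chord label to each segment with known boundaries.

    chroma:     array of shape (n_frames, 12), one chroma vector per frame.
    boundaries: list of (start_frame, end_frame) pairs, end exclusive.
    Returns a list of labels; "N" marks segments with no chroma energy.
    """
    labels, templates = triad_templates()
    result = []
    for start, end in boundaries:
        # Average all frames in the segment into a single chroma vector.
        mean_chroma = chroma[start:end].mean(axis=0)
        norm = np.linalg.norm(mean_chroma)
        if norm == 0:
            result.append("N")
            continue
        # Cosine similarity against every template; pick the best match.
        scores = templates @ (mean_chroma / norm)
        result.append(labels[int(np.argmax(scores))])
    return result
```

In a sketch like this, the chroma frame size determines how many frames fall inside each segment before averaging, which is one place where the effect of frame size could show up.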
