The MTG-Jamendo Dataset for Automatic Music Tagging


Machine Learning for Music Discovery Workshop at the International Conference on Machine Learning (ICML 2019)




  • Dmitry Bogdanov
  • Minz Won
  • Philip Tovstogan
  • Alastair Porter
  • Xavier Serra


We present the MTG-Jamendo Dataset, a new open dataset for music auto-tagging. It is built using music available at Jamendo under Creative Commons licenses and tags provided by content uploaders. The dataset contains over 55,000 full audio tracks with 195 tags from genre, instrument, and mood/theme categories. We provide elaborated data splits for researchers and report the performance of a simple baseline approach on five different sets of tags: genre, instrument, mood/theme, top-50, and overall.
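
As a rough illustration of how the five tag sets relate to one another, the sketch below derives per-category, most-frequent, and overall vocabularies from a toy annotation table and encodes one track as a multi-hot target vector. It is not the dataset's own tooling: the `category---tag` string convention, the track ids, the tag values, and the helper names `build_tag_sets` and `multi_hot` are all assumptions made for this example, and `top_k` is set to 2 only because the toy data is tiny.

```python
from collections import Counter

# Hypothetical toy annotations: track id -> tags written as "category---tag".
# The separator and every value below are illustrative assumptions.
TRACK_TAGS = {
    "track_1": ["genre---rock", "instrument---guitar", "mood/theme---energetic"],
    "track_2": ["genre---rock", "instrument---piano"],
    "track_3": ["genre---ambient", "mood/theme---calm", "instrument---piano"],
}


def build_tag_sets(track_tags, top_k=50):
    """Group tags into per-category, most-frequent, and overall vocabularies."""
    counts = Counter(tag for tags in track_tags.values() for tag in tags)
    overall = sorted(counts)
    tag_sets = {"overall": overall}
    for category in ("genre", "instrument", "mood/theme"):
        tag_sets[category] = [t for t in overall if t.split("---")[0] == category]
    # Rank by descending frequency, breaking ties alphabetically.
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    tag_sets["top"] = ranked[:top_k]
    return tag_sets


def multi_hot(tags, vocabulary):
    """Encode a track's tags as a 0/1 vector aligned with the vocabulary order."""
    present = set(tags)
    return [1 if tag in present else 0 for tag in vocabulary]


tag_sets = build_tag_sets(TRACK_TAGS, top_k=2)
genre_target = multi_hot(TRACK_TAGS["track_2"], tag_sets["genre"])
```

Under these assumptions, each tag set simply yields a different target vector for the same track, so one multi-label classifier design can be evaluated on each of the five sets in turn.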