Datasets

The three most frequently used datasets are MediaEval (Soleymani et al., 2013), DEAM (Aljanaki et al., 2017), and AMG1608 (Chen et al., 2015). These datasets represent Western pop music, are moderate in size (744 to 1,802 music excerpts), and have been manually annotated by a relatively large number of participants (experts, students, or crowdsourced workers). Two of the most popular datasets offer a large number of features (260 to 6,669) extracted with OpenSMILE (Eyben et al., 2010). Looking at the datasets more broadly, the diversity in their sizes and features is notable. Only two feature extraction tools are used across multiple datasets: OpenSMILE (Eyben et al., 2010) and MIR Toolbox (Lartillot & Toiviainen, 2007). Despite this diversity, there does not appear to be a direct link between model success rates and the features themselves; at the least, separating the contribution of the features from the variation created by dataset size, annotation accuracy, and genre is not possible.
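Since OpenSMILE feature sets figure so prominently in these datasets, a brief illustration may help. The sketch below uses the official `opensmile` Python wrapper to extract ComParE-style functionals from a single excerpt; the file name is a placeholder, and the wrapper ships the 2016 revision of the ComParE set, which yields the same 6,373 functionals as the ComParE 2013 baseline listed for PMEmo below.

```python
# Minimal sketch: extracting ComParE-style functionals with the
# official opensmile Python wrapper (pip install opensmile).
# "excerpt.wav" is a placeholder file name.
import opensmile

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.ComParE_2016,    # 6,373 functionals
    feature_level=opensmile.FeatureLevel.Functionals,
)

features = smile.process_file("excerpt.wav")  # one-row pandas DataFrame
print(features.shape)  # (1, 6373)
```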

| Dataset | Stim. Type | Stim. Dur. (s) | Stim. N | Feature N | Ppt. N | Feature Source | In studies |
|---|---|---|---|---|---|---|---|
| MediaEval | Western pop | 45 | 744 | 6669 | 10/track | OpenSMILE | Bai et al. (2016); Bai et al. (2017); Yang (2021); Chin et al. (2018); Coutinho & Schuller (2017); Markov & Matsui (2014); Medina et al. (2020); Wang, Wang, et al. (2022); Xie et al. (2020) |
| DEAM | Pop | 45 | 1802 | 260 | 5–10/track | OpenSMILE | Sorussa et al. (2020); Orjesek et al. (2022); Panwar et al. (2019); M. Zhang et al. (2023) |
| AMG1608 | Pop | 30 | 1608 | 72 | 643 | MIR Toolbox, YAAFE | Chen et al. (2017); X. Hu & Yang (2017); Wang, Wei, et al. (2022) |
| EMOPIA | Piano solo (pop music) | 30–40 | 387 | 24 | 1 annot./track | MIDI Toolbox | Bhuvana Kumar & Kathiravan (2023) |
| NTUMIR | Famous pop songs | 25 | 60 | 46 | 40 annot./track | MIR Toolbox, Sound Description Toolbox, MA Toolbox | Chin et al. (2018) |
| Soundtracks | Obscure film soundtracks | 15 | 110 | NA | 116 | NA | Wang, Wang, et al. (2022) |
| PSIC3839 | Chinese popular | 180 | 3839 | NA | 87 | Librosa | Xu et al. (2021) |
| CH818 | Chinese pop | 30 | 818 | 15 | 3 | MIR Toolbox, PsySound, Chroma Toolbox, Tempogram Toolbox | X. Hu & Yang (2017) |
| Zhang et al. (2015) | Chinese pop | 30 | 171 | 84 | 10 | MA Toolbox, MIR Toolbox, Coversongs | J. Zhang et al. (2016) |
| PMEmo | Pop songs | Variable | 794 | 6373 | 457 | ComParE 2013 baseline feature set | M. Zhang et al. (2023) |
| NJU-V1 | Limited detail | Variable | 777 | Not reported | NA (tags) | NA | Agarwal & Om (2021) |
| ISMIR-2012 | Popular music | 30 or 60 | 2904 | 54 | NA (tags) | MIR Toolbox | Agarwal & Om (2021) |
| MIREX2009 | Popular | Full | 297 | 3 | NA | Paulus & Klapuri (2009) | Yeh et al. (2014) |
| Million Song Dataset | Pop | Full | 1,000,000 | 55 | None | EchoNest | Cao & Park (2023) |
| Free Music Archive | Various | Variable | >100,000 | NA | NA | NA | Koh et al. (2023) |
| Jamendo | Various | Variable | 10,000 | 24 | NA | Metadata | Xiao Hu et al. (2022) |
| Chinese Classical Music Dataset | Chinese classical | ~30 | 500 | 557 | 20 | Essentia, MIR Toolbox | Wang, Wang, et al. (2022) |

Notes: † Used in Álvarez et al. (2023)
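When comparing models across these corpora, a common first step is joining each dataset's valence/arousal annotations to its feature matrix on the excerpt identifier. A minimal sketch follows, with hypothetical file names and column labels, since each corpus ships its annotations in its own layout.

```python
# Hypothetical join of per-excerpt annotations and features;
# actual file names and column labels differ per dataset.
import pandas as pd

annotations = pd.read_csv("annotations.csv")  # e.g. columns: song_id, valence, arousal
features = pd.read_csv("features.csv")        # e.g. song_id plus one column per feature

merged = annotations.merge(features, on="song_id", how="inner")
print(merged.shape)
```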


References

Agarwal, G., & Om, H. (2021). An efficient supervised framework for music mood recognition using autoencoder-based optimised support vector regression model. IET Signal Processing, 15(2), 98–121. https://doi.org/10.1049/sil2.12015
Aljanaki, A., Yang, Y.-H., & Soleymani, M. (2017). Developing a benchmark for emotional analysis of music. PLOS ONE, 12(3), e0173392.
Álvarez, P., Quirós, J. G. de, & Baldassarri, S. (2023). RIADA: A machine-learning based infrastructure for recognising the emotions of Spotify songs. International Journal of Interactive Multimedia and Artificial Intelligence, 8(2), 168–181. https://doi.org/10.9781/ijimai.2022.04.002
Bai, J., Feng, L., Peng, J., Shi, J., Luo, K., Li, Z., Liao, L., & Wang, Y. (2016). Dimensional music emotion recognition by machine learning. International Journal of Cognitive Informatics and Natural Intelligence, 10(4), 74–89. https://doi.org/10.4018/IJCINI.2016100104
Bai, J., Luo, K., Peng, J., Shi, J., Wu, Y., Feng, L., Li, J., & Wang, Y. (2017). Music emotions recognition by machine learning with cognitive classification methodologies. International Journal of Cognitive Informatics and Natural Intelligence, 11(4), 80–92. https://doi.org/10.4018/IJCINI.2017100105
Bhuvana Kumar, V., & Kathiravan, M. (2023). Emotion recognition from MIDI musical file using enhanced residual gated recurrent unit architecture. Frontiers in Computer Science, 5. https://doi.org/10.3389/fcomp.2023.1305413
Cao, Y., & Park, J. (2023). The analysis of music emotion and visualization fusing long short-term memory networks under the internet of things. IEEE Access, 11, 141192–141204. https://doi.org/10.1109/ACCESS.2023.3341926
Chen, Y.-A., Wang, J.-C., Yang, Y.-H., & Chen, H. H. (2017). Component tying for mixture model adaptation in personalization of music emotion recognition. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(7), 1409–1420. https://doi.org/10.1109/TASLP.2017.2693565
Chen, Y.-A., Yang, Y.-H., Wang, J.-C., & Chen, H. (2015). The AMG1608 dataset for music emotion recognition. 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 693–697.
Chin, Y.-H., Wang, J.-C., Wang, J.-C., & Yang, Y.-H. (2018). Predicting the probability density function of music emotion using emotion space mapping. IEEE Transactions on Affective Computing, 9(4), 541–549. https://doi.org/10.1109/TAFFC.2016.2628794
Coutinho, E., & Schuller, B. (2017). Shared acoustic codes underlie emotional communication in music and speech: Evidence from deep transfer learning. PLOS ONE, 12(6). https://doi.org/10.1371/journal.pone.0179289
Eyben, F., Wöllmer, M., & Schuller, B. (2010). openSMILE: The Munich versatile and fast open-source audio feature extractor. Proceedings of the 18th ACM International Conference on Multimedia, 1459–1462.
Hu, Xiao, Li, F., & Liu, R. (2022). Detecting music-induced emotion based on acoustic analysis and physiological sensing: A multimodal approach. Applied Sciences, 12(18). https://doi.org/10.3390/app12189354
Hu, X., & Yang, Y.-H. (2017). Cross-dataset and cross-cultural music mood prediction: A case on Western and Chinese pop songs. IEEE Transactions on Affective Computing, 8(2), 228–240. https://doi.org/10.1109/TAFFC.2016.2523503
Koh, E. Y., Cheuk, K. W., Heung, K. Y., Agres, K. R., & Herremans, D. (2023). MERP: A music dataset with emotion ratings and raters’ profile information. Sensors, 23(1). https://doi.org/10.3390/s23010382
Lartillot, O., & Toiviainen, P. (2007). A Matlab toolbox for musical feature extraction from audio. International Conference on Digital Audio Effects, 237–244.
Markov, K., & Matsui, T. (2014). Music genre and emotion recognition using Gaussian processes. IEEE Access, 2, 688–697. https://doi.org/10.1109/ACCESS.2014.2333095
Medina, Y. O., Beltran, J. R., & Baldassarri, S. (2020). Emotional classification of music using neural networks with the MediaEval dataset. Personal and Ubiquitous Computing. https://doi.org/10.1007/s00779-020-01393-4
Orjesek, R., Jarina, R., & Chmulik, M. (2022). End-to-end music emotion variation detection using iteratively reconstructed deep features. Multimedia Tools and Applications, 81(4), 5017–5031. https://doi.org/10.1007/s11042-021-11584-7
Panwar, S., Rad, P., Choo, K.-K. R., & Roopaei, M. (2019). Are you emotional or depressed? Learning about your emotional state from your music using machine learning. Journal of Supercomputing, 75(6), 2986–3009. https://doi.org/10.1007/s11227-018-2499-y
Soleymani, M., Caro, M. N., Schmidt, E. M., Sha, C.-Y., & Yang, Y.-H. (2013). 1000 songs for emotional analysis of music. Proceedings of the 2nd ACM International Workshop on Crowdsourcing for Multimedia, 1–6. https://doi.org/10.1145/2506364.2506365
Sorussa, K., Choksuriwong, A., & Karnjanadecha, M. (2020). Emotion classification system for digital music with a cascaded technique. ECTI Transactions on Computer and Information Technology, 14(1), 53–66. https://doi.org/10.37936/ecti-cit.2020141.205317
Wang, X., Wang, L., & Xie, L. (2022). Comparison and analysis of acoustic features of Western and Chinese classical music emotion recognition based on V-A model. Applied Sciences, 12(12). https://doi.org/10.3390/app12125787
Wang, X., Wei, Y., & Yang, D. (2022). Cross-cultural analysis of the correlation between musical elements and emotion. Cognitive Computation and Systems, 4(2), 116–129. https://doi.org/10.1049/ccs2.12032
Xie, B., Kim, J. C., & Park, C. H. (2020). Musical emotion recognition with spectral feature extraction based on a sinusoidal model with model-based and deep-learning approaches. Applied Sciences, 10(3). https://doi.org/10.3390/app10030902
Xu, L., Sun, Z., Wen, X., Huang, Z., Chao, C., & Xu, L. (2021). Using machine learning analysis to interpret the relationship between music emotion and lyric features. PeerJ Computer Science, 7. https://doi.org/10.7717/peerj-cs.785
Yang, J. (2021). A novel music emotion recognition model using neural network technology. Frontiers in Psychology, 12. https://doi.org/10.3389/fpsyg.2021.760060
Yeh, C.-H., Tseng, W.-Y., Chen, C.-Y., Lin, Y.-D., Tsai, Y.-R., Bi, H.-I., Lin, Y.-C., & Lin, H.-Y. (2014). Popular music representation: Chorus detection & emotion recognition. Multimedia Tools and Applications, 73(3), 2103–2128. https://doi.org/10.1007/s11042-013-1687-2
Zhang, J., Huang, X., Yang, L., & Nie, L. (2016). Bridge the semantic gap between pop music acoustic feature and emotion: Build an interpretable model. Neurocomputing, 208, 333–341. https://doi.org/10.1016/j.neucom.2016.01.099
Zhang, M., Zhu, Y., Zhang, W., Zhu, Y., & Feng, T. (2023). Modularized composite attention network for continuous music emotion recognition. Multimedia Tools and Applications, 82(5), 7319–7341. https://doi.org/10.1007/s11042-022-13577-6