Academic research using Essentia

The list below highlights academic studies using Essentia, organized by research topic.

Music analysis datasets

  • A. Porter, D. Bogdanov, R. Kaye, R. Tsukanov, and X. Serra. AcousticBrainz: a community platform for gathering music information obtained from audio. In 16th International Society for Music Information Retrieval Conference (ISMIR‘15), pages 786-792, 2015.
  • Y. Bayle, P. Hanna, and M. Robine. SATIN: A Persistent Musical Database for Music Information Retrieval. In 15th International Workshop on Content-Based Multimedia Indexing (CBMI‘17), 2017.

Music classification

  • D. Bogdanov, A. Porter, J. Urbano, and H. Schreiber. The MediaEval 2017 AcousticBrainz Genre Task: Content-based Music Genre Recognition from Multiple Sources. In MediaEval 2017 Multimedia Benchmark Workshop (MediaEval‘17), 2017.
  • K. Koutini, A. Imenina, M. Dorfer, A. R. Gruber, and M. Schedl. MediaEval 2017 AcousticBrainz Genre Task: Multilayer Perceptron Approach. In MediaEval 2017 Multimedia Benchmark Workshop (MediaEval‘17), 2017.
  • D. Bogdanov, M. Haro, F. Fuhrmann, A. Xambó, E. Gómez, and P. Herrera. Semantic audio content-based music recommendation and visualization based on user preference examples. Information Processing & Management, 49(1):13–33, 2013.
  • N. Wack, E. Guaus, C. Laurier, R. Marxer, D. Bogdanov, J. Serrà, and P. Herrera. Music type groupers (MTG): generic music classification algorithms. In Music Information Retrieval Evaluation Exchange (MIREX’09), 2009.
  • N. Wack, C. Laurier, O. Meyers, R. Marxer, D. Bogdanov, J. Serrà, E. Gómez, and P. Herrera. Music classification using high-level models. In Music Information Retrieval Evaluation Exchange (MIREX’10), 2010.
  • C. Laurier. Automatic Classification of Musical Mood by Content-Based Analysis. PhD thesis, UPF, Barcelona, Spain, 2011.
  • C. Laurier, O. Meyers, J. Serrà, M. Blech, P. Herrera, and X. Serra. Indexing music by mood: design and integration of an automatic content-based annotator. Multimedia Tools and Applications, 48(1):161–184, 2009.
  • C. Johnson-Roberson. Content-Based Genre Classification and Sample Recognition Using Topic Models. Master Thesis, Brown University, Providence, USA, 2017.

Semantic autotagging

  • M. Sordo. Semantic Annotation of Music Collections: A Computational Approach. PhD thesis, UPF, Barcelona, Spain, 2012.
  • Y. Yang, D. Bogdanov, P. Herrera, and M. Sordo. Music Retagging Using Label Propagation and Robust Principal Component Analysis. In International World Wide Web Conference (WWW’12), International Workshop on Advances in Music Information Research (AdMIRe’12), 2012.

Music similarity and recommendation

  • D. Bogdanov. From music similarity to music recommendation: Computational approaches based on audio and metadata analysis. PhD thesis, UPF, Barcelona, Spain, 2013.
  • D. Bogdanov, M. Haro, F. Fuhrmann, A. Xambó, E. Gómez, and P. Herrera. Semantic audio content-based music recommendation and visualization based on user preference examples. Information Processing & Management, 49(1):13–33, 2013.
  • D. Bogdanov, J. Serrà, N. Wack, P. Herrera, and X. Serra. Unifying low-level and high-level music similarity measures. IEEE Transactions on Multimedia, 13(4):687–701, 2011.
  • O. Celma, P. Cano, and P. Herrera. Search Sounds: An audio crawler focused on weblogs. In 7th International Conference on Music Information Retrieval (ISMIR‘06), 2006.
  • J. Kaitila. A content-based music recommender system. Master Thesis, University of Tampere, Finland, 2017.
  • K. Yadati, C. Liem, M. Larson, and A. Hanjalic. On the Automatic Identification of Music for Common Activities. In 2017 ACM on International Conference on Multimedia Retrieval, pages 192-200, 2017.

Emotion detection

  • J. Grekow. Comparative Analysis of Musical Performances by Using Emotion Tracking. In book: Foundations of Intelligent Systems, pages 175-184, 2017.
  • J. Grekow. Audio features dedicated to the detection of arousal and valence in music recordings. In IEEE International Conference on INnovations in Intelligent SysTems and Applications (INISTA), pages 40-44, 2017.
  • J. Grekow. Music Emotion Maps in Arousal-Valence Space. In IFIP International Conference on Computer Information Systems and Industrial Management, pages 697-706, 2016.
  • J. Grekow. Audio Features Dedicated to the Detection of Four Basic Emotions. In IFIP International Conference on Computer Information Systems and Industrial Management, pages 583-591, 2015.
  • J. Grekow. Emotion Detection Using Feature Extraction Tools. In International Symposium on Methodologies for Intelligent Systems, pages 267-272, 2015.
  • T. Pellegrini, and V. Barriere. Time-continuous estimation of emotion in music with recurrent neural networks. In MediaEval 2015 Multimedia Benchmark Workshop (MediaEval‘15), 2015.
  • A. Aljanaki, F. Wiering, and R. C. Veltkamp. MediaEval 2015: A Segmentation-based Approach to Continuous Emotion Tracking. In MediaEval 2015 Multimedia Benchmark Workshop (MediaEval‘15), 2015.

Visualization and interaction with music

  • J. H. P. Ono, F. Sikansi, D. C. Corrêa, F. V. Paulovich, A. Paiva, and L. G. Nonato. Concentric RadViz: visual exploration of multi-task classification. In 28th SIBGRAPI Conference on Graphics, Patterns and Images (SIBGRAPI), pages 165-172, 2015.
  • D. Bogdanov. From music similarity to music recommendation: Computational approaches based on audio and metadata analysis. PhD thesis, UPF, Barcelona, Spain, 2013.
  • D. Bogdanov, M. Haro, F. Fuhrmann, A. Xambó, E. Gómez, and P. Herrera. Semantic audio content-based music recommendation and visualization based on user preference examples. Information Processing & Management, 49(1):13–33, 2013.
  • E. Maestre, P. Papiotis, M. Marchini, Q. Llimona, and O. Mayor. Online Access and Visualization of Enriched Multimodal Representations of Music Performance Recordings: the Quartet Dataset and the Repovizz System. IEEE Multimedia, 24(1):24-34, 2017.
  • C. F. Julià and S. Jordà. SongExplorer: a tabletop application for exploring large collections of songs. In International Society for Music Information Retrieval Conference (ISMIR’09), 2009.
  • C. Laurier, M. Sordo, and P. Herrera. Mood cloud 2.0: Music mood browsing based on social networks. In International Society for Music Information Retrieval Conference (ISMIR’09), 2009.
  • O. Mayor, J. Llop, and E. Maestre. RepoVizz: A multimodal on-line database and browsing tool for music performance research. In International Society for Music Information Retrieval Conference (ISMIR’11), 2011.
  • M. Sordo, G. K. Koduri, S. Şentürk, S. Gulati, and X. Serra. A musically aware system for browsing and interacting with audio music collections. In The 2nd CompMusic Workshop, 2012.
  • A. Augello, E. Cipolla, I. Infantino, A. Manfre, G. Pilato, and F. Vella. Creative Robot Dance with Variational Encoder. In International Conference on Computational Creativity, 2017.
  • A. Augello, I. Infantino, A. Manfrè, G. Pilato, F. Vella, and A. Chella. Creation and cognition for humanoid live dancing. Robotics and Autonomous Systems, 86:128-137, 2016.
  • F. Kraemer, I. Rodriguez, O. Parra, T. Ruiz, and E. Lazkano. Minstrel robots: Body language expression through applause evaluation. In IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), pages 332-337, 2016.
  • O. Alemi, J. Françoise, and P. Pasquier. GrooveNet: Real-Time Music-Driven Dance Movement Generation using Artificial Neural Networks. In Workshop on Machine Learning for Creativity, 23rd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2017.
  • J. Buhmann, B. Moens, V. Lorenzoni, and M. Leman. Shifting the Musical Beat to Influence Running Cadence. In European Society for Cognitive Sciences Of Music (ESCOM‘17), 2017.
  • J. Buhmann. Effects of music-based biofeedback on walking and running. PhD Thesis, Ghent University, Belgium, 2017.

Sound indexing, music production, and intelligent audio processing

  • H. Ordiales and M. L. Bruno. Sound recycling from public databases. In 12th International Audio Mostly Conference on Augmented and Participatory Sound and Music Experiences (AM’17), 2017.
  • S. Parekh, F. Font, and X. Serra. Improving Audio Retrieval through Loudness Profile Categorization. In IEEE International Symposium on Multimedia (ISM), pages 565-568, 2016.
  • D. Moffat, D. Ronan, and J. D. Reiss. Unsupervised taxonomy of sound effects. In 20th International Conference on Digital Audio Effects (DAFx-17), 2017.
  • S. Böck. Event Detection in Musical Audio. PhD Thesis, Johannes Kepler University, Linz, Austria, 2016.
  • J. Shier, K. McNally and G. Tzanetakis. Sieve: A plugin for the automatic classification and intelligent browsing of kick and snare samples. In 3rd Workshop on Intelligent Music Production, 2017.
  • E. T. Chourdakis, and J. D. Reiss. A Machine-Learning Approach to Application of Intelligent Artificial Reverberation. Journal of the Audio Engineering Society, 65(1/2):56-65, 2017.
  • O. Campbell, C. Roads, A. Cabrera, M. Wright, and Y. Visell. ADEPT: A Framework for Adaptive Digital Audio Effects. In 2nd AES Workshop on Intelligent Music Production, 2016.
  • I. Jordal. Evolving artificial neural networks for cross-adaptive audio effects. Master Thesis, Norwegian University of Science and Technology, 2017.
  • C. Ó. Nuanáin, P. Herrera, and S. Jordà. Rhythmic Concatenative Synthesis for Electronic Music: Techniques, Implementation, and Evaluation. Computer Music Journal, 41(2):21-37, 2017.
  • C. Ó. Nuanáin, S. Jordà, and P. Herrera. An Interactive Software Instrument for Real-time Rhythmic Concatenative Synthesis. In New Interfaces for Musical Expression, 2016.
  • C. Ó. Nuanáin, M. Hermant, A. Faraldo, and E. Gómez. The Eear: Building a real-time MIR-based instrument from a hack. In 16th International Society for Music Information Retrieval Conference (ISMIR‘15), Late-Breaking/Demo Session, 2015.
  • J. B. Bonmati. DJ Codo Nudo: a novel method for seamless transition between songs for electronic music. Master Thesis, Universitat Pompeu Fabra, Barcelona, Spain, 2016.
  • F. Font, and X. Serra. Tempo Estimation for Music Loops and a Simple Confidence Measure. In 17th International Society for Music Information Retrieval Conference (ISMIR‘16), pages 269-275, 2016.
  • F. Font. Tag recommendation using folksonomy information for online sound sharing platforms. PhD Thesis. Universitat Pompeu Fabra, Barcelona, Spain, 2015.
  • G. Bandiera, O. Romani Picas, H. Tokuda, W. Hariya, K. Oishi, and X. Serra. Good-sounds.org: A Framework to Explore Goodness in Instrumental Sounds. In 17th International Society for Music Information Retrieval Conference (ISMIR‘16), pages 414-419, 2016.
  • O. Romani Picas, H. Parra Rodriguez, D. Dabiri, H. Tokuda, W. Hariya, K. Oishi, and X. Serra. A real-time system for measuring sound goodness in instrumental sounds. In Audio Engineering Society Convention 138, 2015.
  • K. Narang and P. Rao. Acoustic Features For Determining Goodness of Tabla Strokes. In 18th International Society for Music Information Retrieval Conference (ISMIR‘17), 2017.
  • Y. J. Luo, L. Su, Y. H. Yang, and T. S. Chi. Detection of Common Mistakes in Novice Violin Playing. In 16th International Society for Music Information Retrieval Conference (ISMIR‘15), pages 316-322, 2015.
  • J. Salamon, and J. P. Bello. Deep convolutional neural networks and data augmentation for environmental sound classification. IEEE Signal Processing Letters, 24(3):279-283, 2017.
  • J. Salamon, and J. P. Bello. Unsupervised feature learning for urban sound classification. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP‘15), pages 171-175, 2015.
  • J. Salamon, and J. P. Bello. Feature learning with deep scattering for urban sound analysis. In 23rd European Signal Processing Conference (EUSIPCO), pages 724-728, IEEE, 2015.
  • M. Haro, J. Serrà, P. Herrera, and A. Corral. Zipf’s law in short-time timbral codings of speech, music, and environmental sound signals. PLoS ONE, 7(3):e33993, 2012.
  • J. Janer, M. Haro, G. Roma, T. Fujishima, and N. Kojima. Sound object classification for symbolic audio mosaicing: A proof-of-concept. In Sound and Music Computing Conference (SMC’09), pages 297–302, 2009.
  • G. Roma, J. Janer, S. Kersten, M. Schirosa, P. Herrera, and X. Serra. Ecological acoustics perspective for content-based retrieval of environmental sounds. EURASIP Journal on Audio, Speech, and Music Processing, 2010.

Instrument detection

  • K. A. Pati, and A. Lerch. A Dataset and Method for Guitar Solo Detection in Rock Music. In 2017 AES International Conference on Semantic Audio, 2017.
  • F. Fuhrmann and P. Herrera. Quantifying the relevance of locally extracted information for musical instrument recognition from entire pieces of music. In International Society for Music Information Retrieval Conference (ISMIR’11), 2011.
  • F. Fuhrmann, P. Herrera, and X. Serra. Detecting solo phrases in music using spectral and pitch-related descriptors. Journal of New Music Research, 38(4):343–356, 2009.

Music segmentation

  • C. Bohak, and M. Marolt. Probabilistic segmentation of folk music recordings. Mathematical Problems in Engineering, 2016.
  • A. Aljanaki, F. Wiering, and R. C. Veltkamp. Emotion based segmentation of musical audio. In 16th Conference of the International Society for Music Information Retrieval (ISMIR‘15), pages 770-776, 2015.

Cover detection

  • C. J. Tralie. Early MFCC And HPCP Fusion for Robust Cover Song Identification. arXiv preprint arXiv:1707.04680, 2017.
  • J. Serrà, E. Gómez, P. Herrera, and X. Serra. Chroma binary similarity and local alignment applied to cover song identification. IEEE Transactions on Audio, Speech, and Language Processing, 16(6):1138–1151, 2008.

Key detection

  • Á. Faraldo, S. Jordà, and P. Herrera. A Multi-Profile Method for Key Estimation in EDM. In 2017 AES International Conference on Semantic Audio, 2017.

Music transcription

  • K. Ullrich and E. van der Wel. Music transcription with convolutional sequence-to-sequence models. In 18th International Society for Music Information Retrieval Conference (ISMIR‘17), 2017.

Computational musicology

  • C. C. Liem, and A. Hanjalic. Comparative analysis of orchestral performance recordings: An image-based approach. In 16th International Society for Music Information Retrieval Conference (ISMIR‘15), 2015.
  • R. C. Repetto, R. Gong, N. Kroher, and X. Serra. Comparison of the Singing Style of Two Jingju Schools. In 16th International Society for Music Information Retrieval Conference (ISMIR‘15), 2015.
  • A. Karakurt, S. Şentürk, and X. Serra. MORTY: A Toolbox for Mode Recognition and Tonic Identification. In Proceedings of the 3rd International workshop on Digital Libraries for Musicology, pages 9-16, 2016.
  • A. Haron. A step towards automatic identification of influence: Lick detection in a musical passage. In 15th International Society for Music Information Retrieval Conference (ISMIR‘14), Late-Breaking/Demo Session, 2014.

Melodic analysis

  • Y. P. Chen, L. Su, and Y. H. Yang. Electric Guitar Playing Technique Detection in Real-World Recording Based on F0 Sequence Pattern Recognition. In 16th International Society for Music Information Retrieval Conference (ISMIR‘15), pages 708-714, 2015.
  • N. Kroher, J. M. Díaz-Báñez, J. Mora, and E. Gómez. Corpus COFLA: a research corpus for the computational study of flamenco music. Journal on Computing and Cultural Heritage (JOCCH), 9(2):10, 2016.
  • S. Balke, J. Driedger, J. Abeßer, C. Dittmar, and M. Müller. Towards Evaluating Multiple Predominant Melody Annotations in Jazz Recordings. In 17th International Society for Music Information Retrieval Conference (ISMIR‘16), pages 246-252, 2016.
  • S. I. Giraldo. Computational modelling of expressive music performance in jazz guitar: a machine learning approach. PhD Thesis, Universitat Pompeu Fabra, Barcelona, Spain, 2016.
  • S. Giraldo, and R. Ramirez. Optimizing melodic extraction algorithm for jazz guitar recordings using genetic algorithms. In Joint Conference ICMC-SMC, pages 25-27, 2014.
  • R. C. Repetto, and X. Serra. Creating a Corpus of Jingju (Beijing Opera) Music and Possibilities for Melodic Analysis. In 15th International Society for Music Information Retrieval Conference (ISMIR‘14), pages 313-318, 2014.
  • S. Zhang, R. C. Repetto, and X. Serra. Study of the Similarity between Linguistic Tones and Melodic Pitch Contours in Beijing Opera Singing. In 15th International Society for Music Information Retrieval Conference (ISMIR‘14), pages 343-348, 2014.
  • B. Uyar, H. S. Atli, S. Şentürk, B. Bozkurt, and X. Serra. A corpus for computational research of Turkish makam music. In 1st International Workshop on Digital Libraries for Musicology, pages 1-7, ACM, 2014.
  • S. Şentürk, A. Holzapfel, and X. Serra. Linking scores and audio recordings in makam music of Turkey. Journal of New Music Research, 43(1):34-52, 2014.
  • S. Şentürk, S. Gulati, and X. Serra. Score Informed Tonic Identification for Makam Music of Turkey. In 14th International Society for Music Information Retrieval Conference (ISMIR‘13), pages 175-180, 2013.
  • K. K. Ganguli, S. Gulati, X. Serra, and P. Rao. Data-Driven Exploration of Melodic Structure in Hindustani Music. In 17th International Society for Music Information Retrieval Conference (ISMIR‘16), pages 605-611, 2016.
  • S. Gulati, J. Serrà, and X. Serra. Improving Melodic Similarity in Indian Art Music Using Culture-Specific Melodic Characteristics. In 16th International Society for Music Information Retrieval Conference (ISMIR‘15), pages 680-686, 2015.
  • S. Gulati, J. Serrà, and X. Serra. An evaluation of methodologies for melodic similarity in audio recordings of Indian art music. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP‘15), pages 678-682, 2015.
  • S. Gulati, J. Serrà, V. Ishwar, and X. Serra. Mining melodic patterns in large audio collections of Indian art music. In 10th International Conference on Signal-Image Technology and Internet-Based Systems (SITIS‘14), pages 264-271, IEEE, 2014.
  • S. Gulati, A. Bellur, J. Salamon, V. Ishwar, H. A. Murthy, and X. Serra. Automatic tonic identification in Indian art music: approaches and evaluation. Journal of New Music Research, 43(1):53-71, 2014.
  • G. K. Koduri, S. Gulati, P. Rao, and X. Serra. Raga recognition based on pitch distribution methods. Journal of New Music Research, 41(4):337–350, 2012.
  • G. K. Koduri, J. Serrà, and X. Serra. Characterization of intonation in carnatic music by parametrizing pitch histograms. In International Society for Music Information Retrieval Conference (ISMIR’12), pages 199–204, 2012.

Rhythmic analysis

  • A. Srinivasamurthy, and X. Serra. A supervised approach to hierarchical metrical cycle tracking from audio music recordings. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP‘14), pages 5217-5221, 2014.

Bioacoustic analysis

  • J. Salamon, J. P. Bello, A. Farnsworth, and S. Kelling. Fusing shallow and deep learning for bioacoustic bird species classification. In IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP‘17), pages 141-145, 2017.
  • J. Salamon, J. P. Bello, A. Farnsworth, M. Robbins, S. Keen, H. Klinck, and S. Kelling. Towards the automatic classification of avian flight calls for bioacoustic monitoring. PLoS ONE, 11(11):e0166866, 2016.
  • C. Lopez-Tello. Acoustic Detection, Source Separation, and Classification Algorithms for Unmanned Aerial Vehicles in Wildlife Monitoring and Poaching. Master Thesis, University of Nevada, Las Vegas, USA, 2016.

Acoustic analysis for medical and neuroimaging studies

  • S. Koelsch, S. Skouras, T. Fritz, P. Herrera, C. Bonhage, M. Kuessner, and A. M. Jacobs. Neural correlates of music-evoked fear and joy: The roles of auditory cortex and superficial amygdala. Neuroimage, 81:49-60, 2013.
  • E. Vaiciukynas, A. Verikas, A. Gelzinis, M. Bacauskiene, K. Vaskevicius, V. Uloza, E. Padervinskis, and J. Ciceliene. Fusing Various Audio Feature Sets for Detection of Parkinson’s Disease from Sustained Voice and Speech Recordings. In International Conference on Speech and Computer (SPECOM‘16), pages 328-337, 2016.
  • F. A. Araújo, F. L. Brasil, A. C. L. Santos, L. D. S. B. Junior, S. P. F. Dutra, and C. E. C. F. Batista. Auris System: Providing vibrotactile feedback for hearing impaired population. BioMed Research International, 2017, 2017.
  • M. A. Casey. Music of the 7Ts: Predicting and Decoding Multivoxel fMRI Responses with Acoustic, Schematic, and Categorical Music Features. Frontiers in psychology, 8, 2017.