Essentia 2.1-beta6-dev Documentation

What is Essentia?

Essentia is an open-source C++ library with Python and JavaScript bindings for audio analysis and audio-based music information retrieval. It is released under the Affero GPLv3 license and is also available under a proprietary license upon request. The library contains an extensive collection of reusable algorithms that implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large variety of spectral, temporal, tonal, and high-level music descriptors. In addition, Essentia can be complemented with a wrapper for inference with TensorFlow models and with Gaia, a C++ library with Python bindings that enables similarity search and classification based on audio analysis results (the same license terms apply). Both libraries can be used from within Essentia for high-level music description. We provide examples of pre-trained TensorFlow models and Gaia SVM classification models out of the box.

Essentia is not a framework, but rather a collection of algorithms (plus some infrastructure) wrapped in a library. It is designed with a focus on the robustness, performance, and optimality of the provided algorithms, including computational speed and memory usage, as well as on ease of use. The flow of the analysis is decided and implemented by the user, while Essentia takes care of the implementation details of the algorithms used. There is also a special streaming mode, in which algorithms are connected and then run automatically (similarly to PureData or Max/MSP) instead of having their order of execution specified explicitly; its advantages are less boilerplate code and lower memory consumption. We provide some examples with the library, but they should not be considered the only correct way of doing things. Many of Essentia's algorithms are well suited for real-time applications.
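The difference between deciding the flow explicitly and connecting processing steps can be illustrated without the library itself. The sketch below is plain Python (standard library only) and does not use Essentia's API: frame_cutter, hann_window, and rms are hypothetical stand-ins for the kinds of algorithms Essentia provides, and the input is a synthetic sine wave. The first part calls each step by hand, frame by frame; the second chains lazy iterators so that frames flow through the connected steps, which is only a loose analogy for streaming mode.

```python
import math

def frame_cutter(signal, frame_size, hop_size):
    """Yield consecutive frames; a trailing partial frame is dropped."""
    for start in range(0, len(signal) - frame_size + 1, hop_size):
        yield signal[start:start + frame_size]

def hann_window(frame):
    """Multiply one frame by a symmetric Hann window."""
    n = len(frame)
    return [x * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]

def rms(frame):
    """Root mean square of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

# Synthetic input (an assumption for this sketch): 1 s of a 440 Hz sine at 8 kHz.
sample_rate = 8000
signal = [math.sin(2.0 * math.pi * 440.0 * t / sample_rate)
          for t in range(sample_rate)]

# Explicit flow: the caller decides the order of every step.
rms_standard = []
for frame in frame_cutter(signal, 512, 256):
    windowed = hann_window(frame)
    rms_standard.append(rms(windowed))

# Connected flow: steps are wired together once, then data is pulled through.
frames = frame_cutter(signal, 512, 256)
windowed_frames = map(hann_window, frames)
rms_streaming = list(map(rms, windowed_frames))
```

Both lists hold the same per-frame values; only the way the steps are wired together differs. In the connected version, only one frame needs to be held at a time until the final list is built, which hints at why a streaming design can reduce memory consumption.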

The provided functionality is easily expandable and allows for both research experiments and the development of large-scale industrial applications. Essentia has served in a large number of research activities conducted at the Music Technology Group since 2006. It has been used for music classification, semantic auto-tagging, music similarity and recommendation, visualization and interaction with music, sound indexing, musical instrument detection, cover detection, beat detection, and acoustic analysis of stimuli for neuroimaging studies. A list of highlighted academic publications using Essentia is available under "Academic research using Essentia". Essentia and Gaia have also been used extensively in many research projects and industrial applications.

Currently, the following algorithms are included (among others):

  • Audio file input/output: ability to read and write nearly all audio file formats (wav, mp3, ogg, flac, etc.)
  • Standard signal processing blocks: FFT, DCT, frame cutter, windowing, envelope, smoothing
  • Filters (FIR & IIR): low/high/band pass, band reject, DC removal, equal loudness
  • Statistical descriptors: median, mean, variance, power means, raw and central moments, spread, kurtosis, skewness, flatness
  • Time-domain descriptors: duration, loudness, LARM, Leq, Vickers' loudness, zero-crossing-rate, log attack time and other signal envelope descriptors
  • Spectral descriptors: Bark/Mel/ERB bands, MFCC, GFCC, LPC, spectral peaks, complexity, roll-off, contrast, HFC, inharmonicity and dissonance
  • Tonal descriptors: pitch salience function, predominant melody and pitch, HPCP (chroma) related features, chords, key and scale, tuning frequency
  • Rhythm descriptors: beat detection, BPM, onset detection, rhythm transform, beat loudness
  • Other high-level descriptors: danceability, dynamic complexity, audio segmentation, SVM classifier, TensorFlow wrapper for inference
  • Machine learning models: inference with SVM classifiers and TensorFlow models
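Several of the statistical descriptors in this list have compact textbook definitions. As an illustration only, the following standard-library Python computes central moments, skewness, kurtosis, and flatness (the ratio of the geometric mean to the arithmetic mean) for small arrays. The function names are hypothetical and the formulas are generic definitions; Essentia's own algorithms may use different normalizations or conventions, so the algorithm reference remains the authority on their exact behavior.

```python
import math

def central_moment(values, order):
    """Mean of (x - mean)^order over the array."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness(values):
    """Third central moment normalized by variance^(3/2)."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """Fourth central moment normalized by variance^2 (no -3 offset)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

def flatness(values):
    """Geometric mean divided by arithmetic mean; values must be > 0."""
    geometric = math.exp(sum(math.log(v) for v in values) / len(values))
    arithmetic = sum(values) / len(values)
    return geometric / arithmetic

# Made-up example data for this sketch.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
flat_spectrum = [1.0] * 8                # equal energy in every bin
peaky_spectrum = [0.01] * 7 + [10.0]     # energy concentrated in one bin

variance = central_moment(symmetric, 2)
skew = skewness(symmetric)
kurt = kurtosis(symmetric)
flat_value = flatness(flat_spectrum)
peaky_value = flatness(peaky_spectrum)
```

A flat array gives a flatness of 1, while an array dominated by a single value gives a flatness close to 0, which is why this ratio is commonly used to distinguish noise-like from tone-like spectra.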

The library is cross-platform: it supports Linux and Mac OS X, with partial support for Windows, iOS, and Android. It can also be cross-compiled to JavaScript for use on the web, and we provide a dedicated wrapper library, Essentia.js.

The library includes Python bindings (Linux and Mac OS X) and various predefined command-line extractors for music descriptors (Linux, Mac OS X, and Windows), which facilitate fast prototyping and make it possible to set up research experiments very rapidly. Furthermore, it includes a Vamp plugin (Linux and Mac OS X) for visualization with Sonic Visualiser. There are also several third-party extensions that allow Essentia to be used within PureData, Max/MSP, openFrameworks, and Matlab.

To get a quick taste of Essentia, see some of our interactive demos and Python examples, and check our news blog.

Crediting Essentia

Please credit your use of Essentia properly! If you use the Essentia library in your software, please acknowledge it and specify its origin as http://essentia.upf.edu. If you publish research that uses Essentia, cite both the Essentia paper [1] and the specific references mentioned in the documentation of the algorithms used. We would also be very grateful if you let us know how you use Essentia by sending an email to mtg@upf.edu.

[1] Bogdanov, D., Wack, N., Gómez, E., Gulati, S., Herrera, P., Mayor, O., et al. (2013). ESSENTIA: an Audio Analysis Library for Music Information Retrieval. International Society for Music Information Retrieval Conference (ISMIR'13), 493-498.

Contents

Contents and search
For a complete overview of the documentation
Algorithm reference
The detailed documentation for all the algorithms
Doxygen C++ documentation
The documentation for the base classes used in Essentia
Essentia TensorFlow models
An overview of the pre-trained machine learning models available in Essentia

Getting started

Introduction
An introduction to Essentia's main concepts
Building and installing Essentia
Instructions to get Essentia running on your computer
Algorithms overview
A quick description of the main algorithms
Python tutorial for beginners
A hands-on introduction to Essentia
Python examples
Examples of using Essentia in Python
Using extractors out of the box
Quick results with no programming required
Music extractor
Command-line feature extractor
Using machine learning models
Inference with pre-trained TensorFlow models and Gaia SVM classifiers
Web applications with Essentia.js
Using Essentia in the browser
Frequently asked questions
Various tips on how to build and use Essentia

Using Essentia

Design overview
An explanation of Essentia's basic types and classes
Using standard mode
Learn how to write a "standard" extractor
Using streaming mode
Learn how to write a "streaming" extractor
Streaming mode architecture
A description of how the "streaming" mode works

Extending Essentia

Standard algorithms
How to write new "standard" algorithms for Essentia
Streaming algorithms
How to write new "streaming" algorithms for Essentia
AlgorithmComposite algorithms
The inner workings of the composite algorithms
Streaming network execution
How Essentia's streaming scheduler works
Coding guidelines
Good practices to follow when developing new algorithms
Contribute
How to help us with the development of Essentia

Applications and licensing

Interactive demos
Web demos using Essentia
Academic research using Essentia
A selection of academic studies using Essentia, organized by research topic
Industrial applications
Companies and projects using Essentia
Licensing Essentia
Using Essentia in commercial applications