Algorithms reference¶
Here is the complete list of algorithms that you can access from the Python interface.
The C++ interface provides the same algorithms, plus some templated ones that are therefore not available in Python.
Envelope/SFX¶
AfterMaxToBeforeMaxEnergyRatio¶
Computes the ratio between the pitch energy after the pitch maximum and the pitch energy before the pitch maximum
DerivativeSFX¶
Computes two descriptors that are based on the derivative of a signal envelope
Envelope¶
Computes the envelope of a signal by applying a non-symmetric lowpass filter on a signal
FlatnessSFX¶
Calculates the flatness coefficient of a signal envelope
LogAttackTime¶
Computes the log (base 10) of the attack time of a signal envelope
MaxToTotal¶
Computes the ratio between the index of the maximum value of the envelope of a signal and the total length of the envelope
MinToTotal¶
Computes the ratio between the index of the minimum value of the envelope of a signal and the total length of the envelope
StrongDecay¶
Computes the Strong Decay of an audio signal
TCToTotal¶
Calculates the ratio of the temporal centroid to the total length of a signal envelope
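As a rough illustration of the index-based ratios described above (MaxToTotal and MinToTotal), the following Python sketch computes them for a plain list of envelope values. The function names are hypothetical, and dividing by the envelope length is an assumption about the normalization; the library's own implementation may differ in detail.

```python
def max_to_total(envelope):
    """Ratio of the index of the envelope maximum to the envelope length."""
    if not envelope:
        raise ValueError("envelope must not be empty")
    # Index of the first occurrence of the maximum value.
    peak_index = max(range(len(envelope)), key=lambda i: envelope[i])
    return peak_index / len(envelope)


def min_to_total(envelope):
    """Ratio of the index of the envelope minimum to the envelope length."""
    if not envelope:
        raise ValueError("envelope must not be empty")
    # Index of the first occurrence of the minimum value.
    trough_index = min(range(len(envelope)), key=lambda i: envelope[i])
    return trough_index / len(envelope)
```

A value close to 0 means the extreme occurs near the start of the envelope, and a value close to 1 means it occurs near the end.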
Filters¶
AllPass¶
Implements an IIR all-pass filter of order 1 or 2
BandPass¶
Implements a 2nd order IIR band-pass filter
BandReject¶
Implements a 2nd order IIR band-reject filter
DCRemoval¶
Removes the DC offset from a signal using a 1st order IIR highpass filter
EqualLoudness¶
Implements an equal-loudness filter
HighPass¶
Implements a 1st order IIR high-pass filter
IIR¶
Implements a standard IIR filter
LowPass¶
Implements a 1st order IIR low-pass filter
MaxFilter¶
Implements a maximum filter for 1D signals using the van Herk/Gil-Werman (HGW) algorithm
MedianFilter¶
Computes the median-filtered version of the input signal for a given kernel size, as detailed in [1]
MovingAverage¶
Implements a FIR Moving Average filter
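To make the idea behind the MovingAverage entry concrete, here is a minimal pure-Python sketch of a causal FIR moving average. The function name, the `size` parameter, and the zero initial state are assumptions made for illustration; this is not the library's implementation.

```python
def moving_average(signal, size):
    """Causal FIR moving average: y[n] = (x[n] + ... + x[n-size+1]) / size.

    Samples before the start of the signal are treated as zero
    (an assumption about the initial filter state).
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    output = []
    running_sum = 0.0
    for n, sample in enumerate(signal):
        running_sum += sample
        if n >= size:
            # Drop the sample that just left the averaging window.
            running_sum -= signal[n - size]
        output.append(running_sum / size)
    return output
```

Because of the zero initial state, the first `size - 1` output samples ramp up before the filter reaches its steady response.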
Input/output¶
AudioLoader¶
Loads the single audio stream contained in a given audio or video file
AudioOnsetsMarker¶
Creates a wave file in which a given audio signal is mixed with a series of time onsets
AudioWriter¶
Encodes an input stereo signal into a stereo audio file
EasyLoader¶
Loads the raw audio data from an audio file, downmixes it to mono and normalizes using replayGain
EqloudLoader¶
Loads the raw audio data from an audio file, downmixes it to mono and normalizes using replayGain and equal-loudness filter
FileOutput¶
Stores alphanumeric data into text or binary files
MetadataReader¶
Loads the metadata tags from an audio file as well as outputs its audio properties
MonoLoader¶
Loads the raw audio data from an audio file and downmixes it to mono
MonoWriter¶
Writes a mono audio stream to a file
VectorInput¶
Can be used as the starting point of a streaming network
YamlInput¶
(standard)
Deserializes a file formatted in YAML to a Pool
YamlOutput¶
(standard)
Emits a YAML or JSON representation of a Pool
Standard¶
AutoCorrelation¶
Computes the autocorrelation vector of a signal
BPF¶
Implements a break point function which linearly interpolates between discrete xy-coordinates to construct a continuous function
BinaryOperator¶
Performs basic arithmetical operations element by element given two arrays
BinaryOperatorStream¶
Performs basic arithmetical operations element by element given two arrays
Clipper¶
Clips the input signal to fit its values into a specified interval
ConstantQ¶
Computes the Constant Q Transform using the FFT for fast calculation
CrossCorrelation¶
Computes the cross-correlation vector of two signals
CubicSpline¶
Computes the second derivatives of a piecewise cubic spline
DCT¶
Computes the Discrete Cosine Transform of an array
Derivative¶
Returns the first-order derivative of an input signal
FFT¶
Computes the positive complex short-term Fourier transform (STFT) of an array using the FFT algorithm
FFTC¶
Computes the complex short-term Fourier transform (STFT) of a complex array using the FFT algorithm
FrameBuffer¶
Buffers input non-overlapping audio frames into longer overlapping frames with a hop size equal to the input frame size
FrameCutter¶
Slices the input buffer into frames
FrameGenerator¶
(standard)
The FrameGenerator is a Python generator for the FrameCutter algorithm
FrameToReal¶
Converts a sequence of input audio signal frames into a sequence of audio samples
IDCT¶
Computes the Inverse Discrete Cosine Transform of an array
IFFT¶
Calculates the inverse short-term Fourier transform (STFT) of an array of complex values using the FFT algorithm
IFFTC¶
Calculates the inverse short-term Fourier transform (STFT) of an array of complex values using the FFT algorithm
MinMax¶
Calculates the minimum or maximum value of an array
MonoMixer¶
Downmixes the signal into a single channel given a stereo signal
Multiplexer¶
Returns a single vector from a given number of real values and/or frames
NSGConstantQ¶
Computes a constant Q transform using non-stationary Gabor frames and returns a complex time-frequency representation of the input vector
NSGConstantQStreaming¶
Computes a constant Q transform using non-stationary Gabor frames and returns a complex time-frequency representation of the input vector
NSGIConstantQ¶
Computes an inverse constant Q transform using non-stationary Gabor frames and reconstructs the signal from its complex time-frequency representation
NoiseAdder¶
Adds noise to an input signal
OverlapAdd¶
Returns the output of an overlap-add process for a sequence of frames of an audio signal
PeakDetection¶
Detects local maxima (peaks) in an array
PoolToTensor¶
Retrieves a tensor from a pool under a given namespace
RealAccumulator¶
Takes a stream of Real values and outputs them as a single vector when the end of the stream is reached
Resample¶
Resamples the input signal to the desired sampling rate
Scale¶
Scales the audio by the specified factor using clipping if required
Slicer¶
Splits an audio signal into segments given their start and end times
Spline¶
Evaluates a piecewise spline of type b, beta or quadratic
StereoDemuxer¶
Outputs left and right channel separately given a stereo signal
StereoMuxer¶
Outputs a stereo signal given left and right channel separately
StereoTrimmer¶
Extracts a segment of a stereo audio signal given its start and end times
TensorNormalize¶
Performs normalization over a tensor
TensorToPool¶
Inserts a tensor into a pool under a given namespace
TensorToVectorReal¶
Streams the frames of the input tensor along a given namespace
TensorTranspose¶
Performs transpositions over the axes of a tensor
Trimmer¶
Extracts a segment of an audio signal given its start and end times
UnaryOperator¶
Performs basic arithmetical operations element by element given an array
UnaryOperatorStream¶
Performs basic arithmetical operations element by element given an array
VectorRealAccumulator¶
Takes a stream of Real values and outputs them as a single vector when the end of the stream is reached
VectorRealToTensor¶
Generates tensors out of a stream of input frames
WarpedAutoCorrelation¶
Computes the warped auto-correlation of an audio signal
Welch¶
Estimates the power spectral density of the input signal using Welch’s method [1]
Windowing¶
Applies windowing to an audio signal
ZeroCrossingRate¶
Computes the zero-crossing rate of an audio signal
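The ZeroCrossingRate entry can be illustrated with a short pure-Python sketch. The function name is hypothetical, and both the sign convention (zero counts as non-negative) and the normalization by the number of samples are assumptions; the library's algorithm may, for instance, apply a threshold around zero.

```python
def zero_crossing_rate(signal):
    """Fraction of samples at which the signal changes sign."""
    if not signal:
        raise ValueError("signal must not be empty")
    crossings = sum(
        1
        for previous, current in zip(signal, signal[1:])
        # A crossing occurs when consecutive samples lie on opposite
        # sides of zero (zero itself is treated as non-negative).
        if (previous >= 0) != (current >= 0)
    )
    return crossings / len(signal)
```

Noisy or percussive signals tend to produce higher values than smooth, low-pitched ones.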
Spectral¶
BFCC¶
Computes the bark-frequency cepstrum coefficients of a spectrum
BarkBands¶
Computes energy in Bark bands of a spectrum
ERBBands¶
Computes energies/magnitudes in ERB bands of a spectrum
EnergyBand¶
Computes energy in a given frequency band of a spectrum including both start and stop cutoff frequencies
EnergyBandRatio¶
Computes the ratio of the spectral energy in the range [startFrequency, stopFrequency] over the total energy
FlatnessDB¶
Computes the flatness of an array, which is defined as the ratio between the geometric mean and the arithmetic mean converted to dB scale
Flux¶
Computes the spectral flux of a spectrum
FrequencyBands¶
Computes energy in rectangular frequency bands of a spectrum
GFCC¶
Computes the Gammatone-frequency cepstral coefficients of a spectrum
HFC¶
Computes the High Frequency Content of a spectrum
LPC¶
Computes Linear Predictive Coefficients and associated reflection coefficients of a signal
LogSpectrum¶
Computes spectrum with logarithmically distributed frequency bins
MFCC¶
Computes the mel-frequency cepstrum coefficients of a spectrum
MaxMagFreq¶
Computes the frequency with the largest magnitude in a spectrum
MelBands¶
Computes energy in mel bands of a spectrum
Panning¶
Characterizes panorama distribution by comparing spectra from the left and right channels
PowerSpectrum¶
Computes the power spectrum of an array of Reals
RollOff¶
Computes the roll-off frequency of a spectrum
SpectralCentroidTime¶
Computes the spectral centroid of a signal in time domain
SpectralComplexity¶
Computes the spectral complexity of a spectrum
SpectralContrast¶
Computes the Spectral Contrast feature of a spectrum
SpectralPeaks¶
Extracts peaks from a spectrum
SpectralWhitening¶
Performs spectral whitening of spectral peaks of a spectrum
Spectrum¶
Computes the magnitude spectrum of an array of Reals
SpectrumToCent¶
Computes energy in triangular frequency bands of a spectrum equally spaced on the cent scale
StrongPeak¶
Computes the Strong Peak of a spectrum
TensorflowInputFSDSINet¶
Computes mel bands from an audio frame with the specific parametrization required by the FSD-SINet models
TensorflowInputMusiCNN¶
Computes mel-bands specific to the input of MusiCNN-based models
TensorflowInputTempoCNN¶
Computes mel-bands specific to the input of TempoCNN-based models
TensorflowInputVGGish¶
Computes mel-bands specific to the input of VGGish-based models
TriangularBands¶
Computes energy in triangular frequency bands of a spectrum
TriangularBarkBands¶
Computes energy in the bark bands of a spectrum
Rhythm¶
BeatTrackerDegara¶
Estimates the beat positions given an input signal
BeatTrackerMultiFeature¶
Estimates the beat positions given an input signal
Beatogram¶
Filters the loudness matrix given by the BeatsLoudness algorithm in order to keep only the most salient beat band representation
BeatsLoudness¶
Computes the spectrum energy of beats in an audio signal given their positions
BpmHistogram¶
Analyzes predominant periodicities in a signal given its novelty curve [1] (see NoveltyCurve algorithm) or another onset detection function (see OnsetDetection and OnsetDetectionGlobal)
BpmHistogramDescriptors¶
Computes beats per minute histogram and its statistics for the highest and second highest peak
BpmRubato¶
Extracts the locations of large tempo changes from a list of beat ticks
Danceability¶
Estimates danceability of a given audio signal
HarmonicBpm¶
Extracts bpms that are harmonically related to the tempo given by the ‘bpm’ parameter
LoopBpmConfidence¶
Takes an audio signal and a BPM estimate for that signal and predicts the reliability of the BPM estimate in a value from 0 to 1
LoopBpmEstimator¶
Estimates the BPM of audio loops
Meter¶
Estimates the time signature of a given beatogram by finding the highest correlation between beats
NoveltyCurve¶
Computes the “novelty curve” (Grosche & Müller, 2009) onset detection function
NoveltyCurveFixedBpmEstimator¶
(standard)
Outputs a histogram of the most probable bpms assuming the signal has constant tempo given the novelty curve
OnsetDetection¶
Computes various onset detection functions
OnsetDetectionGlobal¶
Computes various onset detection functions
OnsetRate¶
Computes the number of onsets per second and their position in time for an audio signal
Onsets¶
Computes onset positions given various onset detection functions
PercivalBpmEstimator¶
Estimates the tempo in beats per minute (BPM) from an input signal as described in [1]
PercivalEnhanceHarmonics¶
Implements the ‘Enhance Harmonics’ step as described in [1]
PercivalEvaluatePulseTrains¶
Implements the ‘Evaluate Pulse Trains’ step as described in [1]
RhythmDescriptors¶
Computes rhythm features (bpm, beat positions, beat histogram peaks) for an audio signal
RhythmExtractor¶
Estimates the tempo in bpm and beat positions given an audio signal
RhythmExtractor2013¶
Extracts the beat positions and estimates their confidence as well as tempo in bpm for an audio signal
RhythmTransform¶
Implements the rhythm transform
SingleBeatLoudness¶
Computes the spectrum energy of a single beat across the whole frequency range and on each specified frequency band given an audio segment
SuperFluxExtractor¶
Detects onsets given an audio signal using the SuperFlux algorithm
SuperFluxNovelty¶
Computes the onset detection function for the SuperFlux algorithm
SuperFluxPeaks¶
Detects peaks of an onset detection function computed by the SuperFluxNovelty algorithm
TempoCNN¶
Estimates tempo using TempoCNN-based models
TempoScaleBands¶
Computes features for tempo tracking to be used with the TempoTap algorithm
TempoTap¶
Estimates the periods and phases of a periodic signal, represented by a sequence of values of any number of detection functions, such as energy bands, onset locations, etc.
TempoTapDegara¶
Estimates beat positions given an onset detection function
TempoTapMaxAgreement¶
Outputs beat positions and confidence of their estimation based on the maximum mutual agreement between beat candidates estimated by different beat trackers (or using different features)
TempoTapTicks¶
Builds the list of ticks from the period and phase candidates given by the TempoTap algorithm
Math¶
CartesianToPolar¶
Converts an array of complex numbers from cartesian to polar form
Magnitude¶
Computes the absolute value of each element in a vector of complex numbers
PolarToCartesian¶
Converts an array of complex numbers from polar to cartesian form
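The two conversions above are inverses of each other. A minimal sketch using Python's standard `cmath` module shows the round trip; the function names and the choice to return magnitudes and phases as two separate lists are assumptions made for illustration.

```python
import cmath


def cartesian_to_polar(values):
    """Split complex numbers into lists of magnitudes and phases (radians)."""
    magnitudes, phases = [], []
    for z in values:
        r, phi = cmath.polar(z)  # r = |z|, phi = phase in (-pi, pi]
        magnitudes.append(r)
        phases.append(phi)
    return magnitudes, phases


def polar_to_cartesian(magnitudes, phases):
    """Rebuild complex numbers from magnitudes and phases (radians)."""
    return [cmath.rect(r, phi) for r, phi in zip(magnitudes, phases)]
```

Converting to polar form and back should reproduce the input up to floating-point rounding.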
Statistics¶
CentralMoments¶
Extracts the 0th, 1st, 2nd, 3rd and 4th central moments of an array
Centroid¶
Computes the centroid of an array
Crest¶
Computes the crest of an array
Decrease¶
Computes the decrease of an array defined as the linear regression coefficient
DistributionShape¶
Computes the spread (variance), skewness and kurtosis of an array given its central moments
Energy¶
Computes the energy of an array
Entropy¶
Computes the Shannon entropy of an array
Flatness¶
Computes the flatness of an array, which is defined as the ratio between the geometric mean and the arithmetic mean
GeometricMean¶
Computes the geometric mean of an array of positive values
Histogram¶
Computes a histogram
InstantPower¶
Computes the instant power of an array
Mean¶
Computes the mean of an array
Median¶
Computes the median of an array
PoolAggregator¶
Performs statistical aggregation on a Pool and places the results of the aggregation into a new Pool
PowerMean¶
Computes the power mean of an array
RMS¶
Computes the root mean square (quadratic mean) of an array
RawMoments¶
Computes the first 5 raw moments of an array
SingleGaussian¶
Estimates the single gaussian distribution for a matrix of feature vectors
Variance¶
Computes the variance of an array
Viterbi¶
Estimates the most likely path using the Viterbi algorithm
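Two of the descriptors in this section have compact textbook definitions: RMS (the quadratic mean) and Flatness (the ratio of the geometric mean to the arithmetic mean). The sketch below follows those definitions in plain Python; the function names are hypothetical, and the library may handle edge cases such as zero or negative values differently.

```python
import math


def rms(values):
    """Root mean square (quadratic mean) of a non-empty sequence."""
    if not values:
        raise ValueError("values must not be empty")
    return math.sqrt(sum(v * v for v in values) / len(values))


def flatness(values):
    """Geometric mean divided by arithmetic mean, for positive values."""
    if not values:
        raise ValueError("values must not be empty")
    if any(v <= 0 for v in values):
        raise ValueError("flatness is defined here for positive values only")
    # Compute the geometric mean in the log domain to avoid overflow.
    geometric = math.exp(sum(math.log(v) for v in values) / len(values))
    arithmetic = sum(values) / len(values)
    return geometric / arithmetic
```

With this definition, flatness is 1 for a constant array and approaches 0 as the values become more uneven.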
Tonal¶
ChordsDescriptors¶
Given a chord progression this algorithm describes it by means of key, scale, histogram, and rate of change
ChordsDetection¶
Estimates chords given an input sequence of harmonic pitch class profiles (HPCPs)
ChordsDetectionBeats¶
(standard)
Estimates chords using pitch class profiles on segments between beats
Chromagram¶
Computes the Constant-Q chromagram using FFT
Dissonance¶
Computes the sensory dissonance of an audio signal given its spectral peaks
HPCP¶
Computes a Harmonic Pitch Class Profile (HPCP) from the spectral peaks of a signal
HarmonicPeaks¶
Finds the harmonic peaks of a signal given its spectral peaks and its fundamental frequency
HighResolutionFeatures¶
Computes high-resolution chroma features from an HPCP vector
Inharmonicity¶
Calculates the inharmonicity of a signal given its spectral peaks
Key¶
Computes a key estimate given a pitch class profile (HPCP)
KeyExtractor¶
Extracts key/scale for an audio signal
NNLSChroma¶
Extracts treble and bass chromagrams from a sequence of log-frequency spectrum frames
OddToEvenHarmonicEnergyRatio¶
Computes the ratio between a signal’s odd and even harmonic energy given the signal’s harmonic peaks
PitchSalience¶
Computes the pitch salience of a spectrum
SpectrumCQ¶
Computes the magnitude of the Constant-Q spectrum
TonalExtractor¶
Computes tonal features for an audio signal
TonicIndianArtMusic¶
(standard)
Estimates the tonic frequency of the lead artist in Indian art music
Tristimulus¶
Calculates the tristimulus of a signal given its harmonic peaks
TuningFrequency¶
Estimates the tuning frequency given a sequence/set of spectral peaks
TuningFrequencyExtractor¶
Extracts the tuning frequency of an audio signal
Music Similarity¶
ChromaCrossSimilarity¶
Computes a binary cross-similarity matrix from two chromagram feature vectors of a query and reference song
CoverSongSimilarity¶
Computes a cover song similarity measure from a binary cross-similarity matrix between two chroma vectors of a query and reference song, using various alignment constraints of the Smith-Waterman local-alignment algorithm
CrossSimilarityMatrix¶
(standard)
Computes a Euclidean cross-similarity matrix of two sequences of frame features
Fingerprinting¶
Chromaprinter¶
Computes the fingerprint of the input signal using the Chromaprint algorithm
Audio Problems¶
ClickDetector¶
Detects the locations of impulsive noises (clicks and pops) on the input audio frame
DiscontinuityDetector¶
Uses LPC and some heuristics to detect discontinuities in an audio signal
FalseStereoDetector¶
Detects if a stereo track has duplicated channels (false stereo)
GapsDetector¶
Uses energy and time thresholds to detect gaps in the waveform
HumDetector¶
Detects low frequency tonal noises in the audio signal
NoiseBurstDetector¶
Detects noise bursts in the waveform by thresholding the peaks of the second derivative
SNR¶
Computes the SNR of the input audio in a frame-wise manner
SaturationDetector¶
Outputs the starting/ending locations of the saturated regions in seconds
StartStopCut¶
Outputs whether there is a cut at the beginning or at the end of the audio by locating the first and last non-silent frames and comparing their positions to the actual beginning and end of the audio
TruePeakDetector¶
Implements a “true-peak” level meter for clipping detection
Duration/silence¶
Duration¶
Outputs the total duration of an audio signal
EffectiveDuration¶
Computes the effective duration of an envelope signal
FadeDetection¶
Detects the time positions of fade-ins and fade-outs in an audio signal given a sequence of RMS values
SilenceRate¶
Estimates if a frame is silent
StartStopSilence¶
Outputs the frame at which sound begins and the frame at which sound ends
Loudness/dynamics¶
DynamicComplexity¶
Computes the dynamic complexity defined as the average absolute deviation from the global loudness level estimate on the dB scale
Intensity¶
(standard)
Classifies the input audio signal as either relaxed (-1), moderate (0), or aggressive (1)
Larm¶
Estimates the long-term loudness of an audio signal
Leq¶
Computes the Equivalent sound level (Leq) of an audio signal
LevelExtractor¶
Extracts the loudness of an audio signal in frames using the Loudness algorithm
Loudness¶
Computes the loudness of an audio signal as defined by Stevens’ power law
LoudnessEBUR128¶
Computes the EBU R128 loudness descriptors of an audio signal
LoudnessEBUR128Filter¶
An auxiliary signal preprocessing algorithm used within the LoudnessEBUR128 algorithm
LoudnessVickers¶
Computes Vickers’s loudness of an audio signal
ReplayGain¶
Computes the Replay Gain loudness value of an audio signal
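For the Loudness entry, a power-law loudness estimate can be sketched as the signal energy raised to a fixed exponent. The function name and the default exponent of 0.67 are assumptions made for illustration, not values stated in this list; check the algorithm's own documentation before relying on them.

```python
def power_law_loudness(signal, exponent=0.67):
    """Loudness estimate: signal energy raised to a power-law exponent.

    The default exponent of 0.67 is an assumed value for illustration.
    """
    # Energy is taken here as the sum of squared sample values.
    energy = sum(sample * sample for sample in signal)
    return energy ** exponent
```

Because the exponent is below 1, doubling the signal energy increases this estimate by less than a factor of two.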
Extractors¶
BarkExtractor¶
Extracts some Bark bands based spectral features from an audio signal
Extractor¶
(standard)
Extracts all low-level, mid-level and high-level features from an audio signal and stores them in a pool
FreesoundExtractor¶
(standard)
Is a wrapper for Freesound Extractor
LowLevelSpectralEqloudExtractor¶
Extracts a set of low-level spectral features for which it is recommended to apply a preliminary equal-loudness filter over the input audio signal (according to internal evaluations conducted at the Music Technology Group)
LowLevelSpectralExtractor¶
Extracts all low-level spectral features, which do not require an equal-loudness filter for their computation, from an audio signal
MusicExtractor¶
(standard)
Is a wrapper for Music Extractor
MusicExtractorSVM¶
(standard)
Computes SVM predictions given a pool with aggregated descriptor values computed by MusicExtractor (or FreesoundExtractor)
Transformations¶
GaiaTransform¶
(standard)
Applies a given Gaia2 transformation history to a given pool
PCA¶
(standard)
Applies Principal Component Analysis based on the covariance matrix of the signal
Synthesis¶
HarmonicMask¶
Applies a spectral mask to remove a pitched source component from the signal
HarmonicModelAnal¶
Computes the harmonic model analysis
HprModelAnal¶
Computes the harmonic plus residual model analysis
HpsModelAnal¶
Computes the harmonic plus stochastic model analysis
ResampleFFT¶
Resamples a sequence using FFT/IFFT
SineModelAnal¶
Computes the sine model analysis
SineModelSynth¶
Computes the sine model synthesis from sine model analysis
SineSubtraction¶
Subtracts the sinusoids computed with the sine model analysis from an input audio signal
SprModelAnal¶
Computes the sinusoidal plus residual model analysis
SprModelSynth¶
Computes the sinusoidal plus residual model synthesis from SPR model analysis
SpsModelAnal¶
Computes the sinusoidal plus stochastic model analysis
SpsModelSynth¶
Computes the sinusoidal plus stochastic model synthesis from SPS model analysis
StochasticModelAnal¶
Computes the stochastic model analysis
StochasticModelSynth¶
Computes the stochastic model synthesis
Pitch¶
MultiPitchKlapuri¶
(standard)
Estimates multiple pitch values corresponding to the melodic lines present in a polyphonic music signal (for example, string quartet, piano)
MultiPitchMelodia¶
Estimates multiple fundamental frequency contours from an audio signal
PitchCREPE¶
Estimates pitch of monophonic audio signals using CREPE models
PitchContourSegmentation¶
(standard)
Converts a pitch sequence estimated from an audio signal into a set of discrete note events
PitchContours¶
Tracks a set of predominant pitch contours of an audio signal
PitchContoursMelody¶
Converts a set of pitch contours into a sequence of predominant f0 values in Hz by taking the value of the most predominant contour in each frame
PitchContoursMonoMelody¶
Converts a set of pitch contours into a sequence of f0 values in Hz by taking the value of the most salient contour in each frame
PitchContoursMultiMelody¶
Post-processes a set of pitch contours into a sequence of multiple f0 values in Hz
PitchFilter¶
Corrects the fundamental frequency estimations for a sequence of frames given pitch values together with their confidence values
PitchMelodia¶
Estimates the fundamental frequency corresponding to the melody of a monophonic music signal based on the MELODIA algorithm
PitchSalienceFunction¶
Computes the pitch salience function of a signal frame given its spectral peaks
PitchSalienceFunctionPeaks¶
Computes the peaks of a given pitch salience function
PitchYin¶
Estimates the fundamental frequency given the frame of a monophonic music signal
PitchYinFFT¶
Estimates the fundamental frequency given the spectrum of a monophonic music signal
PitchYinProbabilistic¶
Computes the pitch track of a mono audio signal using the probabilistic Yin algorithm
PitchYinProbabilities¶
Estimates the fundamental frequencies and their probabilities given a frame of a monophonic music signal
PitchYinProbabilitiesHMM¶
Estimates the smoothed fundamental frequency given the pitch candidates and probabilities using hidden Markov models
PredominantPitchMelodia¶
Estimates the fundamental frequency of the predominant melody from polyphonic music signals using the MELODIA algorithm
Vibrato¶
Detects the presence of vibrato and estimates its parameters given a pitch contour [Hz]
Segmentation¶
SBic¶
Segments audio using the Bayesian Information Criterion given a matrix of frame features
Machine Learning¶
TensorflowPredict¶
Runs a Tensorflow graph and stores the desired output tensors in a pool
TensorflowPredict2D¶
Makes predictions using models expecting 2D representations
TensorflowPredictCREPE¶
Generates activations of monophonic audio signals using CREPE models
TensorflowPredictEffnetDiscogs¶
Makes predictions using EffnetDiscogs-based models
TensorflowPredictFSDSINet¶
Makes predictions using FSD-SINet models
TensorflowPredictMAEST¶
Makes predictions using MAEST-based models
TensorflowPredictMusiCNN¶
Makes predictions using MusiCNN-based models
TensorflowPredictTempoCNN¶
Makes predictions using TempoCNN-based models
TensorflowPredictVGGish¶
Makes predictions using VGGish-based models