OnsetDetection

streaming mode | Rhythm category

Inputs

  • spectrum (vector_real) - the input spectrum
  • phase (vector_real) - the phase vector corresponding to this spectrum (used only by the "complex" method)

Outputs

  • onsetDetection (real) - the value of the detection function in the current frame

Parameters

  • method (string ∈ {hfc, complex, complex_phase, flux, melflux, rms}, default = hfc) :
    the method used for onset detection
  • sampleRate (real ∈ (0, ∞), default = 44100) :
    the sampling rate of the audio signal [Hz]

Description

This algorithm computes various onset detection functions. The output of this algorithm should be post-processed in order to determine whether the frame contains an onset or not. Namely, it could be fed to the Onsets algorithm. It is recommended that the input "spectrum" is generated by the Spectrum algorithm. Four methods are available:

  • 'HFC', the High Frequency Content detection function which accurately detects percussive events (see HFC algorithm for details).
  • 'complex', the Complex-Domain spectral difference function [1] taking into account changes in magnitude and phase. It emphasizes note onsets either as a result of significant change in energy in the magnitude spectrum, and/or a deviation from the expected phase values in the phase spectrum, caused by a change in pitch.
  • 'complex_phase', the simplified Complex-Domain spectral difference function [2] taking into account phase changes, weighted by magnitude. TODO:It reacts better on tonal sounds such as bowed string, but tends to over-detect percussive events.
  • 'flux', the Spectral Flux detection function which characterizes changes in magnitude spectrum. See Flux algorithm for details.
  • 'melflux', the spectral difference function, similar to spectral flux, but using half-rectified energy changes in Mel-frequency bands of the spectrum [3].
  • 'rms', the difference function, measuring the half-rectified change of the RMS of the magnitude spectrum (i.e., measuring overall energy flux) [4].

If using the 'HFC' detection function, make sure to adhere to HFC's input requirements when providing an input spectrum. Input vectors of different size or empty input spectra will raise exceptions. If using the 'complex' detection function, suggested parameters for computation of "spectrum" and "phase" are 44100Hz sample rate, frame size of 1024 and hopSize of 512 samples, which results in a resolution of 11.6ms, and a Hann window.

References:

[1] Bello, Juan P., Chris Duxbury, Mike Davies, and Mark Sandler, On the use of phase and energy for musical onset detection in the complex domain, Signal Processing Letters, IEEE 11, no. 6 (2004): 553-556.

[2] P. Brossier, J. P. Bello, and M. D. Plumbley, "Fast labelling of notes in music signals," in International Symposium on Music Information Retrieval (ISMIR’04), 2004, pp. 331–336.

[3] D. P. W. Ellis, "Beat Tracking by Dynamic Programming," Journal of New Music Research, vol. 36, no. 1, pp. 51–60, 2007.

[4] J. Laroche, "Efficient Tempo and Beat Tracking in Audio Recordings," JAES, vol. 51, no. 4, pp. 226–233, 2003.

See also

Flux (standard) Flux (streaming) HFC (standard) HFC (streaming) OnsetDetection (standard) Onsets (standard) Onsets (streaming) RMS (standard) RMS (streaming) Spectrum (standard) Spectrum (streaming)

Streaming algorithms

AfterMaxToBeforeMaxEnergyRatio | AllPass | AudioLoader | AudioOnsetsMarker | AudioWriter | AutoCorrelation | BFCC | BPF | BandPass | BandReject | BarkBands | BarkExtractor | BeatTrackerDegara | BeatTrackerMultiFeature | Beatogram | BeatsLoudness | BinaryOperator | BinaryOperatorStream | BpmHistogram | BpmHistogramDescriptors | BpmRubato | CartesianToPolar | CentralMoments | Centroid | ChordsDescriptors | ChordsDetection | ChromaCrossSimilarity | Chromagram | Chromaprinter | ClickDetector | Clipper | ConstantQ | CoverSongSimilarity | Crest | CrossCorrelation | CubicSpline | DCRemoval | DCT | Danceability | Decrease | Derivative | DerivativeSFX | DiscontinuityDetector | Dissonance | DistributionShape | Duration | DynamicComplexity | ERBBands | EasyLoader | EffectiveDuration | Energy | EnergyBand | EnergyBandRatio | Entropy | Envelope | EqloudLoader | EqualLoudness | FFT | FFTC | FadeDetection | FalseStereoDetector | FileOutput | Flatness | FlatnessDB | FlatnessSFX | Flux | FrameCutter | FrameToReal | FrequencyBands | GFCC | GapsDetector | GeometricMean | HFC | HPCP | HarmonicBpm | HarmonicMask | HarmonicModelAnal | HarmonicPeaks | HighPass | HighResolutionFeatures | Histogram | HprModelAnal | HpsModelAnal | HumDetector | IDCT | IFFT | IFFTC | IIR | Inharmonicity | InstantPower | Key | KeyExtractor | LPC | Larm | Leq | LevelExtractor | LogAttackTime | LogSpectrum | LoopBpmConfidence | LoopBpmEstimator | Loudness | LoudnessEBUR128 | LoudnessEBUR128Filter | LoudnessVickers | LowLevelSpectralEqloudExtractor | LowLevelSpectralExtractor | LowPass | MFCC | Magnitude | MaxFilter | MaxMagFreq | MaxToTotal | Mean | Median | MedianFilter | MelBands | MetadataReader | Meter | MinMax | MinToTotal | MonoLoader | MonoMixer | MonoWriter | MovingAverage | MultiPitchMelodia | Multiplexer | NNLSChroma | NSGConstantQ | NSGConstantQStreaming | NSGIConstantQ | NoiseAdder | NoiseBurstDetector | NoveltyCurve | OddToEvenHarmonicEnergyRatio | OnsetDetection | OnsetDetectionGlobal | OnsetRate | Onsets | OverlapAdd | Panning | PeakDetection | PercivalBpmEstimator | PercivalEnhanceHarmonics | PercivalEvaluatePulseTrains | PitchCREPE | PitchContours | PitchContoursMelody | PitchContoursMonoMelody | PitchContoursMultiMelody | PitchFilter | PitchMelodia | PitchSalience | PitchSalienceFunction | PitchSalienceFunctionPeaks | PitchYin | PitchYinFFT | PitchYinProbabilistic | PitchYinProbabilities | PitchYinProbabilitiesHMM | PolarToCartesian | PoolAggregator | PoolToTensor | PowerMean | PowerSpectrum | PredominantPitchMelodia | RMS | RawMoments | RealAccumulator | ReplayGain | Resample | ResampleFFT | RhythmDescriptors | RhythmExtractor | RhythmExtractor2013 | RhythmTransform | RollOff | SBic | SNR | SaturationDetector | Scale | SilenceRate | SineModelAnal | SineModelSynth | SineSubtraction | SingleBeatLoudness | SingleGaussian | Slicer | SpectralCentroidTime | SpectralComplexity | SpectralContrast | SpectralPeaks | SpectralWhitening | Spectrum | SpectrumCQ | SpectrumToCent | Spline | SprModelAnal | SprModelSynth | SpsModelAnal | SpsModelSynth | StartStopCut | StartStopSilence | StereoDemuxer | StereoMuxer | StereoTrimmer | StochasticModelAnal | StochasticModelSynth | StrongDecay | StrongPeak | SuperFluxExtractor | SuperFluxNovelty | SuperFluxPeaks | TCToTotal | TempoCNN | TempoScaleBands | TempoTap | TempoTapDegara | TempoTapMaxAgreement | TempoTapTicks | TensorNormalize | TensorToPool | TensorToVectorReal | TensorTranspose | TensorflowInputFSDSINet | TensorflowInputMusiCNN | TensorflowInputTempoCNN | TensorflowInputVGGish | TensorflowPredict | TensorflowPredict2D | TensorflowPredictCREPE | TensorflowPredictEffnetDiscogs | TensorflowPredictFSDSINet | TensorflowPredictMAEST | TensorflowPredictMusiCNN | TensorflowPredictTempoCNN | TensorflowPredictVGGish | TonalExtractor | TriangularBands | TriangularBarkBands | Trimmer | Tristimulus | TruePeakDetector | TuningFrequency | TuningFrequencyExtractor | UnaryOperator | UnaryOperatorStream | Variance | VectorInput | VectorRealAccumulator | VectorRealToTensor | Vibrato | Viterbi | WarpedAutoCorrelation | Welch | Windowing | ZeroCrossingRate