TempoCNN

streaming mode | Rhythm category

Inputs

  • audio (vector_real) - the input audio signal sampled at 11025 Hz

Outputs

  • globalTempo (real) - the overall tempo estimation in BPM
  • localTempo (vector_real) - the patch-wise tempo estimations in BPM
  • localTempoProbabilities (vector_real) - the patch-wise tempo probabilities

Parameters

  • aggregationMethod (string ∈ {majority, mean, median}, default = majority) :
    method used to estimate the global tempo.
  • batchSize (integer ∈ [-1, ∞), default = 64) :
    number of patches to process in parallel. Use -1 or 0 to accumulate all the patches and run a single TensorFlow session at the end of the stream.
  • graphFilename (string, default = "") :
    the name of the file from which to load the TensorFlow graph
  • input (string, default = input) :
    the name of the input node in the TensorFlow graph
  • lastPatchMode (string ∈ {discard, repeat}, default = discard) :
    what to do with the last frames: repeat them to fill the last patch or discard them
  • output (string, default = output) :
    the name of the node from which to retrieve the tempo bin activations
  • patchHopSize (integer ∈ [0, ∞), default = 128) :
    the number of frames between the beginnings of adjacent patches. Use 0 to avoid overlap
  • savedModel (string, default = "") :
    the name of the TensorFlow SavedModel. Overrides parameter graphFilename
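To give a feel for what patchHopSize means in seconds, the arithmetic can be sketched as below. This is only an illustration: the frame hop of 512 samples and the patch size of 256 frames are assumptions that are not stated on this page (see TensorflowPredictTempoCNN for the actual values), and patch_hop_seconds is a hypothetical helper, not part of the library.

```python
SAMPLE_RATE = 11025  # Hz, the input sample rate given in the Inputs section
FRAME_HOP = 512      # samples between frames (assumption, not stated on this page)
PATCH_SIZE = 256     # frames per patch (assumption, not stated on this page)


def patch_hop_seconds(patch_hop_size=128, patch_size=PATCH_SIZE):
    """Return the time in seconds between the starts of adjacent patches."""
    # patchHopSize = 0 means no overlap, so the hop equals a full patch.
    hop_frames = patch_hop_size if patch_hop_size > 0 else patch_size
    return hop_frames * FRAME_HOP / SAMPLE_RATE


print(round(patch_hop_seconds(), 2))   # default patchHopSize=128 -> 5.94
print(round(patch_hop_seconds(0), 2))  # patchHopSize=0 (no overlap) -> 11.89
```

Under these assumptions the default setting yields one local estimate roughly every 6 seconds, with consecutive patches overlapping by half.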

Description

This algorithm estimates tempo using TempoCNN-based models.

Internally, this algorithm is a wrapper that aggregates the predictions generated by TensorflowPredictTempoCNN. localTempo is a vector containing the most likely BPM, estimated every ~6 seconds by default. localTempoProbabilities contains the probabilities attached to these tempo estimations and can be used as a confidence measure. globalTempo is an aggregation of localTempo computed with the method selected by aggregationMethod. We strongly recommend majority voting when the input audio is assumed to have a constant tempo.
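The three aggregation options can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: aggregate_tempo is a hypothetical helper, the sample values are made up for illustration, and the tie-breaking rule for majority (first value encountered) is a detail of this sketch only.

```python
from collections import Counter
from statistics import mean, median


def aggregate_tempo(local_tempo, method="majority"):
    """Collapse patch-wise BPM estimates into a single global value."""
    if not local_tempo:
        raise ValueError("local_tempo must not be empty")
    if method == "majority":
        # Most frequent estimate; ties go to the value seen first.
        return Counter(local_tempo).most_common(1)[0][0]
    if method == "mean":
        return mean(local_tempo)
    if method == "median":
        return median(local_tempo)
    raise ValueError(f"unknown aggregation method: {method}")


# Made-up estimates for a constant-tempo track with one half-tempo outlier.
local_tempo = [120.0, 120.0, 60.0, 120.0, 121.0]

print(aggregate_tempo(local_tempo, "majority"))  # 120.0
print(aggregate_tempo(local_tempo, "mean"))      # 108.2
print(aggregate_tempo(local_tempo, "median"))    # 120.0
```

In this example a single half-tempo patch pulls the mean well away from the dominant estimate, while majority voting is unaffected, which is consistent with the recommendation for constant-tempo input.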

See TensorflowPredictTempoCNN for details about the rest of the parameters. The recommended pipeline is as follows:

MonoLoader(sampleRate=11025) >> TempoCNN
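In Python, the pipeline above might be wired up roughly as follows. This is a sketch under assumptions: it presumes an Essentia build with TensorFlow support whose Python bindings expose the streaming TempoCNN with the input and output names listed on this page. The function name, the pool keys, and both file paths are hypothetical and chosen only for illustration.

```python
def estimate_tempo(audio_path, graph_path):
    """Run MonoLoader >> TempoCNN in streaming mode and collect the outputs."""
    # Imported lazily so this sketch can be loaded without Essentia installed.
    import essentia
    from essentia.streaming import MonoLoader, TempoCNN

    # TempoCNN expects audio sampled at 11025 Hz.
    loader = MonoLoader(filename=audio_path, sampleRate=11025)
    tempo_cnn = TempoCNN(graphFilename=graph_path)
    pool = essentia.Pool()

    # Connect the network: loader -> TempoCNN -> pool.
    loader.audio >> tempo_cnn.audio
    tempo_cnn.globalTempo >> (pool, "globalTempo")
    tempo_cnn.localTempo >> (pool, "localTempo")
    tempo_cnn.localTempoProbabilities >> (pool, "localTempoProbabilities")

    essentia.run(loader)
    return pool
```

A call such as estimate_tempo("song.wav", "model.pb") (both paths are placeholders) would return a pool holding the three outputs under the keys chosen above. The model file must be one of the supported TempoCNN models, since no validation is performed on it.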

Note: This algorithm does not validate the input model, so it is the user's responsibility to make sure it is a valid one.

References:

  1. Hendrik Schreiber, Meinard Müller, A Single-Step Approach to Musical Tempo Estimation Using a Convolutional Neural Network, Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), Paris, France, Sept. 2018.
  2. Hendrik Schreiber, Meinard Müller, Musical Tempo and Key Estimation using Convolutional Neural Networks with Directional Filters, Proceedings of the Sound and Music Computing Conference (SMC), Málaga, Spain, 2019.
  3. Original models and code at https://github.com/hendriks73/tempo-cnn
  4. Supported models at https://essentia.upf.edu/models/

See also

Key (standard) | Key (streaming) | MonoLoader (standard) | MonoLoader (streaming) | TempoCNN (standard) | TensorflowPredict (standard) | TensorflowPredict (streaming) | TensorflowPredictTempoCNN (standard) | TensorflowPredictTempoCNN (streaming)
