TensorflowPredictTempoCNN

streaming mode | Machine Learning category

Inputs

signal (real) - the input audio signal sampled at 11025 Hz

Outputs

predictions (vector_real) - the output values from the model node named by the output parameter

Parameters

batchSize (integer ∈ [-1, ∞), default = 64) :
number of patches to process in parallel. Use -1 or 0 to accumulate all the patches and run a single TensorFlow session at the end of the stream.

graphFilename (string, default = “”) :
the name of the file from which to load the TensorFlow graph

input (string, default = input) :
the name of the input node in the TensorFlow graph

lastPatchMode (string ∈ {discard, repeat}, default = discard) :
what to do with the last frames: repeat them to fill the last patch or discard them

output (string, default = output) :
the name of the node from which to retrieve the output tensors

patchHopSize (integer ∈ [0, ∞), default = 128) :
the number of frames between the beginnings of adjacent patches. Use 0 to avoid overlap between patches

savedModel (string, default = “”) :
the name of the TensorFlow SavedModel. Overrides parameter graphFilename

Description

This algorithm makes predictions using TempoCNN-based models.

Internally, it uses TensorflowInputTempoCNN for the input feature extraction (mel bands). It feeds the model with patches of 256 mel-band frames and advances by a constant number of frames, set by patchHopSize, between consecutive patches.
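The patching described above can be illustrated with a small, library-free sketch. It is not Essentia's implementation: the function name split_patches is hypothetical, the patch size of 256 and hop of 128 are taken from this page, and the handling of the trailing frames in repeat mode (cycling the leftover frames until the patch is full) is an assumption about how lastPatchMode behaves.

```python
def split_patches(frames, patch_size=256, hop=128, last_patch_mode="discard"):
    """Cut a sequence of frames into fixed-size patches (illustrative only)."""
    # A hop of 0 is read here as "no overlap", i.e. hop equals the patch size.
    hop = hop or patch_size
    patches = []
    start = 0
    # Emit every patch that fits entirely inside the frame sequence.
    while start + patch_size <= len(frames):
        patches.append(frames[start:start + patch_size])
        start += hop
    # Frames from the next patch start onward do not fill a complete patch.
    tail = frames[start:]
    if last_patch_mode == "repeat" and tail:
        # Assumption: the leftover frames are cycled until the patch is full.
        repeats = -(-patch_size // len(tail))  # ceiling division
        patches.append((tail * repeats)[:patch_size])
    return patches


frames = list(range(600))  # stand-in for 600 mel-band frames
print(len(split_patches(frames)))                            # discard mode
print(len(split_patches(frames, last_patch_mode="repeat")))  # repeat mode
```

With 600 frames and the default hop, three complete patches start at frames 0, 128 and 256; under the stated assumption, repeat mode adds a fourth patch built from the remaining 216 frames.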

With batchSize set to -1 or 0, the patches are accumulated and a single TensorFlow session is run at the end of the stream. This makes it possible to exploit parallelization when GPUs are available, but it can exhaust memory on long files.
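The effect of batchSize on how patches are grouped can be sketched as follows. This is only an illustration of the semantics described here, not the library's code: the helper name group_into_batches is hypothetical, and processing a final partial batch at the end of the stream is an assumption.

```python
def group_into_batches(patches, batch_size=64):
    """Group patches into the batches that would each be sent to the model."""
    if batch_size <= 0:
        # -1 or 0: accumulate everything and run once at the end of the stream.
        return [patches] if patches else []
    # Otherwise run the model every batch_size patches.
    # Assumption: a final, smaller batch is still processed.
    return [patches[i:i + batch_size] for i in range(0, len(patches), batch_size)]


patches = list(range(150))  # stand-in for 150 patches
print([len(b) for b in group_into_batches(patches)])      # several small runs
print([len(b) for b in group_into_batches(patches, -1)])  # one large run
```

The single large batch keeps every patch in memory until the stream ends, which is why long files can become costly in that mode.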

The recommended pipeline is as follows:

MonoLoader(sampleRate=11025) >> TensorflowPredictTempoCNN
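In Python, the pipeline above might be wired up as in the following sketch. It assumes an Essentia build with TensorFlow support and a compatible model file; the function name predict_tempo_cnn and its arguments are hypothetical, and the connector names (audio, signal, predictions) follow the inputs and outputs listed on this page and on MonoLoader.

```python
def predict_tempo_cnn(audio_path, graph_path, batch_size=64):
    """Run a TempoCNN model over a file and return the raw predictions."""
    # Imports are deferred so this sketch can be loaded without Essentia.
    from essentia import Pool, run
    from essentia.streaming import MonoLoader, TensorflowPredictTempoCNN

    # The model expects audio sampled at 11025 Hz.
    loader = MonoLoader(filename=audio_path, sampleRate=11025)
    model = TensorflowPredictTempoCNN(graphFilename=graph_path,
                                      batchSize=batch_size)
    pool = Pool()

    # Connect the network: loader -> model -> pool.
    loader.audio >> model.signal
    model.predictions >> (pool, "predictions")

    run(loader)
    return pool["predictions"]
```

Each row of the returned array holds the model output for one patch; how those values are turned into a tempo estimate depends on the model used.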

Note: This algorithm does not validate the input model, so it is the user’s responsibility to make sure it is a valid one.

References:

Hendrik Schreiber, Meinard Müller, A Single-Step Approach to Musical Tempo Estimation Using a Convolutional Neural Network, Proceedings of the 19th International Society for Music Information Retrieval Conference (ISMIR), Paris, France, Sept. 2018.
Hendrik Schreiber, Meinard Müller, Musical Tempo and Key Estimation using Convolutional Neural Networks with Directional Filters, Proceedings of the Sound and Music Computing Conference (SMC), Málaga, Spain, 2019.
Original models and code at https://github.com/hendriks73/tempo-cnn
Supported models at https://essentia.upf.edu/models/

Source code

C++ source code

C++ header file