PitchCREPE¶
standard mode | Pitch category
Inputs¶
audio
(vector_real) - the input audio signal sampled at 16000 Hz
Outputs¶
time
(vector_real) - the timestamps on which the pitch was estimated
frequency
(vector_real) - the predicted pitch values in Hz
confidence
(vector_real) - the confidence of voice activity, between 0 and 1
activations
(vector_vector_real) - the raw activation matrix
Parameters¶
batchSize
(integer ∈ [-1, ∞), default = 64) :the batch size for prediction. This allows parallelization when a GPU is available. Set it to -1 or 0 to accumulate all the patches and run a single TensorFlow session at the end
graphFilename
(string, default = “”) :the name of the file from which to load the TensorFlow graph
hopSize
(real ∈ (0, ∞), default = 10) :the hop size in milliseconds for running pitch estimation
input
(string, default = frames) :the name of the input node in the TensorFlow graph
output
(string, default = model/classifier/Sigmoid) :the name of the node from which to retrieve the output tensors
savedModel
(string, default = “”) :the name of the TensorFlow SavedModel. Overrides parameter graphFilename
Description¶
This algorithm estimates the pitch of monophonic audio signals using CREPE models.
This algorithm is a wrapper that post-processes the activations generated by TensorflowPredictCREPE. time contains the timestamps at which the pitch was estimated. frequency is the vector of pitch estimates in Hz. confidence expresses the confidence in the presence of pitch for each timestamp as a value between 0 and 1. activations is the time-by-sigmoid-activations matrix returned by the neural network.
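As a sketch of how these outputs relate (this is illustrative post-processing, not part of the algorithm itself): each frame's timestamp follows directly from hopSize, and confidence can be used to mask out unvoiced frames. The helper names and the 0.5 threshold below are assumptions chosen for the example:

```python
import numpy as np

def frame_timestamps(n_frames, hop_size_ms=10.0):
    """Timestamps in seconds for each frame, given the hop size in milliseconds
    (10 ms is the algorithm's default hopSize)."""
    return np.arange(n_frames) * hop_size_ms / 1000.0

def mask_unvoiced(frequency, confidence, threshold=0.5):
    """Zero out pitch estimates whose voicing confidence falls below a
    threshold (0.5 here is an arbitrary choice, not an algorithm default)."""
    frequency = np.asarray(frequency, dtype=float).copy()
    frequency[np.asarray(confidence) < threshold] = 0.0
    return frequency

times = frame_timestamps(4)  # [0.0, 0.01, 0.02, 0.03] with the default 10 ms hop
voiced = mask_unvoiced([440.0, 442.0, 431.0, 438.0], [0.9, 0.2, 0.8, 0.1])
```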
See TensorflowPredictCREPE for details about the rest of the parameters. The recommended pipeline is as follows:
MonoLoader(sampleRate=16000) >> PitchCREPE()
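In Python, a standard-mode version of the pipeline above might look like the following sketch. The model filename (crepe-full-1.pb) and the audio path are assumptions for illustration; replace them with a model downloaded from the supported models page and your own file:

```python
from essentia.standard import MonoLoader, PitchCREPE

# Load the audio resampled to the 16 kHz rate that CREPE models require.
audio = MonoLoader(filename='song.wav', sampleRate=16000)()

# Instantiate the algorithm with a CREPE graph file; the filename here is
# an assumption -- see https://essentia.upf.edu/models/ for available models.
pitch_extractor = PitchCREPE(graphFilename='crepe-full-1.pb')

time, frequency, confidence, activations = pitch_extractor(audio)
```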
Notes: This algorithm does not perform any checks on the input model, so it is the user's responsibility to make sure it is a valid one. The required sample rate of the input signal is 16 kHz. Other sample rates will lead to incorrect behavior.
References:
CREPE: A Convolutional Representation for Pitch Estimation. Jong Wook Kim, Justin Salamon, Peter Li, Juan Pablo Bello. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2018.
Original models and code at https://github.com/marl/crepe/
Supported models at https://essentia.upf.edu/models/
Source code¶
See also¶
MonoLoader (standard) MonoLoader (streaming) PitchCREPE (streaming) TensorflowPredict (standard) TensorflowPredict (streaming) TensorflowPredictCREPE (standard) TensorflowPredictCREPE (streaming)