NNLSChroma

streaming mode | Tonal category

Inputs

  • logSpectrogram (vector_vector_real) - log spectrum frames
  • meanTuning (vector_real) - mean tuning frames
  • localTuning (vector_real) - local tuning frames

Outputs

  • tunedLogfreqSpectrum (vector_vector_real) - log-frequency spectrum after tuning
  • semitoneSpectrum (vector_vector_real) - a spectral representation with one bin per semitone
  • bassChromagram (vector_vector_real) - a 12-dimensional chromagram, restricted to the bass range
  • chromagram (vector_vector_real) - a 12-dimensional chromagram with mid-range emphasis

Parameters

  • chromaNormalization (string ∈ {none, maximum, L1, L2}, default = none) :
    determines whether or how the chromagrams are normalised
  • frameSize (integer ∈ (1, ∞), default = 1025) :
    the input frame size of the spectrum vector
  • sampleRate (real ∈ (0, ∞), default = 44100) :
    the input sample rate
  • spectralShape (real ∈ (0.5, 0.9), default = 0.7) :
    the shape of the notes in the NNLS dictionary
  • spectralWhitening (real ∈ [0, 1.0], default = 1) :
    determines how much the log-frequency spectrum is whitened
  • tuningMode (string ∈ {global, local}, default = global) :
    local uses a local average for tuning; global uses all audio frames. Local tuning is only advisable when the tuning is likely to change over the course of the audio
  • useNNLS (bool ∈ {true, false}, default = true) :
    toggle between NNLS approximate transcription and linear spectral mapping
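
The chromaNormalization options can be illustrated with a small pure-Python sketch. It is not the library's implementation: normalize_chroma is a hypothetical helper, the interpretation of each option (divide by the largest value, by the sum, or by the Euclidean norm) is the conventional one, and leaving an all-zero frame unchanged is an assumption.

```python
import math

def normalize_chroma(chroma, mode="none"):
    """Return a normalised copy of one chroma frame (hypothetical helper)."""
    if mode == "none":
        return list(chroma)
    if mode == "maximum":
        norm = max(abs(v) for v in chroma)        # largest magnitude
    elif mode == "L1":
        norm = sum(abs(v) for v in chroma)        # sum of magnitudes
    elif mode == "L2":
        norm = math.sqrt(sum(v * v for v in chroma))  # Euclidean norm
    else:
        raise ValueError(f"unknown chromaNormalization: {mode!r}")
    # Assumption: an all-zero (silent) frame is returned unchanged.
    if norm == 0:
        return list(chroma)
    return [v / norm for v in chroma]

frame = [3.0, 4.0] + [0.0] * 10   # toy 12-bin chroma frame
l2 = normalize_chroma(frame, "L2")
l1 = normalize_chroma(frame, "L1")
mx = normalize_chroma(frame, "maximum")
```

With maximum, the strongest pitch class becomes 1; with L1, the bins sum to 1; with L2, the frame has unit length, which is convenient when comparing frames by dot product.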

Description

This algorithm extracts treble and bass chromagrams from a sequence of log-frequency spectrum frames. On this representation, two processing steps are performed:

  • tuning, after which each centre bin (i.e. bin 2, 5, 8, ...) corresponds to a semitone, even if the tuning of the piece deviates from 440 Hz standard pitch.
  • running standardisation: subtraction of the running mean, division by the running standard deviation. This has a spectral whitening effect.
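
The running standardisation step can be sketched in a few lines of pure Python. This only illustrates the idea and is not the ported code: running_standardise is a hypothetical helper, and the rectangular window, its half_width, the truncation at the edges, and the choice to output 0 where the local deviation is 0 are all assumptions.

```python
import math

def running_standardise(frame, half_width=2):
    """Subtract a running mean and divide by a running standard deviation.

    The statistics are taken over a window of neighbouring bins centred on
    each bin (assumption: rectangular window, truncated at the edges).
    """
    out = []
    for i in range(len(frame)):
        lo = max(0, i - half_width)
        hi = min(len(frame), i + half_width + 1)
        window = frame[lo:hi]
        mean = sum(window) / len(window)
        std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
        # Assumption: flat regions (std == 0) map to 0 instead of dividing by zero.
        out.append((frame[i] - mean) / std if std > 0 else 0.0)
    return out

# A single spectral peak stands out after standardisation.
peak = running_standardise([0.0, 0.0, 10.0, 0.0, 0.0], half_width=1)
# A perfectly flat spectrum carries no local contrast.
flat = running_standardise([1.0] * 5, half_width=1)
```

Because each bin is measured against its local neighbourhood, broad spectral tilt is flattened while narrow peaks are kept, which is the whitening effect described above.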

This code is ported from NNLS Chroma [1, 2]. To achieve results similar to the original, use the following processing chain: frame slicing with sample rate = 44100, frame size = 16384, hop size = 2048 -> Windowing with Hann and no normalization -> Spectrum -> LogSpectrum.
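
The quantities implied by this processing chain can be worked out with a short calculation. The variable names are illustrative. That the frameSize parameter refers to the length of the magnitude spectrum (N/2 + 1 bins for an N-sample frame) is an assumption, inferred from the default of 1025 matching a 2048-sample frame; the page itself does not state it.

```python
sample_rate = 44100   # Hz, from the processing chain
frame_size = 16384    # samples per frame
hop_size = 2048       # samples between frame starts

# A real-input FFT of N samples yields N/2 + 1 magnitude bins.
spectrum_bins = frame_size // 2 + 1

# Time step between successive chroma frames, in seconds.
hop_seconds = hop_size / sample_rate

# Spacing between adjacent linear-frequency bins, in Hz.
bin_spacing_hz = sample_rate / frame_size

# Assumption: the default frameSize of 1025 corresponds to a 2048-sample frame.
default_frame_samples = (1025 - 1) * 2

print(spectrum_bins, round(hop_seconds, 4), round(bin_spacing_hz, 2))
```

Under that assumption, frameSize would be set to 8193 for this chain, with one output frame roughly every 46 ms and a linear bin spacing of about 2.69 Hz.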

References:
[1] Mauch, M., & Dixon, S. (2010, August). Approximate Note Transcription for the Improved Identification of Difficult Chords. In ISMIR (pp. 135-140).
[2] Chordino and NNLS Chroma, http://www.isophonics.net/nnls-chroma
