Audio Spectrogram — NVIDIA DALI 1.16.0 Documentation
Calculating the Spectrogram using DALI
To demonstrate DALI's spectrogram operator, we define a DALI pipeline whose input is the audio signal loaded earlier. For demonstration purposes the pipeline embeds that signal with types.Constant, so the same input is fed in every iteration. This is sufficient because we only calculate one spectrogram.
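For orientation, the roles of the nfft, window_length and window_step arguments used in the pipeline can be illustrated with plain NumPy. This is only a minimal sketch, not DALI's implementation: the function name power_spectrogram is hypothetical, and it assumes a periodic Hann window, reflect padding so that windows are centered, and nfft >= window_length.

```python
import numpy as np


def power_spectrogram(signal, nfft, window_length, window_step):
    """Illustrative power spectrogram; returns shape (nfft // 2 + 1, n_frames)."""
    signal = np.asarray(signal, dtype=np.float32)
    # Assumption: center each window by reflect-padding half a window on both sides.
    pad = window_length // 2
    padded = np.pad(signal, pad, mode="reflect")
    # Assumption: periodic Hann window.
    n = np.arange(window_length)
    window = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / window_length)
    # One frame every window_step samples.
    n_frames = 1 + (len(padded) - window_length) // window_step
    frames = np.stack(
        [
            padded[i * window_step : i * window_step + window_length]
            for i in range(n_frames)
        ]
    )
    # rfft zero-pads each windowed frame to nfft samples when nfft > window_length.
    spectrum = np.fft.rfft(frames * window, n=nfft, axis=1)
    # Power = squared magnitude; transpose to (frequency, time).
    return (np.abs(spectrum) ** 2).T


# Synthetic tone with exactly 16 cycles per 256 samples (hypothetical test signal).
tone = np.sin(2.0 * np.pi * 16 * np.arange(1024) / 256)
spec = power_spectrogram(tone, nfft=256, window_length=256, window_step=64)
print(spec.shape)  # (129, 17)
```

Under these assumptions, nfft sets the number of frequency bins (nfft // 2 + 1), while window_step sets how many time frames are produced for a signal of a given length.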
[5]:
from nvidia.dali import pipeline_def
import nvidia.dali.fn as fn
import nvidia.dali.types as types
import nvidia.dali as dali

# y holds the audio samples loaded earlier in the notebook
audio_data = np.array(y, dtype=np.float32)


@pipeline_def
def spectrogram_pipe(nfft, window_length, window_step, device="cpu"):
    # The audio is embedded as a constant, so every iteration sees the same input
    audio = types.Constant(device=device, value=audio_data)
    spectrogram = fn.spectrogram(
        audio,
        device=device,
        nfft=nfft,
        window_length=window_length,
        window_step=window_step,
    )
    return spectrogram

With the pipeline defined, we can now build it and run it:
[6]:
pipe = spectrogram_pipe(
    device="gpu",
    batch_size=1,
    num_threads=3,
    device_id=0,
    nfft=n_fft,
    window_length=n_fft,
    window_step=hop_length,
)
pipe.build()
outputs = pipe.run()
# First output, first sample of the batch, copied to host memory
spectrogram_dali = outputs[0][0].as_cpu()

and display it as we did with the reference implementation:
[7]:
spectrogram_dali_db = librosa.power_to_db(spectrogram_dali, ref=np.max)
show_spectrogram(spectrogram_dali_db, "DALI power spectrogram", sr, hop_length)
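The decibel conversion performed by librosa.power_to_db with ref=np.max can be sketched in NumPy as follows. This is an illustrative approximation rather than librosa's code: the function name power_to_db_sketch is hypothetical, and the amin=1e-10 and top_db=80.0 values are assumed to match librosa's documented defaults.

```python
import numpy as np


def power_to_db_sketch(power, amin=1e-10, top_db=80.0):
    """Convert a power spectrogram to dB relative to its own maximum."""
    power = np.asarray(power, dtype=np.float64)
    ref = power.max()  # plays the role of ref=np.max
    # Clamp tiny values to amin so the logarithm stays finite.
    db = 10.0 * np.log10(np.maximum(power, amin))
    db -= 10.0 * np.log10(np.maximum(ref, amin))
    # Limit the dynamic range to top_db below the peak.
    return np.maximum(db, db.max() - top_db)


# Hypothetical power values spanning a wide dynamic range.
example = np.array([[1.0, 0.1], [0.01, 1e-12]])
example_db = power_to_db_sketch(example)
print(example_db)  # approximately [[0, -10], [-20, -80]]
```

With the maximum as reference, the loudest bin maps to 0 dB and every other bin is negative, which is why the plotted spectrogram uses a scale that ends at 0 dB.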
As a last sanity check, we can verify that the numerical difference between the reference implementation's output and DALI's is insignificant:
[8]:
print(
    "Average error: {0:.5f} dB".format(
        np.mean(np.abs(spectrogram_dali_db - spectrogram_librosa_db))
    )
)
assert np.allclose(spectrogram_dali_db, spectrogram_librosa_db, atol=2)

Average error: 0.00491 dB
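The two checks in this cell measure different things: the printed value is a mean over all bins, whereas np.allclose requires every bin to satisfy |a - b| <= atol + rtol * |b| (with rtol defaulting to 1e-5). The sketch below makes that distinction explicit. The helper name compare_db and the sample values are hypothetical and are not taken from the notebook's data.

```python
import numpy as np


def compare_db(a, b, atol=2.0, rtol=1e-5):
    """Return (mean abs error, max abs error, every-element-within-tolerance)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    abs_err = np.abs(a - b)
    # Same elementwise criterion that np.allclose applies.
    within = bool(np.all(abs_err <= atol + rtol * np.abs(b)))
    return float(abs_err.mean()), float(abs_err.max()), within


# Hypothetical dB values: two bins agree closely, one differs by 3 dB.
mean_err, max_err, ok = compare_db([0.0, -10.0, -80.0], [0.0, -10.5, -77.0])
print(mean_err, max_err, ok)
```

In this made-up example the mean error stays close to 1 dB while one bin is off by 3 dB, so the elementwise check fails even though the average looks small. Reporting both, as the cell above does, covers the typical and the worst case.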