Audio Spectrogram — NVIDIA DALI 1.16.0 Documentation


Calculating the Spectrogram using DALI

To demonstrate DALI's spectrogram operator, we will define a DALI pipeline that takes the audio samples as input. Such input is typically provided externally with the external_source operator; for demonstration purposes, the pipeline below simply wraps the samples in a constant node (types.Constant), which feeds the same input in every iteration. That is enough here, as we will only be calculating one spectrogram.

[5]:
    from nvidia.dali import pipeline_def
    import nvidia.dali.fn as fn
    import nvidia.dali.types as types
    import nvidia.dali as dali

    # `y` holds the audio samples loaded earlier in the notebook
    audio_data = np.array(y, dtype=np.float32)


    @pipeline_def
    def spectrogram_pipe(nfft, window_length, window_step, device="cpu"):
        # Wrap the samples in a constant node placed on the requested device
        audio = types.Constant(device=device, value=audio_data)
        spectrogram = fn.spectrogram(
            audio,
            device=device,
            nfft=nfft,
            window_length=window_length,
            window_step=window_step,
        )
        return spectrogram
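As an aside, the role of nfft, window_length and window_step can be illustrated with a small NumPy sketch of a power spectrogram. This is not DALI's implementation: the function name power_spectrogram, the Hann window, and the choice to start windows at sample 0 without centering or padding are all assumptions made for illustration, and it assumes window_length <= nfft.

```python
import numpy as np


def power_spectrogram(signal, nfft, window_length, window_step):
    """Power spectrogram of a 1-D signal: |STFT|^2 with a Hann window.

    Illustrative only: windows start at sample 0 (no centering or padding),
    which is an assumption of this sketch, not a statement about DALI.
    """
    signal = np.asarray(signal, dtype=np.float32)
    window = np.hanning(window_length).astype(np.float32)
    # Number of full windows that fit when hopping by window_step samples.
    n_frames = 1 + (len(signal) - window_length) // window_step
    frames = np.stack([
        signal[i * window_step : i * window_step + window_length] * window
        for i in range(n_frames)
    ])
    # rfft zero-pads each frame to nfft samples and keeps nfft // 2 + 1 bins.
    spectrum = np.fft.rfft(frames, n=nfft, axis=1)
    # Transpose to (frequency bins, time frames), the layout used for display.
    return (np.abs(spectrum) ** 2).T


# A 1 kHz tone sampled at 16 kHz: each bin is 16000 / 512 = 31.25 Hz wide,
# so the energy should peak at bin 1000 / 31.25 = 32.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
spec = power_spectrogram(tone, nfft=512, window_length=512, window_step=256)
print(spec.shape)                # (257, 61)
print(int(spec[:, 0].argmax()))  # 32
```

A larger nfft gives finer frequency resolution, while a smaller window_step gives more (overlapping) time frames.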

With the pipeline defined, we can now build it and run it:

[6]:
    pipe = spectrogram_pipe(
        device="gpu",
        batch_size=1,
        num_threads=3,
        device_id=0,
        nfft=n_fft,
        window_length=n_fft,
        window_step=hop_length,
    )
    pipe.build()
    outputs = pipe.run()
    spectrogram_dali = outputs[0][0].as_cpu()

Then we display it as we did with the reference implementation:

[7]:
    spectrogram_dali_db = librosa.power_to_db(spectrogram_dali, ref=np.max)
    show_spectrogram(spectrogram_dali_db, "DALI power spectrogram", sr, hop_length)

[Figure: DALI power spectrogram]
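For readers unfamiliar with the decibel conversion used for display, the following is a minimal NumPy sketch of what a power-to-dB conversion relative to the peak value does. The function name power_to_db_sketch is hypothetical, and the amin and top_db values are chosen to mirror what I understand to be librosa's defaults; treat it as an approximation of the idea rather than librosa's actual code.

```python
import numpy as np


def power_to_db_sketch(power, amin=1e-10, top_db=80.0):
    """Convert a power spectrogram to dB relative to its maximum value."""
    power = np.asarray(power, dtype=np.float64)
    ref = power.max()
    # Clamp tiny values so the logarithm stays finite.
    log_spec = 10.0 * np.log10(np.maximum(amin, power))
    log_spec -= 10.0 * np.log10(np.maximum(amin, ref))
    # Limit the dynamic range to top_db below the peak.
    return np.maximum(log_spec, log_spec.max() - top_db)


# Made-up power values: the peak maps to 0 dB, 0.1 to -10 dB,
# 0.001 to -30 dB, and the zero entry is clipped to -80 dB.
db = power_to_db_sketch(np.array([[1.0, 0.1], [0.001, 0.0]]))
print(db)
```

Because the reference is the spectrogram's own maximum, the result is always at most 0 dB, which makes spectrograms from different implementations directly comparable.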

As a last sanity check, we can verify that the numerical difference between the reference implementation and DALI's is insignificant:

[8]:
    print(
        "Average error: {0:.5f} dB".format(
            np.mean(np.abs(spectrogram_dali_db - spectrogram_librosa_db))
        )
    )
    assert np.allclose(spectrogram_dali_db, spectrogram_librosa_db, atol=2)

Average error: 0.00491 dB
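The check above combines two measures: the mean absolute error summarizes the overall agreement, while np.allclose bounds the error of every individual element. The sketch below shows both on made-up dB values; the helper name compare_db and the sample arrays are hypothetical and only serve to illustrate the tolerance semantics.

```python
import numpy as np


def compare_db(a, b, atol):
    """Return (mean abs error, max abs error, within_tolerance) for two dB arrays."""
    diff = np.abs(np.asarray(a) - np.asarray(b))
    # np.allclose tests |a - b| <= atol + rtol * |b| element-wise
    # (rtol defaults to 1e-5).
    within = bool(np.allclose(a, b, atol=atol))
    return float(diff.mean()), float(diff.max()), within


reference = np.array([-10.0, -20.0, -80.0])  # made-up values for illustration
candidate = np.array([-10.5, -20.0, -79.0])
mean_err, max_err, ok = compare_db(candidate, reference, atol=2)
print(mean_err, max_err, ok)  # 0.5 1.0 True
```

A small average error alone could hide a few large outliers, which is why the element-wise tolerance check is a useful complement.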
