Some Question When Extracting MFCC Features · Issue #595 - GitHub

Some question when extracting MFCC features #595 (Closed; label: question)

Description

JinmingZhao opened on Jul 5, 2017

The audio information:

Input File     : 'aa.wav'
Channels       : 1
Sample Rate    : 16000
Precision      : 16-bit
Duration       : 00:00:00.64 = 10160 samples ~ 47.625 CDDA sectors
File Size      : 20.4k
Bit Rate       : 257k
Sample Encoding: 16-bit Signed Integer PCM

With a frame length of 25 ms (400 samples) and a frame shift of 10 ms (160 samples), the number of frames should be (10160 - 240) / 160 = 62, which is the same as 1 + floor((10160 - 400) / 160). Kaldi does give me 62 frames.
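The arithmetic above can be checked with a small sketch. It assumes that only whole frames lying entirely inside the signal are counted, with no padding at the edges; the helper name `num_frames_no_padding` is made up for illustration and is not a Kaldi or librosa function.

```python
def num_frames_no_padding(num_samples: int, frame_length: int, frame_shift: int) -> int:
    """Count the full frames that fit inside the signal when nothing is padded."""
    if num_samples < frame_length:
        return 0
    return 1 + (num_samples - frame_length) // frame_shift


sr = 16000
frame_length = sr * 25 // 1000  # 25 ms -> 400 samples
frame_shift = sr * 10 // 1000   # 10 ms -> 160 samples

print(num_frames_no_padding(10160, frame_length, frame_shift))  # 62
```

The shortcut (10160 - 240) / 160 gives the same number because 240 = 400 - 160.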

But when I use librosa to extract the MFCC features, I get 64 frames:

import numpy
import librosa
from librosa import feature

wav_file = 'aa.wav'
sr = 16000
n_mfcc = 13
n_mels = 40
n_fft = 512
win_length = 400  # 0.025 * 16000
hop_length = 160  # 0.010 * 16000
window = 'hamming'
fmin = 20
fmax = 4000

y, sr = librosa.load(wav_file, sr=16000)
print(sr)
# Power spectrogram: squared magnitude of the STFT
D = numpy.abs(librosa.stft(y, window=window, n_fft=n_fft, win_length=win_length, hop_length=hop_length))**2
# sr is passed explicitly so the mel filterbank matches the 16 kHz audio (the default is 22050)
S = feature.melspectrogram(S=D, y=y, sr=sr, n_mels=n_mels, fmin=fmin, fmax=fmax)
feats = feature.mfcc(S=librosa.power_to_db(S), n_mfcc=n_mfcc)
print(feats.shape)

feats = feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc, n_fft=n_fft, n_mels=n_mels, fmin=fmin, hop_length=hop_length)  # sr passed explicitly; the default is 22050

Both librosa snippets produce features of shape (13, 64).
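A likely source of the two extra frames, assuming `librosa.stft` is running with its default `center=True`: the signal is padded by `n_fft // 2` samples on each side before framing, and each frame is `n_fft` samples long (the 400-sample window is zero-padded to 512). The sketch below only reproduces that frame-count arithmetic in plain Python; `stft_frame_count` is a hypothetical helper, not a librosa function.

```python
def stft_frame_count(num_samples: int, n_fft: int, hop_length: int, center: bool = True) -> int:
    """Frame count for an STFT, assuming centering pads n_fft // 2 samples per side."""
    if center:
        num_samples += 2 * (n_fft // 2)
    return 1 + (num_samples - n_fft) // hop_length


print(stft_frame_count(10160, 512, 160, center=True))   # 64
print(stft_frame_count(10160, 512, 160, center=False))  # 61
print(stft_frame_count(10160, 400, 160, center=False))  # 62
```

Under these assumptions, the 62 frames from the no-padding count would only be reproduced with centering turned off and a frame size of 400 samples; with `n_fft = 512` and no centering the count drops to 61.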

Another question: in the feature.mfcc() function, can I pass the win_length, window, and hop_length parameters directly?

Looking forward to your reply.

Thanks,
Jinming
