> Speech

Formants

1. What is a formant? A formant is a characteristic component of the quality of a speech sound. On a spectrogram, formants show up as the bands of strongest energy, and the bottom band is the fundamental frequency F0. Pitch is a closely related concept: it is how the fundamental frequency is perceived. Reference: https://www.zhihu.com/question/24190826/answer/280149476
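To make the F0 idea concrete, here is a minimal sketch that estimates the fundamental frequency of a single voiced frame from the peak of its autocorrelation. This is one common textbook approach, not a method taken from the linked answer. The function name `estimate_f0`, the search range `f_min`/`f_max`, and the synthetic 120 Hz test tone are all assumptions made for illustration. The sketch estimates F0 only; it does not locate the higher formants.

```python
import numpy as np

def estimate_f0(frame, sample_rate, f_min=50.0, f_max=500.0):
    """Estimate F0 (Hz) of one voiced frame from its autocorrelation peak.

    f_min and f_max are assumed bounds on plausible speech F0 values.
    """
    frame = frame - np.mean(frame)  # remove any DC offset
    # Full autocorrelation; keep only the non-negative lags.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sample_rate / f_max)  # shortest period considered
    lag_max = int(sample_rate / f_min)  # longest period considered
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / best_lag

# Synthetic "voiced" frame: a 120 Hz fundamental plus four weaker harmonics.
sample_rate = 8000
t = np.arange(int(0.04 * sample_rate)) / sample_rate  # 40 ms frame
frame = np.zeros_like(t)
for k in range(1, 6):
    frame += np.sin(2 * np.pi * 120 * k * t) / k

f0 = estimate_f0(frame, sample_rate)
print(f"estimated F0: {f0:.1f} Hz")
```

Because the lag is an integer number of samples, the estimate is quantized: at 8 kHz a 120 Hz tone can only be resolved to within a hertz or two.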


MFCC

1. Mel-Frequency Cepstral Coefficients (MFCC)

   The Mel scale maps a frequency $f$ in Hz to Mel:

   $$Mel(f)=2595\log_{10}\left(1+\frac{f}{700}\right)$$

       import numpy as np
       import matplotlib.pyplot as plt

       # Plot the Hz-to-Mel mapping for 0..8000 Hz
       x = np.arange(8001)
       y = 2595 * np.log10(1 + x / 700)
       plt.plot(x, y, color='blue', linewidth=3)
       plt.xlabel("f", fontsize=17)
       plt.ylabel("Mel(f)", fontsize=17)
       plt.xlim(0, x[-1])
       plt.ylim(0, y[-1])
       plt.savefig('mel_f.png', dpi=500)

2. Workflow

       import numpy as np
       import scipy.io.wavfile
       from matplotlib import pyplot as plt
       from scipy.fftpack import dct

       # Raw data: read the first 3.5 s of the recording
       sample_rate, signal = scipy.io.wavfile.read('OSR_us_000_0010_8k.wav')
       original_signal = signal[0:int(3.5 * sample_rate)]
       signal_num = np.arange(len(signal))
       sample_num = np.arange(len(original_signal))

       # Plot 01
       plt.
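As a small companion to the Mel formula, the sketch below wraps the conversion and its inverse in two functions and uses them to place filter center frequencies evenly on the Mel scale, which is the usual starting point for a Mel filter bank. The names `hz_to_mel`, `mel_to_hz`, and `mel_center_frequencies`, the choice of 10 filters, and the 0 to 4000 Hz range (the Nyquist frequency of an 8 kHz recording) are assumptions for illustration, not part of the excerpt above.

```python
import numpy as np

def hz_to_mel(f_hz):
    """Convert frequency in Hz to Mel: 2595 * log10(1 + f / 700)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def mel_center_frequencies(num_filters, f_low, f_high):
    """Center frequencies (Hz) equally spaced on the Mel scale.

    num_filters + 2 points are generated so that the two outer points
    can serve as the lower and upper edges of the first and last filter.
    """
    mel_points = np.linspace(hz_to_mel(f_low), hz_to_mel(f_high), num_filters + 2)
    return mel_to_hz(mel_points)[1:-1]

# Hypothetical setup: 10 filters covering 0..4000 Hz.
centers = mel_center_frequencies(num_filters=10, f_low=0.0, f_high=4000.0)
print(np.round(centers, 1))
```

The centers are evenly spaced in Mel but increasingly far apart in Hz, which reflects the finer frequency resolution of hearing at low frequencies.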



LI WEI

If you can renew yourself for one day, do so every day, and keep renewing yourself daily.


Tokyo