Investigations on information of speech recognition and speaker identification in modulation spectrum

Technical Report of IEICE Japan, Vol. SP2000-34, pp. 15-22, 2000 (in Japanese)

N. Kanedera, T. Arai, M. Takahashi and T. Funada

Abstract: The Fourier transform of the time trajectory of a parameter such as the logarithmic spectrum or cepstrum is called the modulation spectrum. In this paper we report on the modulation frequency bands that are important for speech recognition and speaker identification. The results of continuous speech recognition experiments and perceptual speaker-identification experiments show that the modulation frequency band between 2 and 8 Hz is important for both speech recognition and speaker identification. The results also suggest that information for speaker identification lies in the range above 16 Hz, while information for speech recognition does not lie in that range.

Keywords: modulation spectrum, modulation frequency, speech recognition, speaker identification
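
The definition in the abstract can be illustrated with a minimal sketch. It is not the authors' experimental setup: the function names, the 100 Hz frame rate (a 10 ms frame shift) and the synthetic trajectory are all assumptions chosen for illustration. The sketch takes the time trajectory of one parameter (for example one cepstral coefficient per frame), computes its Fourier transform, and measures how much of the modulation energy falls in the 2 to 8 Hz band.

```python
import numpy as np

def modulation_spectrum(trajectory, frame_rate_hz):
    """Magnitude of the Fourier transform of one parameter trajectory.

    Returns (modulation frequencies in Hz, magnitude spectrum).
    """
    x = np.asarray(trajectory, dtype=float)
    x = x - x.mean()  # remove the 0 Hz (constant) component
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / frame_rate_hz)
    return freqs, spectrum

def band_energy_fraction(freqs, spectrum, lo_hz, hi_hz):
    """Fraction of total modulation energy inside [lo_hz, hi_hz]."""
    power = spectrum ** 2
    in_band = (freqs >= lo_hz) & (freqs <= hi_hz)
    total = power.sum()
    return float(power[in_band].sum() / total) if total > 0 else 0.0

# Assumption: features are extracted every 10 ms, i.e. 100 frames per second.
frame_rate_hz = 100.0
t = np.arange(200) / frame_rate_hz  # 2 seconds of frames

# Hypothetical trajectory: a strong 4 Hz modulation plus a weaker 20 Hz one.
trajectory = np.sin(2 * np.pi * 4.0 * t) + 0.3 * np.sin(2 * np.pi * 20.0 * t)

freqs, spectrum = modulation_spectrum(trajectory, frame_rate_hz)
peak_hz = float(freqs[np.argmax(spectrum)])
frac_2_8 = band_energy_fraction(freqs, spectrum, 2.0, 8.0)
frac_above_16 = band_energy_fraction(freqs, spectrum, 16.0, frame_rate_hz / 2)

print(f"peak modulation frequency: {peak_hz:.1f} Hz")
print(f"energy fraction in 2-8 Hz: {frac_2_8:.3f}")
print(f"energy fraction above 16 Hz: {frac_above_16:.3f}")
```

In practice the same computation would be applied to each spectral or cepstral coefficient of real speech, and the band limits (2 to 8 Hz, above 16 Hz) would be varied to study which modulation frequencies carry the relevant information.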
