Building an English speech synthetic voice using a voice transformation model from a Japanese male voice

J. Acoust. Soc. Am., Vol. 120, No. 5, Pt. 2, pp. 3036-3037, 2006

A. Iida, S. Kajima, K. Yasu, T. Arai and T. Sugawara

Abstract: This work reports the development of an English synthetic voice for a Japanese patient with amyotrophic lateral sclerosis, built with a voice transformation model as part of a project to develop a bilingual communication aid for this patient. The patient, who had a tracheotomy 3 years ago and had difficulty speaking, wishes to speak in his own voice in both his native language and English. A Japanese speech synthesis system was developed using ATR CHATR 6 years ago, and the authors have since worked on a diphone-based synthesis voice using the Festival speech synthesis system and FestVox by having the patient read the diphone list. However, it was not easy for the patient to phonate and, moreover, to pronounce words in a foreign language. We therefore used a voice transformation model in Festival to develop the patient’s English synthetic voice, which enables text-to-speech synthesis. The model was trained on 30 sentences read by the patient and the same sentences synthesized with an existing Festival diphone voice created from recordings of a native English speaker. An evaluation including a listening experiment was conducted, and the results showed that the synthesized voice successfully reflected the patient’s voice quality.
