Diphone synthesis system based on time-domain prosodic modifications of speech

Research output: Contribution to journal › Conference article › peer-review

Abstract

A novel time-domain algorithm is presented for text-to-speech synthesis using diphone concatenation. The algorithm is based on the pitch-synchronous overlap-add (PSOLA) approach and is capable of good-quality prosodic modifications of natural speech. It can be seen as a simplification of a previous algorithm that combines the PSOLA approach with frequency-domain transformations. At the same time, it generalizes previous time-domain methods that perform pitch-synchronous cut-and-splice operations on the speech waveform. The algorithm is used in the CNET multilingual diphone synthesis system, which currently supports three languages: French, Italian, and German. The resulting speech has been tested on French and is judged to be of much better quality than that of an LPC-based synthesizer.
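To make the time-domain overlap-add idea concrete, the following is a minimal sketch of a generic pitch-synchronous overlap-add pitch modification. It is not the algorithm of the paper, whose details the abstract does not give. It assumes that pitch marks are already known, uses a two-period Hann window around each mark, and keeps the duration unchanged. The function name `td_psola` and its parameters `signal`, `marks`, and `pitch_factor` are hypothetical names chosen for this illustration.

```python
import math


def td_psola(signal, marks, pitch_factor):
    """Illustrative pitch change by overlap-add; duration is preserved.

    signal: list of float samples
    marks: increasing sample indices of analysis pitch marks (assumed given)
    pitch_factor: > 1 raises the pitch, < 1 lowers it
    """
    n_samples = len(signal)
    out = [0.0] * n_samples
    # Local pitch period at each analysis mark; the last mark reuses the previous gap.
    periods = [marks[i + 1] - marks[i] for i in range(len(marks) - 1)]
    periods.append(periods[-1])

    t = float(marks[0])  # position of the current synthesis pitch mark
    while t <= marks[-1]:
        # Choose the analysis mark closest to the synthesis mark.
        i = min(range(len(marks)), key=lambda j: abs(marks[j] - t))
        period = periods[i]
        centre = round(t)
        # Copy a two-period, Hann-windowed segment to the synthesis position.
        for n in range(2 * period):
            src = marks[i] - period + n
            dst = centre - period + n
            if 0 <= src < n_samples and 0 <= dst < n_samples:
                w = 0.5 - 0.5 * math.cos(math.pi * n / period)
                out[dst] += w * signal[src]
        # Synthesis marks are spaced by the modified local period.
        t += period / pitch_factor
    return out
```

With `pitch_factor` equal to 1 and evenly spaced marks, the overlapping Hann windows sum to one, so the signal between the first and last mark is reproduced unchanged. Other factors change the window overlap and therefore the amplitude, which this sketch does not compensate for.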

Original language: English
Pages (from-to): 238-241
Number of pages: 4
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 1
Publication status: Published - 1 Dec 1989
Externally published: Yes
Event: 1989 International Conference on Acoustics, Speech, and Signal Processing - Glasgow, Scotland
Duration: 23 May 1989 to 26 May 1989
