EPAComp: An Architectural Model for EPA Composition
Luís Henrique Neves Villaça, Sean Wolfgand Matsui Siqueira, et al.
SBSI 2023
This paper proposes a new text-to-speech synthesis technique for producing continuous, natural-sounding speech of a specific speaker. The technique selects short speech frames from a phoneme-labeled speech database, using a dynamic programming algorithm to minimize a distortion criterion. The proposed scheme is more flexible than many existing schemes that use fixed speech segments, such as diphones, and it yields more natural synthesized speech. An efficient speech representation is used to express the spectral continuity of speech simply and accurately. A further improvement in the database search mechanism and in database size was obtained by sectioning the speech phonemes into "steady-states" and "transitions". The resulting synthesized speech is of satisfactory quality and preserves the natural voice of the speaker.
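The frame selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the distortion criterion, so the version below assumes one: a fixed penalty when a frame's phoneme label does not match the target, plus the Euclidean distance between the feature vectors of consecutive selected frames as a stand-in for spectral continuity. The function name `select_frames`, the `mismatch_penalty` parameter, and the `(label, features)` database layout are all hypothetical and chosen only for illustration.

```python
import math


def select_frames(target_labels, database, mismatch_penalty=10.0):
    """Pick one database frame per target slot, minimizing total distortion.

    database is a list of (phoneme_label, feature_vector) tuples.
    Returns (list of database indices, total cost).
    The cost terms are assumptions, not the paper's actual criterion.
    """
    if not target_labels or not database:
        return [], 0.0

    def target_cost(label, frame):
        # Assumed: zero cost on a label match, fixed penalty otherwise.
        return 0.0 if frame[0] == label else mismatch_penalty

    def join_cost(prev, cur):
        # Assumed: Euclidean distance as a proxy for spectral discontinuity.
        return math.dist(prev[1], cur[1])

    n = len(database)
    # cost[j] is the best cumulative cost of a path ending at frame j.
    cost = [target_cost(target_labels[0], frame) for frame in database]
    back = []  # back[t - 1][j] is the best predecessor of frame j at step t

    for label in target_labels[1:]:
        new_cost, pointers = [], []
        for cur in database:
            best_i = min(
                range(n), key=lambda i: cost[i] + join_cost(database[i], cur)
            )
            new_cost.append(
                cost[best_i]
                + join_cost(database[best_i], cur)
                + target_cost(label, cur)
            )
            pointers.append(best_i)
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest path back from the final step.
    j = min(range(n), key=lambda i: cost[i])
    total = cost[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path, total


# Toy database: two "a" frames and two "b" frames with 1-D features.
toy_db = [("a", (0.0,)), ("a", (1.0,)), ("b", (1.1,)), ("b", (5.0,))]
path, total = select_frames(["a", "b"], toy_db)
```

On the toy database, the search prefers the "a" frame whose features lie closest to a "b" frame, which is how the join cost favors smooth transitions. This exhaustive version compares every pair of frames at each step, so a real database would need pruning, for example by restricting candidates to frames with the matching label.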