Question

Are there any open source, open content projects that use recorded speech data to generate synthesized speech, with the goal of simulating the speech of a particular individual? As a side note, is there a name for this process, for the goal, or for the data extracted? "Voice signature", perhaps?

I imagine the workflow would be something like:

  • record speech from standardized text ("The teddy sat on the mat.")
  • pick out phonemes ("a" of cat), accounting for accent
  • get the data that makes Alice's "eh" sound different to Betty's "eh"
  • render text to speech using accent-appropriate phonemes plus voice signature
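To make the workflow above concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the tiny `LEXICON`, the `VoiceSignature` class, and the helper names are hypothetical, and the "clips" are plain string labels standing in for recorded audio. It only shows the bookkeeping (text to phonemes, phonemes to speaker-specific units), not real signal processing.

```python
from dataclasses import dataclass, field

# Hypothetical pronunciation lexicon for one accent: word -> phoneme list.
LEXICON = {
    "the": ["DH", "AH"],
    "teddy": ["T", "EH", "D", "IY"],
    "sat": ["S", "AE", "T"],
    "on": ["AA", "N"],
    "mat": ["M", "AE", "T"],
}


def text_to_phonemes(text):
    """Look up each word in the toy lexicon and flatten to a phoneme list."""
    words = text.lower().replace(".", "").split()
    return [phoneme for word in words for phoneme in LEXICON[word]]


@dataclass
class VoiceSignature:
    """Per-speaker store: phoneme -> recorded unit (a label standing in for audio)."""
    speaker: str
    units: dict = field(default_factory=dict)

    def add_recording(self, text, clips):
        """Pair clips (one per phoneme, in order) with the phonemes of text."""
        phonemes = text_to_phonemes(text)
        if len(phonemes) != len(clips):
            raise ValueError("need exactly one clip per phoneme")
        for phoneme, clip in zip(phonemes, clips):
            # Keep the first example seen for each phoneme.
            self.units.setdefault(phoneme, clip)


def render(text, voice):
    """Return the sequence of speaker-specific units needed to say text."""
    return [voice.units[phoneme] for phoneme in text_to_phonemes(text)]


# Usage: "record" Alice reading the standardized text, then render new text.
alice = VoiceSignature("Alice")
prompt = "The teddy sat on the mat."
labels = [f"alice_{i}_{p}" for i, p in enumerate(text_to_phonemes(prompt))]
alice.add_recording(prompt, labels)
print(render("Teddy sat.", alice))
```

A real system would replace the string labels with audio segments or acoustic model parameters, and would need forced alignment to find phoneme boundaries in the recording.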

Answering this question is a critical step in petitioning Jack Angel (Teddy, Wonkers) to donate his soothing voice signature to the public domain for the sake of humanity.

Solution

There is an open source project called festvox, sponsored by Carnegie Mellon University, whose goal is to build a synthesized voice based on a particular speaker. Their concept is described here, and it sounds like a very time-consuming process to get it tuned correctly. There is a good list of open source text-to-speech projects on BableFish.org, and a good discussion on the Text To Speech Blog about building a TTS engine around a particular speaker.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow