Question

Are there any open source, open content projects that use recorded speech data to generate synthesized speech? (With the goal of synthesising/simulating the speech of a particular individual. As a side note, is there a name for this process, goal or the data extracted? "voice signature"?)

I imagine the workflow would be something like:

  • record speech from standardized text ("The teddy sat on the mat.")
  • pick out phonemes ("a" of cat), accounting for accent
  • get the data that makes Alice's "eh" sound different to Betty's "eh"
  • render text to speech using accent-appropriate phonemes plus voice signature
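
The steps above can be sketched as a toy concatenative pipeline. This is only an illustration of the idea, not how Festvox or any real engine works: the `LEXICON` entries, the phoneme labels, and the helpers `text_to_phonemes`, `build_voice_bank`, and `synthesize` are all hypothetical, and the "audio" is just short lists of numbers standing in for recorded samples.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon (word -> phoneme labels); a real system would
# use a full pronunciation dictionary that accounts for accent.
LEXICON: Dict[str, List[str]] = {
    "the": ["dh", "ax"],
    "teddy": ["t", "eh", "d", "iy"],
    "sat": ["s", "ae", "t"],
    "on": ["aa", "n"],
    "mat": ["m", "ae", "t"],
}

Unit = List[float]           # stand-in for a snippet of audio samples
VoiceBank = Dict[str, Unit]  # phoneme label -> one speaker's recorded unit


def text_to_phonemes(text: str) -> List[str]:
    """Look up each word in the lexicon and flatten to a phoneme list."""
    phonemes: List[str] = []
    for raw in text.lower().split():
        word = raw.strip(".,!?\"'")
        if word not in LEXICON:
            raise KeyError(f"no pronunciation for {word!r}")
        phonemes.extend(LEXICON[word])
    return phonemes


def build_voice_bank(labelled: List[Tuple[str, Unit]]) -> VoiceBank:
    """Keep the first recorded snippet seen for each phoneme label."""
    bank: VoiceBank = {}
    for label, samples in labelled:
        bank.setdefault(label, samples)
    return bank


def synthesize(text: str, bank: VoiceBank) -> Unit:
    """Concatenate the speaker's units in phoneme order."""
    out: Unit = []
    for phoneme in text_to_phonemes(text):
        out.extend(bank[phoneme])
    return out
```

In this sketch, Alice and Betty would each get their own `VoiceBank` built from their own labelled recordings, so the same text renders with different units per speaker. A usable voice would also need prosody modelling and smoothing at the joins between units, which this deliberately leaves out.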

Answering this question is a critical step in petitioning Jack Angel (Teddy, Wonkers) to donate his soothing voice signature to the public domain for the sake of humanity.


Solution

Festvox, an open source project sponsored by Carnegie Mellon University, aims to build synthesized voices modeled on a particular speaker. Their concept is described here, and getting a voice tuned correctly sounds like a very time-consuming process. There is a good list of open source text-to-speech projects on BableFish.org, and a good discussion on Text To Speech Blog about building a TTS engine around a particular speaker.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow