Question

So this is from the late 90s ... http://www.cs.princeton.edu/~prc/SingingSynth.html

Why hasn't this taken off? (We can synthesize photorealistic images, but the synthesis of singing still seems to be at a very primitive stage.)

What exactly is it that makes the synthesis of singing difficult?

http://www.interspeech2007.org/Technical/synthesis_of_singing_challenge.php <-- still seems primitive.

Answer

My feeling is that sound falls into the uncanny valley more easily than images do. Our brain tolerates a badly formed image relatively well, but it rejects a synthetic sound unless it sounds natural. Anything that is not perfectly imperfect sounds creepy, and this is a very strong barrier to real applications. Synthetic voices are good enough for announcements and telephone services, but we are a long way from totally synthetic singing.

On the other hand, real voices are modified every day, both live and in the studio. Without Auto-Tune, all the "gangsta" acts and "Lady Gagas" out there would be doing a job better suited to their actual talent.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow