PoserSpeak, New Voices and other thoughts...

This topic was actually an email I got from a buyer, used here with his permission...

I'm a developer for Chevron Phillips during the day, and they get me an MSDN subscription every year for work, so last year I decided to play around with Microsoft's voice technology SDK. I was able to get the demos running, and I could get the machine to say text from a text box. Not very impressive, I know. I am, however, interested in doing what I can to get this type of lip sync up and going. I'm a programmer too, and I know the feeling you get when a user requests something completely out of the application's scope and you just have to agree with them on how good an idea it is and blah blah blah. So I'll admit that I'm not familiar enough with the voice technology to know whether or not I'm completely out of reach. I would love to help out if you could just point me in a direction.

Here goes. I'm assuming that the better-sounding voices are just collections of what you might call a sound font. These common sound parts are recorded and re-recorded with different inflections and then somehow mapped to basic rules of vocabulary so that the computer can essentially sound out a specific word.

1. Is there a tool that will allow me to record my own samples?
2. If so, I could record sets of these "sound fonts", only I would record them multiple times based on mood.
3. This is where your software fits in... The text can be written like a markup language.

Sample:

<speech actor="M_Father" mood="calm">How was school today son?</speech>
<speech actor="M_child" mood="sad">Shucks, I don't know dad.</speech>
<speech actor="M_Father" mood="curious">Well what's wrong?</speech>
<speech actor="M_child" mood="sad">Well, I kind of got sent to the principal's office.</speech>
<speech actor="M_Father" mood="angry">What? That's the third time this week</speech>

The intensity and variations in facial distortion, as well as the sound samples, would be determined by the mood attribute. The deformations etc. are directed to the selected model in the scene using the actor attribute. Edit timing by adding <wait frames="120"/>.

I could go on and on about group conversations and everything. Right now I'm just curious if you know of a voice sample tool. The other stuff is probably off the wall. You have a wonderful package. I guess I'm just excited, so that's why I'm coming up with all of this stuff.

Thanks again.
db
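
To make the markup idea in the email a bit more concrete, here is a rough Python sketch of how a script written that way could be read into an ordered list of cues. It is only an illustration, not anything PoserSpeak actually does: the `<script>` wrapper element, the `parse_script` function, the cue dictionaries and the "neutral" default mood are all assumptions made up for this example. Only the `speech` and `wait` tags and the `actor`, `mood` and `frames` attributes come from the email.

```python
import xml.etree.ElementTree as ET

# A shortened version of the sample from the email.
SCRIPT = """
<speech actor="M_Father" mood="calm">How was school today son?</speech>
<speech actor="M_child" mood="sad">Shucks, I don't know dad.</speech>
<wait frames="120"/>
<speech actor="M_Father" mood="angry">What? That's the third time this week</speech>
"""


def parse_script(text):
    """Return the speech and wait cues of a script, in order."""
    # The sample has several top-level elements, so it is wrapped in a
    # hypothetical <script> root to make it well-formed XML.
    root = ET.fromstring("<script>" + text + "</script>")
    cues = []
    for node in root:
        if node.tag == "speech":
            cues.append({
                "kind": "speech",
                "actor": node.get("actor"),
                # "neutral" is an assumed default, not from the email.
                "mood": node.get("mood", "neutral"),
                # Collapse line breaks and extra spaces inside the line.
                "text": " ".join((node.text or "").split()),
            })
        elif node.tag == "wait":
            cues.append({"kind": "wait", "frames": int(node.get("frames", "0"))})
    return cues


cues = parse_script(SCRIPT)
for cue in cues:
    print(cue)
```

Each cue keeps the actor and mood next to the text, so a later step could pick the target figure from `actor` and the sound samples and facial intensity from `mood`, with `wait` cues shifting the timing in frames.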