Cutting-edge speech tech from IBM may allow Pixar to fire all the voice actors

http://www.technovelgy.com/ct/Science-Fiction-News.asp?NewsNum=2125

You read that right: soon computer speech may sound no different from the speech of an actual person. Pixar and the other CGI studios will be able to save money if they use this and no longer need voice actors. In a more realistic scenario, it could be a boon to independent CGI short hobbyists who can’t afford real actors.

Follow the link and read it for yourself.

I think the software would cost more than getting a few mates to do voices.

That said, this is damn cool. Will be pirating.

I believe that this is the relevant article, not the one you have posted:

It does sound interesting, which is the point, I guess.

They don’t just use voice actors for their voices.
They also use their star status to jack up ticket sales. Do you think sales of the next Pixar movie would go up if it said “with the synthetic voices of IBM’s voice software” instead of “Tom Hanks”?

I don’t think that ANY software can truly replicate the nuances of the human voice, and there will always be something lacking, much like with synthesizer instruments… (Of course, this is coming from a part-time voice actor :D) And yes, I’d rather listen to Tom Hanks than a computer.

Random thought: Some of the most famous computers in film and television (HAL9000 from 2001: A Space Odyssey, WOPR/“Joshua” from WarGames, and KITT from Knight Rider) were voiced by human actors (Douglas Rain as HAL9000, James Ackerman as “Joshua” and William Daniels as KITT). :smiley:

Oh, please, we all know IBM only develops stuff for their own fun these days, unless someone like Apple/Sony/Toshiba etc. actually goes to them for some type of business alliance. :frowning:

jk

Though I sure would love to see IBM back in the x86 market. :slight_smile:

ehh… I can dream, can’t I? :frowning:

I don’t think so. From the article, it sounds like it’s geared towards call centres and SatNav.

If it were used for acting, its emotional range would be limited to what the programmers could put into it. While I’m sure the programmers are stable, emotionally whole, self-actualised people (ooh, my irony gland is bulging) who are well in touch with their emotions, it may be another thing to have all that converted to code. After all, voice actors don’t just read their lines, they ACT out the character’s fears, hopes and intentions. Still, it’s an interesting development.

I just got back from LA, where I sat in a recording booth for two weeks. In one stint, I recorded the words “Captain Knowledge” about twenty times, and each take was different, with different intonations, to indicate whether it was happy, sad, scary, at the beginning of the sentence, at the end, or in the middle, or was at the end of a question, or … So a convincing voice, I think, falls into Turing test territory.

CD, I don’t think it will be either free or cheap, so it won’t help the indies. Besides, it is so much more fun to work with friends, don’t you think?

Agreed 100%… another thought I had was this: if they had used such a technology (had it existed at the time) in 2001, would it have been as effective? Think about the ominous, almost maniacal edge to “Hal’s” voice in his last exchange with Dave. Would a computer have been able to reproduce that?

Another fun fact: In WarGames, the director had James Ackerman (“Joshua”) record his lines backward, and then played them forward on the audio track to get that stilted, mechanical effect. Sometimes, I think that no software, however sophisticated, is any substitute for good old human ingenuity. :smiley:

You read that right: soon computer speech may sound no different from the speech of an actual person. Pixar and the other CGI studios will be able to save money if they use this and no longer need voice actors.

Bollocks. The technology may very well be marvelous, but to surmise that it could replace real voice actors for film is more than just a stretch; it’s a joke.

I’ll concede the possibility that this might happen when cameras are replaced by unbiased renderers.

It’s rumored that Pixar will also soon be switching to a computerized story generator and an automated modeling technique. :-/ Though the visuals are generated digitally, there is a strong human element in Pixar’s films, which is what makes them so good. You can’t begin systematically eliminating the human elements of the art. Voice actors (good ones) are very talented people who have a skill which a computer couldn’t possibly replicate any more than a computer could have designed Wall-E’s emotive facial features. This technology will be great for making computers more personable (especially if speech recognition and language parsing rise to the challenge) but it is by no means a replacement for actors.

Could be good for satnavs and telemarketing voices.

So let me ask the dumb question of the day… “Where can I hear this voice?”

Cause… the article, well… who cares… I want to hear it! hehe

Interesting, looking forward to an open source implementation :slight_smile:

It incorporates speech stalls such as “ahh”, “err” and “um”, everything a speech class tries to eliminate, and that’s the big thing? It’s not going to work; people won’t accept it. People would rather know they are talking to a computer than have a computer try to trick them. Google MS Agents; this was tried years ago. Seriously, if someone on the phone tells you, “err, ummm, ahhh, press one for yes, and ummm, press 2 for no,” you are going to press 0, since it is usually the live operator option, and ask what you press to get your money and time back.

I don’t know if this is using the “new / latest greatest” tech, but it sounds pretty good to me:

http://www.research.ibm.com/tts/coredemo.shtml

When I only entered “hello there” it didn’t sound great, but if you enter a 200-word phrase it sounds fairly realistic.

With the advances in musical instrument synthesizers and “software synths” like Reason http://www.propellerheads.se/products/reason/, I’m a bit surprised that voice synthesis still has not been “perfected”.

Mike

This is a cool development and, if the software isn’t too incredibly expensive, probably not a bad investment for the small animated movie crowd. I think, though, that if the computer voices get too real, and too customizable, then some really bad things could start to happen. Like when there is evidence of a murderer’s guilt on audiotape. Software like this, and widespread use of it, could really put a lot of things like that into doubt. And you know that the guy has to go free if there is a reasonable doubt.

Lol, not bad but far from perfect.

http://webtts.watson.ibm.com/cgi-bin/ttsclient30?text=Hello+my+name+is+Sally+and+I+am+a+25+foot+tall+ogre.%0D%0A%0D%0AOn+sundays+i+like+to+make+pancakes%2C+my+favorite+flavor+is+child+intestines.&voice=HOK&nsfile=ttsclient30.wav

The demo sounds surprisingly good. I couldn’t help but put in, “Good afternoon, gentlemen. I am a HAL 9000 computer. I became operational at the H.A.L. plant in Urbana, Illinois on the 12th of January 1992.” and then alter it in Audacity so that it slows down over time. This guy sounds too friendly to be HAL.

Yeah, that’s what I meant… despite the calmness and evenness of HAL’s tone, he sounded like a barely controlled lunatic.