Thursday, March 22, 2012

Voice Synthesis


[for those just skimming through Digital Diner today, be sure you watch the Miku Video at the end even if you skip the stuff in the middle]
I've always been interested in music and sound.  The voice is one of the most intriguing forms of audio to us human beings.  Of course, once you tie that in with a little technology, it starts to get really interesting.  The video above was made 73 years ago and shows the state of the art of voice synthesis at the time.  Not bad at all considering the analog technology that was available.  At least you can understand what the voice is saying.
Now we have Siri on our iPhones and synthetic voices in our GPS units that talk to us on a daily basis.  While those voices are getting better, they still sound mechanical and uninspired.  Laurie Anderson (who is now married to Lou Reed - who knew?) has said that she considers conversations to be a type of performance art and that speech is really a type of music.  Truly human-sounding speech is performed, not generated.  This week Yamaha showed a special keyboard that lets you create a vocal performance live.  The video below is eerily like the older version above, but of course the audio quality is much better.



The technology that Yamaha is using is called Vocaloid, and it was specifically created to synthesize human-sounding singing voices.  This technology has already been used in some amazing ways.

Miku
The video below shows Miku - a favorite example of technology here at Digital Diner.  Miku doesn't really exist.  She is completely synthesized.  She is a hologram who happens to perform on stage in front of people, concert style.  (I was going to say that she performs "live" in front of people, but I suppose that isn't really appropriate.)  In fact, even her voice is synthesized using Vocaloid.  So, here we have a singing star based on a synthetic voice.  I think we are at least on par with HAL from the movie 2001 singing Daisy as it was being unplugged (which, BTW, was a tip of the hat to the IBM computer that played Daisy in 1961).
Here on Digital Diner, we've talked before about whether or not you should believe what you see, but I think that Miku takes this to new heights.  You know she isn't real, and in fact that is the point of going to see "her".  Catchy tune aside, it seems like quite a social experience.  Just look at that crowd wildly waving their glow sticks in time with music from a hologram.  And would you even want a backstage pass to this performance?  I'm not sure.  I think the social implications of these performances will be the basis for many a psychology PhD to come.