Audio Analysis: Automatic Ipo Curves

Hi,

Before I get too deep into crafting my own solution, I was wondering if there are scripts that perform audio analysis and then, based on the results of that analysis, generate an Ipo curve that corresponds to the audio track.

I’m not talking about lip sync, but something applicable to thunder, drumming, musical notes, wolf howl, pogo stick, rocking chair, etc. Here are some sites I have reviewed (plus Google searches), without luck:

http://wiki.blender.org/index.php/Scripts/Catalog
http://www-users.cs.umn.edu/~mein/blender/plugins/python/meshediting/
http://www.blender.org/download/python-scripts

The solution I am creating can be found here:

http://www.davidjarvis.ca/blender/script/WavFile.py

Note: It does nothing special at the moment. It reads a .wav file and determines the amplitude of each frame. However, when coupled with the Lightning Script, it will eventually do something useful.

http://www.davidjarvis.ca/blender/script/lightning.py
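For anyone curious what per-frame amplitude extraction might look like, here is a minimal stdlib-only sketch (this is not the actual WavFile.py; the function name, the 25 fps default, and the 16-bit mono assumption are all illustrative) that records the peak sample value for each video frame of a .wav file:

```python
# Hypothetical sketch: peak sample amplitude per video frame for
# 16-bit mono WAV data, using only the standard library.
import io
import math
import struct
import wave

def peak_per_frame(wav_file, fps=25):
    """Return the peak absolute sample value for each video frame."""
    w = wave.open(wav_file, 'rb')
    samples_per_frame = w.getframerate() // fps
    peaks = []
    while True:
        raw = w.readframes(samples_per_frame)
        if not raw:
            break
        count = len(raw) // 2                       # 2 bytes per 16-bit sample
        samples = struct.unpack('<%dh' % count, raw)
        peaks.append(max(abs(s) for s in samples))
    w.close()
    return peaks

# Demo: one second of a 440 Hz sine at half amplitude, 8000 Hz mono.
buf = io.BytesIO()
out = wave.open(buf, 'wb')
out.setnchannels(1)
out.setsampwidth(2)
out.setframerate(8000)
out.writeframes(b''.join(
    struct.pack('<h', int(16384 * math.sin(2 * math.pi * 440 * n / 8000.0)))
    for n in range(8000)))
out.close()
buf.seek(0)
peaks = peak_per_frame(buf, fps=25)
```

A stereo file would need its channels de-interleaved first; presumably the real script leans on the `wave` module for the header details in the same way.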

Thoughts?

Very original idea! Perhaps one day it could do lip syncs, though it almost seems like cheating. I’m interested to hear any progress you have.

It wouldn’t be cheating for lipsync because I know for a fact that results would be total crap. :stuck_out_tongue:

o rly? How do you know that for a fact? We have speech recognition software, controlling facial shapes isn’t really a far leap from that.

Heh, I take it you’ve never seen automated lip sync? At best it’s stiff. At worst it’s wooden, jerky and has NO CHARACTER WHATSOEVER. You can’t get any expression, personality, or actual mouth movement into it; the natural flow of phonemes is far too complex for a step-analyzer to break into a sequence of rigidly defined shape keys.

I’ve looked into lip sync recently. It appears fine for 2D animated characters, but 3D meshes seem to be a much more difficult problem. I found these:

http://www.meloware.com/blender/lipsync.htm
http://wiki.blender.org/index.php/Requests/lipSync
http://www.annosoft.com/phoneset.htm
http://www.3dlinks.com/links.cfm?categoryid=1&subcategoryid=71

However, as I mentioned, this is not something I am looking to accomplish. Not only because it is an insanely difficult problem (that is, mapping phonemes to visemes) but more because it has no applicability to my current project. :wink: Plus phonemes and non-speech sounds are worlds apart, so combining the two into a single interface might lead to a tangled mess.

Anyway, if anyone knows of some good FFT (Fast Fourier Transform) or DFT (Discrete Fourier Transform) examples in Python (yes I’ve Google’d … it seems an obscure topic), I’d be awfully glad to read them.
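For what it’s worth, once NumPy is installed, a basic DFT is only a few lines. A minimal example (the 440 Hz test tone is made up purely for illustration) that finds the dominant frequency in a signal:

```python
# Minimal numpy.fft example: find the dominant frequency in a signal.
import numpy as np

rate = 8000                              # samples per second
t = np.arange(rate) / float(rate)        # one second of time values
signal = np.sin(2 * np.pi * 440 * t)     # a 440 Hz tone

spectrum = np.fft.fft(signal)
freqs = np.fft.fftfreq(len(signal), d=1.0 / rate)

# The spectrum of a real signal is symmetric, so look only at the
# positive-frequency half.
half = len(signal) // 2
dominant = freqs[np.argmax(np.abs(spectrum[:half]))]
```

For a real audio track you would run this over short windows rather than the whole file, since the dominant frequency changes over time.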

lipSync for Blender is here, and it is coming soon ;):

http://wiki.blender.org/index.php/Requests/lipSync

Thangalin <- I tried to use your script (WavFile.py)… but it isn’t working; this error appears in the Blender console:

from numpy.fft import fft
ImportError: No module named numpy.fft

You should write a tutorial on how to use it :). Regards

Edit: Ok, I noticed that your script needs http://www.numpy.org/ in order to work.
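Since NumPy is only needed for the (future) frequency analysis, one way a script like this could degrade gracefully is to guard the import. A hypothetical sketch, not how WavFile.py actually behaves:

```python
# Hypothetical sketch: make NumPy optional so that amplitude-only
# analysis still works when the module is missing.
try:
    from numpy.fft import fft
    HAVE_NUMPY = True
except ImportError:
    fft = None
    HAVE_NUMPY = False

if not HAVE_NUMPY:
    # Amplitude analysis needs no FFT, so carry on without it.
    print("NumPy not found; frequency analysis disabled.")
```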

Hi,

Here’s the first draft of the UI for audio analysis:

http://davidjarvis.ca/blender/images/audio-analysis-01.png

At the moment, I am only considering power levels. This would allow you to map the beat of a drum to an Ipo curve. You could also eke out a wolf howl, gun shot, explosion, thunder rumble, and so forth from an audio file.

The options are as follows:

  • Filename: The name of the audio file to analyse.
  • Curve Name: The name of the Ipo curve to create.
  • Curve Type: The type of Ipo curve to create (default: Object).
  • Scale: Power threshold factor. The processing involves power values that range from 0 to 32768. The Scale parameter allows the volume value to be reduced by this factor. (Otherwise the Ipo data points would be too high to be useful…)
  • Interval: Practically, this is frames per second. Technically, it is the number of frames that must pass between one sample that matches the power range and the next.
  • Attack: Shifts the time back by a number of seconds. Lightning strikes before its thunder rumbles (because light is faster than sound). This allows the Ipo data points to be moved back a certain amount to account for action and sound being out of phase.
  • Decay: How long it takes, in seconds, for the Ipo curve to head back to zero.
  • Oscillate: Exponential damping oscillation. When a school bus drives over train tracks, the springs in the shocks go: BOUNCE - Bounce - bounce - … Similarly, when lightning flashes, the first bolt is often followed by a few more little bolts. This controls the number of subsequent peaks that occur during the decay.
  • Randomise: These three buttons randomly affect the duration of their corresponding sliders.
  • Power Min: The minimum volume value to record as an Ipo data point. For finding loud noises, usually this is set high and the Power Max left alone.
  • Power Max: The maximum volume value to record as an Ipo data point.
  • Analyse: Click this button to generate the Ipo curve based on the values present in the audio file.
  • Exit: Closes the script.

The code does not do frequency analysis, which involves FFTs (Fast Fourier Transforms), and I may not get around to it. Yet I will GPL the code when finished.
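As a rough illustration of how the Decay and Oscillate sliders described above might shape the curve, here is a hypothetical sketch (the function name, the damping constant, and the damped-cosine formula are all made up for illustration) that eases a peak back toward zero, emitting (frame, value) pairs suitable for Ipo data points:

```python
# Hypothetical sketch of Decay/Oscillate: after a peak, emit Ipo points
# that fall back to zero along an exponentially damped cosine.
import math

def decay_points(peak_frame, peak_value, decay_seconds, oscillations, fps=25):
    """Return (frame, value) pairs easing a peak back to zero."""
    points = []
    total = int(decay_seconds * fps)
    for i in range(total + 1):
        t = i / float(total)                  # 0 .. 1 through the decay
        envelope = math.exp(-5.0 * t)         # exponential fall-off
        wobble = math.cos(2 * math.pi * oscillations * t)
        points.append((peak_frame + i, peak_value * envelope * wobble))
    return points

# A thunder peak at frame 100, decaying over 2 seconds with 3 after-bounces.
pts = decay_points(peak_frame=100, peak_value=1.0, decay_seconds=2.0,
                   oscillations=3, fps=25)
```

Setting `oscillations=0` would give a plain decay, matching the school-bus/lightning behaviour when the Oscillate slider is at zero.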

Comments?

Thangalin <- Woooooow !!! You read in my mind ;). I’m thinking exactly on UI for this script… hehehe

Btw. Finally somebody is making Automatic Ipo Curves based on a WAV file instead of a MIDI file (which was useless, or very complicated, for people who aren’t musicians). BIG THX Thangalin !!!

Houdini (sidefx.com) has very cool tools built in to do this kind of thing in their “CHOPS” module.

You can download a free apprentice version that has this capability (along with about 95% of the capability of the entire package) from their site. The program is very powerful … and has a pretty steep learning curve. I’m just a novice at it, but if you’re interested I can get you going with the basics.

There is a very cool demo file that reads in a (musical) wav file, analyses the beats, converts the analysed beats to channel keys, then re-maps those channels to a character that is animated to dance to the music!

Houdini can write its channel info to ASCII text, so you could (theoretically) use it for a converter.

Here are the files for it: http://www.stickylight.com/public/houdini-hacker/FrankFirsching/

Mike

Hi, Jed.

I will be writing instructions on how to use AudioAnalysis … :slight_smile:

  • Download and install NumPy:
    a. Go to http://sourceforge.net/projects/numpy
    b. Select “Download Numerical Python”
    c. Click “Download” beside NumPy 1.0.1
    d. Click “numpy-1.0.1.tar.gz”
    e. As root:
       A. Copy the file to /usr/local/src
       B. cd /usr/local/src
       C. tar zxf numpy-1.0.1.tar.gz
       D. cd numpy*
       E. python setup.py install

Yadda, yadda, yadda.

WavFile.py is a temporary measure, just to see how to go about analysing a .wav file.

Typical programmer :yes:

Code first.

Docs … (maybe) … last :smiley:

Mike

Hi, Mike.

Yes, it certainly does look powerful! However, I need something that is quick and simple. It’s really academic at this point to create this tool, as I’ve already accomplished what I wanted to do, but manually. That is, create lightning flashes that dance to the thunderbooms throughout the soundtrack of a 2-minute short I’ve created.

That said, I want to adjust the timing of the thunder so it overlaps better with the music. And I don’t really want to adjust things manually again. So, since I’ve got all three components nearly finished (UI, analysis, and an Ipo curve generator), it should take just shy of several hours to complete. And I’ll be learning the Blender Python API as well. :slight_smile:

So for now, for me, I don’t think Houdini will do the trick. <grin>

Heh heh.

Mike! Play nice. :wink: Just so you know, I’d already discussed the whole kit and caboodle with another fellow in several e-mails. The source code is fully commented (as it stands). And the user manual was written in this thread, before the tool was finished. :wink:

As for docs coming last … well, probably. But when they do, they’ll be on par with the tutorials I’ve written. :slight_smile: I also hope to get a lot of feedback from users. Besides the obvious, “My gracious, this is one slow analysing tool.”

LOL, just yanking your chain. … this actually looks pretty cool, and I’m dl’ding the numpy stuff now.

Where is “tar zxf” on the Windows Start menu ? :evilgrin::evilgrin::evilgrin:

Mike

@Mike: tar zxf my foot! How did you manage to copy the file into /usr/local/src using Windows Explorer? I was sure I was doing my best to make this an anti-Windows piece of code. Which means I’ll have to toss in some exec statement of a Unix program that exists in /usr/bin, just to make sure the Python code fails on Windows. :wink:

I likely will not write installation instructions for Windows, as I do not use it, nor have a copy on my main computer. Hopefully someone else can help me out with that part <wink, wink, nudge, nudge>.

Also, I’m hoping that numpy won’t be required – at least until the frequency detection is written. It’ll make installation a lot simpler. It is my understanding that Blender 2.43 would have to be recompiled to make use of the numpy library?

Hey, Jed.

Before you get too excited, understand that this will not have frequency detection within it. At least not initially. I’m only concerned with power amplitude – that is, volume. And even then, it will take some twiddling of the parameters to get the Ipo curve you seek. I’m talking with someone involved in a Blender MIDI project and, with any luck, we should be able to integrate his analysis UI to give this tool a better front-end.

No promises. :slight_smile:

Hi,

Good news. I have integrated the major components of the audio analysis tool. It now generates Ipo curves from the volume of a .wav file.

More soon!

That’s looking nice! Keep up the good work.