I have been trying to get into Blender to make short animations. Using different tutorials and material I have gotten the hang of lots of stuff: NLA, shape keys, drivers, rendering and so on.

But the complete workflows are still a bit unclear. For example, what is the common workflow for adding audio to an animation? Is it done in the VSE or in an external tool?

If it is done in VSE, how can one then export the audio as a single file? I suppose I have to render the animation as individual frames, merge them and then merge the audio.

If it is done in an external tool, I suppose I first make some kind of pre-render that I take into an external video editor, where I add the sound assets. But how can the animation then be efficiently edited/synced with the audio/speech?

I’m sure this is something very simple, but I just haven’t been able to find anything about it.

You can drag and drop a .wav file into the VSE to add an audio track. If you render to a format that supports audio, the sound will show up in that file, regardless of whether you’re using the VSE for your visual rendering or Cycles or Eevee or whatever. Obviously, an image sequence is not a format that supports audio, but in the VSE an image sequence can be combined with sound and rendered to a format that does include audio. An .avi or an .mpg does support audio.

I wouldn’t use Blender for audio editing, not sure what extent of audio editing it’s capable of, so I’d just be dragging .wav files into the VSE. I could just as easily be dragging those audio files into some other sequencer if I wanted; there are any number of programs, including free programs, to replace the audio in a movie file.
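As one example of doing that outside Blender, here is a rough sketch that builds an ffmpeg command to join a rendered image sequence with a sound file. It assumes ffmpeg is installed; the file names (frame_0001.png and so on, speech.wav, final.mp4), the 24 fps rate and the helper name build_mux_command are all made up for illustration, so adjust them to your own render.

```python
import shlex


def build_mux_command(frames_pattern, audio_path, output_path, fps=24):
    """Return an ffmpeg argument list that joins an image sequence and a sound file."""
    return [
        "ffmpeg",
        "-framerate", str(fps),   # frame rate the image sequence was rendered at
        "-i", frames_pattern,     # e.g. frame_%04d.png for frame_0001.png, frame_0002.png, ...
        "-i", audio_path,         # the same sound file that was used for syncing
        "-c:v", "libx264",        # encode the video as H.264
        "-pix_fmt", "yuv420p",    # pixel format most players can handle
        "-c:a", "aac",            # encode the audio as AAC
        "-shortest",              # stop at the end of the shorter input
        output_path,
    ]


# Hypothetical file names, only for illustration.
cmd = build_mux_command("frame_%04d.png", "speech.wav", "final.mp4", fps=24)
print(shlex.join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

The frame rate passed here has to match the one the animation was rendered at, otherwise the picture drifts away from the sound.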

For syncing animation to sound, it can be important to change the sync mode to sync to audio or frame dropping (timeline viewport, “playback” dropdown), but there can be significant performance issues: Blender often can’t draw the animation fast enough to keep up with the audio. It’s often useful to create markers on the timeline to correspond with sound events. If trying to sync to a beat, it can be useful to temporarily create some visual indicators of that beat (for example by baking the sound to an F-curve).
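If you already know the times of the sound events (say from an audio editor), the markers can be placed by script instead of by hand. This is only a sketch under some assumptions: the event names and times are invented, the sound strip is assumed to start on frame 1, the scene is assumed to run at 24 fps, and the two helper functions are my own. The Blender part only runs when the script is executed inside Blender.

```python
def seconds_to_frame(seconds, fps=24, start_frame=1):
    """Convert a time in the sound file to a timeline frame number."""
    return start_frame + round(seconds * fps)


def marker_frames(events, fps=24, start_frame=1):
    """Map each event name to the frame where its marker should go."""
    return {name: seconds_to_frame(t, fps, start_frame) for name, t in events.items()}


# Hypothetical sound events, in seconds from the start of the audio.
events = {"speech_start": 0.5, "beat_1": 1.0, "beat_2": 1.5}
frames = marker_frames(events, fps=24)
print(frames)

try:
    import bpy  # only importable when running inside Blender
except ImportError:
    bpy = None

if bpy is not None:
    scene = bpy.context.scene
    for name, frame in frames.items():
        # Adds a named marker to the timeline at the computed frame.
        scene.timeline_markers.new(name, frame=frame)
```

If the sound strip does not start on frame 1, pass its start frame as start_frame so the markers line up with what you hear.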

For syncing sound to animation, Blender’s VSE allows one to split up video files and shift video files, and I imagine it would allow the same with audio files, although I’ve never done that. I imagine there are plenty of other sequencing applications that can do the same.

I can use the VSE to add audio and I think F-curve baking will also be very powerful, but I still didn’t quite get what the common / efficient workflow for adding audio to animations might be.

So let’s say I have a character, a speech line and a short music clip. The character would say the line and spin along to the music clip. The lips need to sync with the speech and the spin with the music.

Should I:
A. add the speech and music clip in VSE and sync the animation with it right in Blender.
B. pre-render the animation, import it into an external video editor, add audio. If the audio and animation are not in sync, I go back to Blender, make changes, then pre-render again and import into the video editor. Rinse and repeat until animation and audio are in sync.

A sounds logical, but as animations should always be rendered as an image sequence and there is no way to render just the audio, it seems this is not the common way to do it.

B sounds more powerful editing-wise, but quite unintuitive, as I need to keep making pre-renders and keep re-importing them into the external editor.

What would be your workflow for such a task?

If you need to sync animation to sound, you need to load the sound into Blender and animate with the sound playing, which is your (A). You don’t want to go back and forth.

There are no "should"s in 3D-- or rather, all "should"s are in the form of “if you want x you should do y; if you don’t care about x then y doesn’t matter.” Rendering as an image sequence is nice for some reasons, rendering as a video is nice for some reasons. If you want to check the rendered sync, then render directly as a video.

If, at some point later, you want to render as an image sequence, you can do that, then load the image sequence into the VSE of the same file and render a video from the sequencer instead of from the 3D scene. But you already have a .wav that you loaded into Blender in order to sync the animation, so you can just as easily load both that .wav and your new image sequence into a fresh file, or into a different sequencer.

It doesn’t sound clear to me whether you want to work from audio->animation or animation->audio. If you want to work audio->animation, you don’t need to render sound in Blender, because you already have a sound file: that is your starting point.

Well, basically I want to make an animation and I have audio files I want to use in the animation. So I suppose it is audio → animation then.

Anyway, I guess the proper way is to add the audio to Blender and sync it there. Then I export the audio in a video render, later make the final render as an image sequence, and then combine both in a video editor for the final product.

You can render the audio straight from Blender, from the Render menu (“Render Audio”), or, as I usually do, make a viewport render to check the animation. That preview can then be put into the editing suite so I can make sure the sound syncs up with the final render image sequence. To do a viewport render, set your render output to video (of whichever format you prefer) and make sure you enable the audio under Encoding. Also choose an output location, so you can find the video later. Then open the “View” menu in the 3D viewport header and select “Viewport Render Animation”. If you’re happy with the animation, do your proper render (images if you like) and then just overlay the images in the editing suite.
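The same steps can be scripted. The sketch below is an assumption-heavy example, not a recipe: the property names are the ones I know from the 2.8 to 4.x Python API and may differ in other versions, the output paths and the helper name configure_preview_video are made up, and the viewport render operator may need to be run from a context that has a 3D viewport. The Blender calls only run when the script is executed inside Blender.

```python
def configure_preview_video(scene, output_path):
    """Set the scene's render output to an MPEG-4 video with AAC audio."""
    render = scene.render
    render.filepath = output_path                  # where the preview video is written
    render.image_settings.file_format = "FFMPEG"   # video output instead of images
    render.ffmpeg.format = "MPEG4"                 # container
    render.ffmpeg.codec = "H264"                   # video codec
    render.ffmpeg.audio_codec = "AAC"              # enables audio in the encoding settings
    return render


try:
    import bpy  # only importable when running inside Blender
except ImportError:
    bpy = None

if bpy is not None:
    # Hypothetical output locations next to the .blend file.
    configure_preview_video(bpy.context.scene, "//preview.mp4")
    # Same as View > Viewport Render Animation.
    bpy.ops.render.opengl(animation=True)
    # Same as Render > Render Audio: writes just the mixed sound to a file.
    bpy.ops.sound.mixdown(
        filepath=bpy.path.abspath("//audio.wav"),
        container="WAV",
        codec="PCM",
    )
```

Remember to switch the output back to an image format before the final render if you want an image sequence.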

This is what a Viewport render is like. I do it without shaders, hair, subdivisions, just to speed up the render process. This is an outtake from an upcoming project, btw. Not a final project. Sometimes, these 3D models get carried away on set.