Denoising images using motion vector controlled temporal filtering

Hi,

Over the weekend I was playing with the notion of using motion-vector-controlled temporal filtering to get rid of high-frequency noise in image sequences. I’m happy to say the experiment was a success, so I’d like to share my findings with the community.

The idea is basically this: instead of waiting a long time for a clean render of an image sequence, we render the sequence in much less time (2 to 4 times less for a path tracer like Cycles) and then use temporal filtering controlled by motion vectors (which are free in Blender, as well as in a lot of other renderers) to reduce the noise to a (hopefully much) lower level. For a given frame, the denoising transforms the previous frame and the next frame into the current frame using their motion vectors and then averages the three frames together (practically a temporal box blur). This process can be repeated several times on the already-filtered result to get wider, softer filtering.
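For readers who prefer pseudocode, here is a minimal numpy sketch of one filter pass. All names are mine (nothing here comes from the actual Blender patch), and the warp is a crude nearest-neighbour scatter rather than the quad rasterization the real node does:

```python
import numpy as np

def warp(frame, motion, direction=1.0):
    # Push every pixel along its motion vector (nearest-neighbour
    # scatter). The real node rasterizes the displaced pixels as quads
    # and handles occlusion; this is only a schematic stand-in.
    h, w = frame.shape[:2]
    out = frame.copy()  # crude hole filling: keep the source pixel
    ys, xs = np.mgrid[0:h, 0:w]
    tx = np.clip(np.round(xs + direction * motion[..., 0]).astype(int), 0, w - 1)
    ty = np.clip(np.round(ys + direction * motion[..., 1]).astype(int), 0, h - 1)
    out[ty, tx] = frame[ys, xs]
    return out

def temporal_box_filter(prev, cur, nxt, mv_prev, mv_next):
    # Transform the neighbouring frames into the current one, then
    # average all three (the temporal box blur described above).
    prev_warped = warp(prev, mv_prev, direction=+1.0)  # push forward
    next_warped = warp(nxt, mv_next, direction=-1.0)   # pull backward
    return (prev_warped + cur + next_warped) / 3.0
```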

Here are some example images (the render is a single frame from an animated version of the default Cycles scene). The first is a Cycles render using 10 passes, the second a Cycles render using 30 passes, and the third a denoised version of the 10-pass render:

http://dl.dropbox.com/u/2929263/cycles_render_pass_10.png

http://dl.dropbox.com/u/2929263/cycles_render_pass_30.png

http://dl.dropbox.com/u/2929263/cycles_denoise_dual_pass.png

Besides denoising, this approach could also be used for firefly removal by applying a temporal median filter instead of a temporal box filter to the three images (unfortunately this is not currently supported by Blender).
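For reference, the temporal median is just the per-pixel median of the same three aligned frames; a hypothetical numpy sketch, reusing the warped frames from above:

```python
import numpy as np

def temporal_median_filter(prev_warped, cur, next_warped):
    # Per-pixel median of the three aligned frames: a firefly that
    # shows up in only one frame is rejected outright instead of
    # being smeared into the average.
    return np.median(np.stack([prev_warped, cur, next_warped]), axis=0)
```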

I borrowed the idea behind this technique from The Foundry’s Furnace DeNoiser plugin, which denoises film or video sequences by running an optical flow analysis on them and then using the resulting motion vectors to drive a temporal filter.

Since this is a post-processing technique, it’s done entirely in the Blender compositor, but it needs the 3D scene in order to get the motion vectors from the BI renderer. I hacked the Vector Blur node so that instead of doing a vector-based blur it does a vector-based image transform. The biggest problem (not counting my lack of familiarity with Blender’s code and my limited C/C++ knowledge) :slight_smile: was dealing with occlusion artefacts, for which I didn’t use the most elegant approach, but as far as I can tell it works: any transformed pixels with sides longer than a certain number of pixels in the resulting image are discarded.
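To illustrate the occlusion test just described (the name and threshold are illustrative, not taken from the patch):

```python
import numpy as np

def quad_is_stretched(corners, max_side=3.0):
    # corners: 4x2 array with a transformed pixel's corner positions
    # in winding order. If any side of the quad got stretched beyond
    # max_side pixels, the pixel most likely covers a disocclusion
    # and is discarded.
    sides = corners - np.roll(corners, 1, axis=0)
    return bool(np.any(np.linalg.norm(sides, axis=1) > max_side))
```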

I haven’t done a lot of testing and there are noticeable artefacts in the example image (e.g. the edges of the plane in the lower half of the image), but these are implementation issues which I’m sure can be sorted out.

Since the whole technique is practically a hack, there are quite a few limitations, the biggest being that when a 3D object’s surface changes appearance drastically between adjacent frames (e.g. animated textures with big changes from one frame to the next), the technique produces visibly incorrect results. Some of the limitations can be overcome simply by adding motion vector blur to the denoised images, or by rendering the image in passes and applying the denoising only to the indirect diffuse and glossy reflection/refraction passes, which most of the time are the noisiest and at the same time carry the least high-frequency content; a rough sketch of this follows below.
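A schematic of the per-pass variant (the pass names are hypothetical, and a real recombination depends on how the render passes are set up):

```python
def denoise_noisy_passes(passes, denoise):
    # Filter only the noisy low-frequency passes and add the sharp
    # direct passes back untouched. Pass names are hypothetical, and
    # a real recombination also involves the colour/albedo passes.
    return (passes["direct"]
            + denoise(passes["indirect_diffuse"])
            + denoise(passes["glossy"]))
```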

Unfortunately, I’m pretty sure I won’t have the time to continue implementing this approach in Blender, so if anyone is interested in continuing along this path, feel free to send me a PM so I can pass you the torch :slight_smile:

Thanks and sorry for the long post,
Goran

Hi everybody,

I finally had some time off and managed to do some work on the temporal denoiser. The good news is that I was able to solve the small issues that were previously present in the denoising workflow. The main culprit was not in the code, but in the fact that Blender’s motion vector engine doesn’t like working with large polygons. All I had to do was subdivide the mesh a couple of times and the issues were gone.
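In case anyone wants to script that workaround, something along these lines should do it (number_cuts=2 is just an example value):

```python
import bpy

# Subdivide the active mesh so the motion vector pass doesn't have
# to deal with large polygons.
bpy.ops.object.mode_set(mode='EDIT')
bpy.ops.mesh.select_all(action='SELECT')
bpy.ops.mesh.subdivide(number_cuts=2)
bpy.ops.object.mode_set(mode='OBJECT')
```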

The other good news is that I cleaned up the code a bit and made some example blends, so the denoiser is now ready for public testing :slight_smile: I uploaded a tar.gz archive containing: a Blender build (compiled under Ubuntu 10.04 x64) with the Vector Blur node hacked to do a vector transform instead of a blur; the changed zbuf.c file; the original zbuf.c file and a diff made from both; the example blends showcasing the denoising workflow; and finally the renders from the different stages of the denoising process.

http://dl.dropbox.com/u/2929263/blender_vec_transform.tar.gz

I was able to make a median filter using Blender’s compositing nodes, so the final stage of the temporal denoiser can now be done either with a box filter or with a median filter.
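For the curious, a median of three can be built from min/max operations alone, which is one way to wire it up with Mix nodes (I can only assume whether this matches the node tree in the example blends):

```python
import numpy as np

def median_of_three(a, b, c):
    # Median of three images built from min/max operations only; in
    # the compositor the same construction maps onto Mix nodes set to
    # Darken (min) and Lighten (max).
    return np.maximum(np.minimum(a, b),
                      np.minimum(np.maximum(a, b), c))
```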

A couple of notes regarding the hacked Vector Blur node: the Blur parameter controls the direction of the vector transform; the Min parameter controls the size of the largest polygon produced during the vector transform process (used to clean up occluded areas); and finally, never drop the Samples value below 32, because the resulting image becomes transparent.

Here’s a frame from the example blends: rendered directly from Cycles, denoised with the box filter, and denoised with the median filter.

http://dl.dropbox.com/u/2929263/1_cycles_render.0003.png

http://dl.dropbox.com/u/2929263/4_filtering_box.0003.png

http://dl.dropbox.com/u/2929263/4_filtering_median.0003.png

Hope you have fun with the denoiser!

Thanks,
Goran

In the second post the links are broken. This sounds interesting, quite technical, but I think I get the idea behind it, and I just posted about this in the 2.6 dev thread ^^

It would be great if Cycles got a de-grain node in the comp nodes to smooth out noise in compositing.

What about temporal Gaussian filtering?
Essentially doing 3D (2+1D) AA.

This might have potential :slight_smile: - especially when the noise pattern is random from frame to frame.

I think what you are suggesting is in fact temporal supersampling, i.e. true 3D motion blur. BI already does this, although using a very dated approach. AFAIK, Cycles is going to implement a much more modern solution sometime in the future (hopefully sooner rather than later) :slight_smile: I suppose the main problem is implementing deformation motion blur; transformation motion blur, both for cameras and for objects, should be pretty straightforward.

Yes, the sampling pattern definitely should not be fixed between frames, although IMO a random pattern would not be the best approach.

Goran

If I misunderstood you in my previous post (quite likely) :slight_smile: and you were in fact suggesting using a Gaussian kernel instead of a box kernel to filter the original frame and the transformed frames in order to get a softer result, then there is an easy way to do this. To produce a result similar to the box filter’s, the Gaussian filter must be wider (i.e. if it’s a spatial filter, encompass more pixels; if it’s temporal, like in our case, encompass more frames). That means we would have to transform a given frame not only into its previous and following frames, but also into at least four frames beyond that (two forward and two backward). Since AFAIK this is not possible in Blender, what we can do instead is run multiple passes of the temporal box filter: in the first pass we denoise the original rendered sequence, in the second pass we denoise the result of the first pass, and so on. Applying a box filter multiple times in succession produces a result very similar to a Gaussian filter. But keep in mind that you’ll have to extend the original render at the beginning and the end (two frames, one at the beginning, one at the end, for every pass added); see the sketch below.
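In pseudocode, the multi-pass workflow looks roughly like this (reusing the hypothetical temporal_box_filter sketch from the first post; the motion vector bookkeeping is simplified):

```python
def multi_pass_denoise(frames, mv, passes=2):
    # Repeatedly apply the temporal box filter. Every pass consumes
    # one frame from each end, so the render must be padded with
    # `passes` extra frames at the start and at the end. mv[i] holds
    # the motion vectors of original frame i (a simplification;
    # forward and backward vectors would normally be kept separately).
    seq, offset = list(frames), 0
    for _ in range(passes):
        seq = [temporal_box_filter(seq[i - 1], seq[i], seq[i + 1],
                                   mv[offset + i - 1], mv[offset + i + 1])
               for i in range(1, len(seq) - 1)]
        offset += 1
    return seq
```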

Edit: The example filtered image in my original post is in fact made with a dual-pass approach. You can compare it to the box filter example in the second post to see the difference in noise reduction (ignore the difference along Suzanne’s edges, because I made some changes to the code between the two posts which should have reduced the aliasing).

Goran

Yup, that’s exactly what I meant: a Gaussian kernel.

Also, if it weren’t so complex, I’d guess a somewhat ideal kernel would be the following multifunction:

$$\frac{4\,a\,s\; J_1(\pi|x|)\, J_1\!\left(\frac{\pi|x|}{a}\right) \sin(\pi t)\, \sin\!\left(\frac{\pi t}{s}\right)}{\pi^4\, t^2\, |x|^2}$$

|x| = distance from a filter pixel to the current pixel
t = temporal distance (e.g. frames)
a = spatial filter radius > 0
s = temporal filter radius > 0

This corresponds to a spatiotemporal pixel in Lanczos style.

J_1(x) is the Bessel function of the first kind of order 1, and J_1(sqrt(x^2+y^2))/sqrt(x^2+y^2) is to circles what sinc(sqrt(x^2+y^2)) would be to squares.
So the above filter basically takes a spatiotemporal cylinder and applies a 3D Fourier transform to it.

The two main problems with this are that BesselJ is not algebraic and that, at the integers, J_1(x)/x does not give 0, which essentially means that you don’t want integers for a but rather some odd irrationals. In the case of what would be Lanczos3, a should be ~3.238315484166236.
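If anyone wants to play with it, here’s my reading of the kernel in numpy/scipy (s=3.0 is just an arbitrary example value):

```python
import numpy as np
from scipy.special import j1  # Bessel function of the first kind, order 1

def spatiotemporal_kernel(x, t, a=3.238315484166236, s=3.0):
    # x: spatial distance to the current pixel, t: distance in frames.
    eps = 1e-12  # sidestep the removable singularities at x=0, t=0
    x = np.maximum(np.abs(np.asarray(x, float)), eps)
    t = np.maximum(np.abs(np.asarray(t, float)), eps)  # kernel is even in t
    return (4 * a * s * j1(np.pi * x) * j1(np.pi * x / a)
            * np.sin(np.pi * t) * np.sin(np.pi * t / s)) / (np.pi**4 * t**2 * x**2)

# a ~ 3.2383 is the third zero of J_1 divided by pi, so the spatial
# term j1(np.pi * a) vanishes exactly at the filter's edge.
```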

Either way, that’s probably way more complex than necessary…

Filters with negative lobes (Mitchell-Netravali, Catmull-Rom, sinc, the various windowed-sinc variants (Lanczos, etc.)) work great as reconstruction filters because they perceptually keep as much of the high-frequency content as possible while not introducing aliasing. Using them for denoising, on the other hand, would be problematic, because the whole point of denoising is to eliminate the high-frequency noise: the ringing introduced by the negative lobes would sharpen the image and accentuate the noise. A Gaussian filter works best for denoising for the simple reason that it’s the softest and has no negative lobes.
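A quick numerical illustration of the negative-lobe point (Lanczos-2 versus a Gaussian; the sigma is an arbitrary example):

```python
import numpy as np

def lanczos2(x):
    # Lanczos-2 kernel: negative lobes appear for 1 < |x| < 2.
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) < 2.0, np.sinc(x) * np.sinc(x / 2.0), 0.0)

def gaussian(x, sigma=0.6):
    # Gaussian kernel: strictly positive, so no ringing.
    return np.exp(-np.asarray(x, dtype=float) ** 2 / (2 * sigma ** 2))

print(lanczos2(1.5))   # negative weight -> sharpening / ringing
print(gaussian(1.5))   # positive weight -> pure smoothing
```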

Goran

I see :slight_smile:

Well, I’m curious to see your progress.
Also, have a look at the Cycles thread (Brecht’s easter egg surprise), currently on the final page.
While researching for your thread here, I found something very interesting which could possibly be modified into some kind of “ideal” noise filtering, though the goal is image reconstruction from low sample counts, for now. (Look into compressed sensing and compressed rendering :))
The reason I posted it in the other thread is that it doesn’t seem to be related to your work here, but rather needs a modification of the ray tracing algorithm…