Another denoising project has appeared at developer.blender.org (compositing node)

https://developer.blender.org/D2649

The link to the GitLab page has more info and a little insight into how the algorithm works (though I will warn that the node looks complicated to use, given the many sliders it has).

It also looks to me like it's built with denoising Cycles scenes in mind. I'm curious to see where this will go and whether some of the technology gets absorbed into Lukas' project.

He only has Linux builds right now; someone else will need to build for Windows and Mac.

Hey, I did a build earlier and forgot to post it. I added this just because it was denoise-based. It works, but only on the CPU; if you have a killer CPU setup it may still be of some use. But it is SLOW.

Give me a few minutes, I'm just uploading the build.

This build includes:

OpenCL & CUDA GPU denoise system (this is Lukas' latest denoise code)
Multi-GPU support for the GPU denoiser (even in the viewport; definitely works for CUDA, but multiple OpenCL GPUs are untested)
Scramble distance for Sobol and Multi-Jitter (works on CPU & GPU), also added to the supported features render tab
Blue-noise dithered Sobol
Thread divergence sort reduction patch (gives a 30% speedup in Classroom and 8% in Barcelona)
Latest master
Cycles: persistent storage for all scene data (patch D2613)
New noise nodes
SMAA compositor node
New denoise compositor node
CUDA & OpenCL supported
Disney BSDF

https://mega.nz/#!Ao4EFRwL!w8GPRajtJOIQ9TXHuHaJeC0z24ZY2kKIj25yQzjw3pY

Tried a 2.8 merge tonight and it compiled fine, but it didn't render anything. I'll keep trying to merge new 2.8 as a super-experimental build as and when I find the blocking issues. Can't wait for these updates, and for 2.8 when we get a real-time viewport that doesn't suck.

Well, this is not like the previous denoising; it is a 2D feature-based denoiser.
It's a totally different denoising method.
A simple example: say you render a wall or an ocean, which has repeating patterns.
This denoiser looks for such features and uses them to denoise (as if a single image had been rendered multiple times).
Such series are not limited to a single image either: if a feature repeats over several images, the algorithm can still use it to denoise each image.

OpenCV has a similar feature called “fastNlMeansDenoisingColoredMulti”, and I think I've seen BMW renders where people put it to use.
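For anyone who wants to try that outside Blender, here's a minimal Python sketch using that OpenCV call (the file names, frame count, and filter strengths are just placeholder example values):

```python
import cv2

# Load a short sequence of equally sized, noisy 8-bit frames
# (the paths are placeholders).
frames = [cv2.imread(f"render_{i:04d}.png") for i in range(5)]

# Denoise the middle frame (index 2) with a temporal window of 5 frames,
# so repeating features in the neighbouring frames also contribute.
denoised = cv2.fastNlMeansDenoisingColoredMulti(
    srcImgs=frames,
    imgToDenoiseIndex=2,
    temporalWindowSize=5,
    h=10,                    # filter strength for luminance
    hColor=10,               # filter strength for chroma
    templateWindowSize=7,
    searchWindowSize=21,
)

cv2.imwrite("denoised_0002.png", denoised)
```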

And here is a nice article about it: https://www.cs.tut.fi/~foi/GCF-BM3D/BM3D_TIP_2007.pdf

(If I understand correctly) you can test this denoising algorithm online here:

http://demo.ipol.im/demo/l_bm3d/

Select a photo and click run (then you can switch between the noisy, denoised, and original versions, etc.).
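If you'd rather test locally than on that demo page, the opencv-contrib package also ships a BM3D implementation in its xphoto module. A minimal sketch, assuming opencv-contrib-python is installed (note this call only accepts single-channel 8-bit images, and the file names and filter strength are placeholders):

```python
import cv2

# BM3D in OpenCV lives in the contrib "xphoto" module and works on
# grayscale 8-bit images only.
noisy = cv2.imread("noisy_render.png", cv2.IMREAD_GRAYSCALE)

# h is the filter strength; tune it to the noise level of the render.
clean = cv2.xphoto.bm3dDenoising(noisy, h=15)

cv2.imwrite("bm3d_result.png", clean)
```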

It works great with random colored noise, but it doesn't work well with my noisy test render (I don't know what the difference is between random colored noise and the noise from a typical 5-sample render).

I will test this build.

I have tested the build and the node.

I haven't changed any parameters yet. It works great; I will tweak some parameters later. Anyway, it's slow :confused:

EDIT: the node's result is even better in conjunction with Lukas' denoiser. I have uploaded 3 images:
10-sample render,
10 samples + Lukas' denoising (13 sec render time),
10 samples + Lukas' denoising + the filtering node (3 minutes render time).

Attachments (3 comparison images)

What are these? New generators, something denoise-related, or the Fac noise optimization?
I'm not in a position to test anything at the moment.

I also just tried it. Even if the results are the best you can get from a node that only works with 2D information, the fact that an average image takes several minutes to denoise might sink this node's chances of getting into master.

For one thing, you would have a long wait ahead of you every time you want to see the change brought by tweaking even one setting (and for high-resolution images with a low number of samples, the calculation time might even exceed the time needed to render the image in the first place).

The algorithm is far better than traditional techniques, I can give it that, but if there’s a way for the code to become much faster, it should be done right away.

Yep, this node needs to be converted to use OpenCL. As we already have OpenCL support in the compositor, maybe someone like Mai can take a look at converting the math over to OpenCL too.

This node is awesome. If only it were faster… Who knows, maybe Lukas could use some of the code in conjunction with his denoiser to get even better denoising quality :slight_smile:

edit: I have uploaded the denoising result (see a few posts above) with Lukas' denoiser (because filtering the raw frame without Lukas' denoiser gives a worse result).

edit: I have noticed that CPU usage was 98 percent (on my 4-core i5-3470 CPU), so it must be a multi-threaded implementation. It takes 3 minutes to denoise; I haven't tried changing any settings. Anyway, my GPU's CUDA performance is 4 to 5 times faster than my CPU when I render a frame.

So, if denoising takes 180 seconds (3 min) on the CPU, then on the GPU it could theoretically take 36 to 45 seconds (because rendering is 4 to 5 times faster on my GPU). Then again, rendering may be less GPU-friendly than image processing, so the ratio may not carry over exactly (I may be wrong).

If so, processing the image on the GPU could be even faster than that estimate. Additionally, the BM3D algorithm itself can probably be optimized.

So the final version could be fast: 10 seconds or less of processing time.

I didn't touch any parameters in the BM3D node, so if the parameters have any influence on speed it could be even faster. Maybe the final version will only take 1 to 3 seconds? :grin: :grin: But my graphics card, a GTX 1050 Ti, isn't the fastest; on a faster GPU the final version could take 1 second to process the image. :grin: We will see what the future brings :slight_smile:

Additionally, maybe this filtering will remove flickering in animation, hmm.

I think such a system could be improved speed-wise by limiting the search pattern.
If the search pattern (and the squares) are relatively small, say 16 x 16 pixels or so, and pattern matching only checks the 8 squares around each one, plus the corresponding 9 squares in the next and previous frame(s), then it would work a lot faster (a rough sketch of the idea follows below). This could be ideal for movie denoising.

I had something like this in mind to code, but putting in this kind of math would be way better: my method wouldn't be able to handle movement of objects, while such math wouldn't be disturbed by a moving object (as long as it falls within the search-and-compare region).
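To make that concrete, here's a rough NumPy sketch of such a restricted search (my own toy illustration, not the node's actual code), assuming grayscale float frames divided into 16 x 16 blocks:

```python
import numpy as np

BLOCK = 16   # square size suggested above
RADIUS = 1   # only check the 8 block positions around the current one

def best_match(block, frame, by, bx):
    """Return the most similar BLOCK x BLOCK patch in `frame`, searching
    only the current block position and its immediate neighbours (SAD)."""
    h, w = frame.shape
    best, best_sad = None, np.inf
    for dy in range(-RADIUS, RADIUS + 1):
        for dx in range(-RADIUS, RADIUS + 1):
            y, x = (by + dy) * BLOCK, (bx + dx) * BLOCK
            if 0 <= y <= h - BLOCK and 0 <= x <= w - BLOCK:
                cand = frame[y:y + BLOCK, x:x + BLOCK]
                sad = np.abs(cand - block).sum()
                if sad < best_sad:
                    best, best_sad = cand, sad
    return best, best_sad

# Running best_match against the previous and next frame as well gives a
# small set of similar blocks that can be averaged to suppress noise.
```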

The 2nd pic looks good; the face on the train looks very blurry in the 3rd.

It could be a good algorithm. The best solution for animation I have found is AviSynth scripts. The most important parts, I think, are MVAnalyseMulti and MVDegrainMulti (plus RemoveDirt, though it could be omitted), which according to the documentation work as follows:

MVAnalyse:

“Estimates motion by a block-matching method and produces a special output clip with motion vector data (used by other functions).
Some hierarchical multi-level search methods are implemented (from coarse image scale to finest). The function uses the zero vector and the vectors of neighbouring blocks as predictors for the current block. First the difference (SAD) is estimated for the predictors, then the candidate vector is changed by some value in some direction and the SAD is estimated again, and so on. The accepted new vector is the vector with the minimal SAD value (with some penalty for motion coherence).”

MDegrain:

“Makes a temporal denoising with motion compensation. Blocks of previous and next frames are motion compensated and then averaged with the current frame, with weighting factors depending on the block differences from the current one (SAD). The function supports overlapped-blocks mode. Overlapped-block processing is implemented as windowed block summation (like FFT3DFilter, overlap value up to blksize/2) to decrease blocking artefacts.”

https://avisynth.org.ru/mvtools/mvtools.html
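For illustration, here's a toy NumPy version of that MDegrain idea, assuming the matched blocks from neighbouring frames have already been found (e.g. with a restricted search like the one sketched earlier); each candidate is weighted down as its SAD grows, so poor matches contribute almost nothing:

```python
import numpy as np

def degrain_block(cur, matches, sads, thsad=400.0):
    """Average a float32 block with motion-compensated candidate blocks,
    weighting each candidate down as its SAD (mismatch) grows."""
    blocks = [cur]
    weights = [1.0]  # the current block always gets full weight
    for blk, sad in zip(matches, sads):
        blocks.append(blk)
        weights.append(max(0.0, 1.0 - sad / thsad))  # poor match -> ~0
    weights = np.asarray(weights, dtype=np.float32)
    stack = np.stack(blocks).astype(np.float32)
    # Normalized weighted sum over the block stack.
    return np.tensordot(weights / weights.sum(), stack, axes=1)
```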

So if anyone can write such code, the rendering results for animation could be really great, without flickering.

Additionally, there is also RemoveDirt (and unsharp mask). RemoveDirt in AviSynth is really great for removing flickering single pixels.

And finally, to counteract the blurring it could be great to have an “unsharp mask” algorithm too (which would be easy to implement; see the sketch below).
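It is indeed a short operation; a minimal OpenCV sketch (the file names, amount, and radius are just example values):

```python
import cv2

img = cv2.imread("denoised.png")

# Classic unsharp mask: subtract a blurred copy to isolate fine detail,
# then add a scaled amount of that detail back onto the original.
blurred = cv2.GaussianBlur(img, (0, 0), sigmaX=3)
sharpened = cv2.addWeighted(img, 1.5, blurred, -0.5, 0)  # img + 0.5*(img - blurred)

cv2.imwrite("sharpened.png", sharpened)
```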

I haven't changed any parameters in the node, so the result could be better and less blurry (I don't have the patience to wait 3 minutes for the compositor to reprocess every time I change a parameter x) ).

This reminds me of this:
https://rightclickselect.com/p/sequencer/Cnbbbc/introduce-vapoursynth-scripting-into-vse

@mobiledeveloper (I've seen your AviSynth script):
Keep in mind that this currently does not work with motion vectors.
And I believe the flickering is currently mainly a denoise artefact. (Do the new AO hacks and texture culling cause less flicker?)

I think the knowledge of movement is already available inside Blender (e.g. the motion blur option).
So in theory a speedup is possible, as there is no need to analyze the frames and calculate a vectorized result again afterwards.

Also, as Razorblade described it, I think there might not even be a need for a 9:9:9 matrix of squares to look for similarities.
As the movement of objects is known per object, the current square's displacement would lead to the exact position in the previous frame(s).
So in time a 1:1:1 square search would solve it in most cases (just think of it as adding 2 render passes for free!).

>> 1:1:1 meaning a single square compared against a single square over 3 frames
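To illustrate that shortcut, a hypothetical NumPy sketch: if Blender exported a per-pixel vector pass, the previous frame could be motion-compensated directly, with no search at all (nearest-neighbour lookup and integer-rounded vectors, purely for brevity):

```python
import numpy as np

def compensate(prev_frame, vectors):
    """Pull each pixel of the previous frame from the position its motion
    vector points back to, so it lines up with the current frame.
    `vectors` is an (H, W, 2) array of per-pixel (x, y) displacements."""
    h, w = prev_frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    src_y = np.clip(ys - vectors[..., 1].round().astype(int), 0, h - 1)
    src_x = np.clip(xs - vectors[..., 0].round().astype(int), 0, w - 1)
    return prev_frame[src_y, src_x]

# The compensated frame can then be merged with the current one, e.g.
# with the SAD-weighted averaging sketched earlier in the thread.
```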

Despite the speedup, though, I think one shouldn't render below a certain detail threshold with this.
When I look at the train rail example, there is some loss of detail, although it looks nice.

Ooh :). VapourSynth. I see that you also wanted that filtering to be in Blender. I was inspired by the original VapourSynth thread.

I tried it but without success, and gave up. For me, AviSynth was easier to set up :).

Anyway, I hope this node will get faster :/.

Yes, this node isn't motion-vector based, but the denoising results are great.

As you said, motion vectors should be possible to implement in Blender, and they should give much better quality than generating motion vectors from the 2D image alone. I don't know; I hope so :).

Anyway, if this node makes the denoising result more consistent from frame to frame, the flickering in animation could become less noticeable.

There is a big advantage of VapourSynth over AviSynth: VapourSynth is cross-platform (which matters for Blender integration).
I have noticed that you like to experiment a lot with this :slight_smile: . If you want, you can install an Ubuntu-based Linux distro and try this:

It is really easy to try it on Linux.

And it also speaks Python.

I don't know if getting this supported on the GPU is the only route to take here.

Chances are there are a lot of areas where the code as-is could be optimized and made a bit faster, since the code within the patch is complex and lengthy.