Experimental 2.77 Cycles Denoising build

On the subject of training a denoising network, Disney has an advantage over the BF because they have a huge amount of content they can run through the system.

In the paper, their network is based on the output of nearly 1000 shots (and that is without training for volumetric data). I doubt the BF has 1000 totally unique renders they can pull out to ensure the network even works with corner cases (users will have to step in, the challenge being that they will need to provide hundreds of scenes). It would be doable, but not without a massive effort to get a trained network committed to master.

Depends on their budget but the community can surely help.

@cpurender, the authors even mention that they only used it for single frames. That means you can expect quite a bit of flickering when you apply it to animations. That's just one of the limitations.
Just because it is a neural network does not mean it just works!

That's similar to neural style transfer for video. If flickering is a problem, you need to connect the information between frames to keep the noise/render consistent. This can be achieved by having the previous frame as additional input.
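
As a rough illustration only (not from the paper, layer and tile sizes are made up), a network could take the previous frame as extra input channels alongside the current noisy frame:

```python
# Minimal sketch, assuming Keras/TensorFlow: concatenate the previous frame
# with the current noisy frame so the network can keep its output consistent
# over time. Layer and tile sizes are placeholders, not anyone's real setup.
from tensorflow.keras import layers, models

H, W = 128, 128                                   # arbitrary tile size

noisy_t = layers.Input(shape=(H, W, 3))           # noisy frame t
prev_t1 = layers.Input(shape=(H, W, 3))           # (denoised) frame t-1

x = layers.Concatenate(axis=-1)([noisy_t, prev_t1])            # 6 input channels
x = layers.Conv2D(32, 3, padding="same", activation="relu")(x)
x = layers.Conv2D(32, 3, padding="same", activation="relu")(x)
out = layers.Conv2D(3, 3, padding="same")(x)      # denoised frame t

model = models.Model([noisy_t, prev_t1], out)
model.compile(optimizer="adam", loss="mae")
```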

The main point is that a production-ready solution is still a tremendous amount of work. It is not trivial!

I can't estimate the effort at this moment.
But since we still need training data anyway, how about starting to collect high-quality renders soon?
If nobody else can, I will try to set aside some spare time for this during the Christmas holidays, starting with creating a new website for people to submit final renders with all passes.

@Dantus, not really wrong, but for this kind of noise the training would not be based on thousands of completed renders, just some tiles.
In my previous attempts I tiled an image and took about 100 random tiles or so to train the network from that single image.
Then, based upon those 100 tiles, I processed all the remaining tiles. Sure, a few more images wouldn't hurt for training, but it's not as if thousands of images would be required.
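
Roughly what that tiling step could look like in practice (a sketch with made-up names and sizes, not the code actually used):

```python
# Sketch of the tile idea: cut ~100 random tile pairs out of one noisy/clean
# image pair and use them as training samples. Assumes both images are numpy
# float arrays of shape (H, W, 3); names and sizes are illustrative only.
import numpy as np

def random_tile_pairs(noisy, clean, tile=64, count=100, seed=0):
    rng = np.random.default_rng(seed)
    h, w = noisy.shape[:2]
    pairs = []
    for _ in range(count):
        y = int(rng.integers(0, h - tile))
        x = int(rng.integers(0, w - tile))
        pairs.append((noisy[y:y + tile, x:x + tile],
                      clean[y:y + tile, x:x + tile]))
    return pairs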

Thousands of images are required for very deep neural networks that find dogs or people and translate that into text, or that describe a photo.
That is because of the way those networks are built: layer by layer, each trained for a specific feature (e.g. circle, line, arc, color, etc.).
Each layer initially gives a matching percentage against its original goal, but over hundreds of images (because of the feedback during training) some of those layers adjust their goal (e.g. find an ellipse instead of a circle). Since all the deep layers are connected, lots of calculations are required between those layers, and multiplied by the number of required images that makes such networks heavy on normal hardware (doable, but heavy).

Denoising isn't such a problem; it's more about optimizing a blur operation.
A 2D NN or a temporal net for video is a form of convolution operation, where x pixels go in and fewer, optimized pixels come out.
NNs are naturally very good at... how to explain... areas of equalness (they can find boundaries, like on weather pressure maps), and that goes hand in hand with a form of blurring where you take the average of the surrounding pixels that match the centre most closely and leave the others out. (I feel kinda tempted to also try this once without a neural net... maybe later; simpler, but less interesting to me.)
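
For what it's worth, that non-NN variant could look something like this (a slow, purely illustrative sketch; window size and threshold are arbitrary):

```python
# "Average the surrounding pixels that match most" as a plain loop.
# img: float numpy array (H, W, 3) with values in [0, 1].
import numpy as np

def selective_blur(img, radius=2, threshold=0.05):
    h, w, _ = img.shape
    out = img.copy()
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            patch = img[y - radius:y + radius + 1, x - radius:x + radius + 1]
            center = img[y, x]
            # keep only neighbours whose colour is close to the centre pixel
            mask = np.all(np.abs(patch - center) < threshold, axis=-1)
            out[y, x] = patch[mask].mean(axis=0)
    return out
```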

The temporal version (over time frames) would rather be trained to average over frames with respect to moving objects. Think of the frame data stacked upon each other, each pixel a voxel in a 3D matrix; to average a certain pixel on a certain frame, find the n pixels that match it most closely. It would be up to a neural network to pick the proper ones. Well, "averaging" might be oversimplifying how a neural network would do this.
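
A very rough sketch of that stacked-frames idea (illustrative only, with made-up names; a real network would learn which samples to pick rather than using a fixed colour distance):

```python
# Pick the n frames whose colour at a given pixel position matches the current
# frame most closely, and average them. frames: float array (T, H, W, 3).
import numpy as np

def temporal_average(frames, t, y, x, n=4):
    samples = frames[:, y, x, :]                     # that pixel over all frames
    dist = np.linalg.norm(samples - samples[t], axis=-1)
    closest = np.argsort(dist)[:n]                   # n best matches (incl. frame t)
    return samples[closest].mean(axis=0)
```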

@cpurender, having access to animations in high quality and low quality (with and without an automated seed in rendering) would surely be of help for what I have in mind, and maybe others can use such data as well to train their NN ideas.

@cpurender, would you be able to store those movies as image series, so I (and others) won't get MPEG or H.264 distortions? Preferably just PNG files.
It would be fun if more people tried it, because then we get some competition ;).
Perhaps even the NN communities will kick in and take it up as a new NN goal, to have a standard noise problem.
As far as I'm aware there are no such test data sets for neural networks yet.

One common trait of neural networks right now is that the network itself tends to be a very naive system.

For instance, the network knows how to successfully denoise part of an image without losing detail. Now let's make a few small changes; chances are those changes throw the algorithm off (which results in poor-quality output). It's not true AI in the sense that it can't make its own assumptions for areas it can't recreate from pieces of the examples. Simply put, all these algorithms do is create an amalgamation of content they have already seen.

Just wondering, does anyone know what Lukas has been up to lately? I ask because there's been no activity from him on the developer site or this forum for nearly a month now (the guys at Theory Animation might have an idea, since they hired him to work on features they need).

We need all passes. I only know about the OpenEXR Multilayer file format, which can contain all passes.
My initial idea is to collect about 10k multilayer exr files, 50 MB each, 500 GB total.
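
For anyone who wants to poke at such files, reading passes out of a multilayer EXR could look roughly like this (assuming the OpenEXR/Imath Python bindings are installed; the channel names depend on Blender's layer/pass naming, so the one below is only an example):

```python
# List the channels of a multilayer EXR and load one pass into a numpy array.
import numpy as np
import OpenEXR, Imath

exr = OpenEXR.InputFile("render_0001.exr")           # hypothetical file name
header = exr.header()
print(sorted(header["channels"].keys()))             # e.g. "RenderLayer.Combined.R", ...

dw = header["dataWindow"]
width = dw.max.x - dw.min.x + 1
height = dw.max.y - dw.min.y + 1

pt = Imath.PixelType(Imath.PixelType.FLOAT)
raw = exr.channel("RenderLayer.Combined.R", pt)      # adjust to your layer/pass name
combined_r = np.frombuffer(raw, dtype=np.float32).reshape(height, width)
```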

You are wrong about the amalgamation of content; please read the paper above.
Neural networks can generalize very well, they only need sufficient diverse training data.
Don’t forget that we are simulating physics here, there’s no perfection, and there’s also no better solution at the moment.

The main issue is to get an understanding of what good training data is. In the paper they may not have chosen the right approach, but they clearly ran into limitations which are important for practical usage. They were searching for noisy examples, and maybe it would already be sufficient to add examples with lots of detail but little noise. However, once such limitations are found, it is necessary to get training data which can be used for those cases.
The easiest way to find those limitations is by simply testing the network on a large, broad set of noisy images for which the clean version is known. For this you don't need a lot of training data, but you do need lots of test examples to find the weaknesses of the neural network(s).

This is clearly needed for practical purposes, and so far it is unknown what a working solution is going to look like. It may be as simple as you describe, but it could also be a lot more complicated. What is very clear to me is that getting training data and test examples for this is going to be far more challenging.

Well, creating the training data isn't hard; I mean, it's what people normally do with Blender: create rendered animations.
But my own system isn't really good at rendering; at low average quality it takes me a day to do 250 frames.
For testing a temporal denoiser my PC is not optimal, and I also can't change the processor or GPU of my laptop, so I'm kind of stuck with that quality.
But some people here do have the hardware for it (remember the BMW test scene scores); that one takes me about 8+ minutes.

PS: the basic reason I think denoising shouldn't need a deep NN is that neural nets are good at 'domain' guessing.
Train them that 2=5, 3=6, 5=8, and they usually give good estimates on untrained input such as 4=7 (see the toy sketch below).
They're also good at spotting when something is a bit above or below normal, and they're really good at finding relations in multiple related measurements; they learn what weighs more or less into the final solution.
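
A toy version of that 2=5, 3=6, 5=8 example (hypothetical, using scikit-learn; with this little data the exact output depends on initialisation, but it usually lands near 7):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

X = np.array([[2.0], [3.0], [5.0]])
y = np.array([5.0, 6.0, 8.0])

# a tiny net fitted on three points; settings chosen just to make it converge
net = MLPRegressor(hidden_layer_sizes=(8,), activation="tanh",
                   solver="lbfgs", max_iter=5000, random_state=0)
net.fit(X, y)
print(net.predict([[4.0]]))   # expect something close to 7
```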

Maybe the temporal math would even work on just single-channel RGB data from a single scan line. Oh, I feel tempted to test that out, but I'm still optimizing it, and at the moment (as of today) the code is in a non-running state because I made some code design changes; I'm building a data-ordering toolset around it first (for algorithmic testing), i.e. building my NN toolbox (as I had some time to code today).

Just a draft and a high-quality render, that would be enough.
Perhaps a few variations, e.g. 50, 100 and 200 samples, to compare against 2000(?) samples (I cannot render the high sample counts).
A few such (short) movies in PNG format (separate images, so you don't get MPEG compression noise in them).
Once there are a few such movies covering some situations, the set can later be extended with whatever people think might be difficult for a NN.

There are many types of neural nets, not just the kind you describe that trains on images.
For example, they can learn to draw:
https://www.fastcodesign.com/90128734/dont-blame-us-if-you-waste-your-day-with-this-neural-network-drawing-tool

Though the things Razor seems to refer to are 'simple' networks like these, which can classify inputs, e.g.:
http://sharktime.com/us_SharkyNeuralNetwork.html
In such a network the dots in an area can move within their area without affecting the area's shape.
So such networks don't have the problem you described, because they look at the information differently. The network would have no idea what it is denoising (it wouldn't know if it's a face or a cat), but it could still denoise within the shape (adjust color 'positions' by tiny amounts).
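
Loosely the kind of 'simple' classifier that demo shows, sketched with scikit-learn (the point pattern and sizes are made up): 2D points go in, a class label comes out, and the network has no notion of images at all.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
points = rng.uniform(-1, 1, size=(200, 2))
labels = (points[:, 0] * points[:, 1] > 0).astype(int)    # two interleaved regions

clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
clf.fit(points, labels)
print(clf.predict([[0.5, 0.5], [-0.5, 0.5]]))             # expected: [1, 0]
```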

For time predictions there is also interesting stuff; perhaps he uses code something like this:
https://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/
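
The code behind that link is Keras-based; in rough shape it looks like this (a sketch, not his actual code; the sine series is just a stand-in for real per-frame data):

```python
# An LSTM that predicts the next value of a 1-D sequence from a short window.
import numpy as np
from tensorflow.keras import layers, models

window = 3
series = np.sin(np.linspace(0, 20, 200))            # stand-in for real frame data

X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
X = X.reshape((-1, window, 1))                      # (samples, timesteps, features)

model = models.Sequential([
    layers.LSTM(16, input_shape=(window, 1)),
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=5, verbose=0)
print(model.predict(X[:1]))                         # predicted next value
```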

But he might be onto something more special, since he coded something earlier too but never released it.
Maybe he has new ideas now, or more time to code.

I wonder if a NN is the way to go for this; perhaps it would be simpler to directly code a blur that tries to remove the last 0.05% of noise by averaging over frames when the color match is 99.95% (why would that last 0.05% require so much overhead in techniques? A proper blur filter would work quite well here, some Avisynth script). A rough sketch of that idea is below.
On the other hand, I'm curious what he might create; it wouldn't be the first time I got surprised by a NN.
At least his past neural net didn't destroy the images, but they got a color-graded look.
That kind of proved his network processed them, but it wasn't an optimization yet, rather a degradation.
But he seemed happy about the results, so something must have been going right there, I guess.
Well, I await his improvements.
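
The sketch mentioned above, for the plain (non-NN) temporal blur: average a pixel over the other frames, but only over frames whose colour at that position matches the current frame within a tolerance (0.0005 here, as a stand-in for the 99.95% match; all names are made up).

```python
import numpy as np

def threshold_temporal_blur(frames, tolerance=0.0005):
    """frames: float array (T, H, W, 3) with values in [0, 1]."""
    out = np.empty_like(frames)
    for t in range(frames.shape[0]):
        diff = np.abs(frames - frames[t]).max(axis=-1, keepdims=True)  # (T, H, W, 1)
        mask = (diff < tolerance).astype(frames.dtype)                 # frame t always matches itself
        out[t] = (frames * mask).sum(axis=0) / mask.sum(axis=0)
    return out
```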

You don’t need perfection. It’s possible to start from any point and improve gradually.

Neural networks can generalize very well, they only need sufficient diverse training data.

By diverse I meant mixing up as many kinds of shapes and shaders as possible. The main problem is render time if we want high quality. That's why my only concern is power consumption; I pay 30 euro cents per kWh.

Perhaps a few variations, e.g. 50, 100 and 200 samples, to compare against 2000(?) samples (I cannot render the high sample counts).
A few such (short) movies in PNG format (separate images, so you don't get MPEG compression noise in them).

Final images alone don't help much; I have tested this case using Waifu2X and some others.
You need all passes (also for experimental purposes), and the NNs won't generate the final image; they just denoise some of the passes instead. Disney denoises the diffuse and specular illumination.
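
For reference, the pass handling in that kind of approach is roughly this (a paraphrased sketch of what the paper describes, not their code): the diffuse pass is divided by albedo before denoising and multiplied back afterwards, and the specular pass is denoised in log space.

```python
import numpy as np

eps = 1e-2  # small constant to avoid division by zero; the value is a guess

def preprocess(diffuse, specular, albedo):
    # demodulate diffuse, log-transform specular before feeding the denoiser
    return diffuse / (albedo + eps), np.log1p(specular)

def recombine(denoised_diffuse, denoised_log_specular, albedo):
    # undo both transforms and sum the passes back into a final image
    return denoised_diffuse * (albedo + eps) + np.expm1(denoised_log_specular)
```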

Waifu2X experiments on final images for reference
https://blenderartists.org/forum/showthread.php?406467-Post-production-denoise-experiments

From that I gather that it's a deep multi-layer network.
Sure, that can work, but it's not what I'm after; I'm only after denoising (not even feature detection).
I wouldn't need to know a circle or an apple to denoise it (or all previously released movies from Disney).

Therefore the final high-quality image is very... no, better say extremely... important to me; my NN won't train without it, and coding it wouldn't make sense either then, because I'm training random tiles against that goal. I know some shader areas are more prone to noise, but well, so be it. For example, the classroom scene at 30 samples is noisy overall; it's that kind of noise I try to tackle, not the BMW scene, which is almost ready anyway.

To get rid of a misconception: I think it's never 100% realistic if you use methods like these; it's about acceptable improvements.
Just like the current built-in denoiser, because the other option would be rendering a week for only 250 frames…

There is a good chance, though, that working on individual shader passes will work as well if the final image works, and it might even be easier, but currently I'm not after that goal.

Isn't that what we all want here?

The final image contains too little information; that's why we need the passes.
I am confident that using the Disney solution alone would improve the classroom case too.