I started several months ago with the first implementation and experiments, so I wouldn’t call that quick :). Neither has it been easy to implement. It has been very challenging indeed. Nevertheless, I believe it is doable to create this kind of solution for Cycles!
Oh shoot, that’s my fail. I’m away from my machine now; I’ll try in a couple of hours. Also, you could increase the number of training samples (not Blender samples) and the robustness by slightly rotating/moving the camera.
What exactly do you mean by slightly rotating/moving the camera? Do you mean this would happen in Blender and the frame would be rendered again, or is your idea to transform the already rendered frame to achieve better data augmentation?
Yes, move the camera and render again. You’re going to start to get diminishing returns on just changing seeds. Yes, it is similar to augmentation (like rotating the images), but in fact it provides new ground truths that will help generalization. The model will be greatly helped by these ‘similar but different’ angles, and will probably learn a better, more compact hidden representation.
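A sketch of how such pose jitter could be scripted (the helper name and the jitter limits are my own assumptions, not anything from the project); each generated pose would be applied to the camera in Blender and the scene rendered again to give a fresh noisy/ground-truth pair:

```python
import random

def jitter_poses(base_loc, base_rot, n, max_shift=0.1, max_angle=0.05, seed=0):
    """Generate n slightly perturbed camera poses around a base pose.

    base_loc and base_rot are (x, y, z) tuples; max_shift is the largest
    per-axis translation and max_angle the largest per-axis rotation
    delta (radians). Each pose would be rendered again in Blender to
    yield a new training pair.
    """
    rng = random.Random(seed)
    poses = []
    for _ in range(n):
        loc = tuple(c + rng.uniform(-max_shift, max_shift) for c in base_loc)
        rot = tuple(a + rng.uniform(-max_angle, max_angle) for a in base_rot)
        poses.append((loc, rot))
    return poses

# e.g. ten variations of a camera at eye height, pitched to the horizon
variants = jitter_poses((0.0, 0.0, 1.7), (1.5708, 0.0, 0.0), n=10)
```

Inside Blender, the same deltas would simply be written to the camera object’s `location` and `rotation_euler` before each render.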
This could be useful for fine-tuning: if you have a working neural network and you want to refine it to work better with a specific type of image.
I believe the rendering time would be better invested if the renders differ significantly, so that the learned representation improves. I think it would be more valuable to move/rotate the camera significantly and to change the lighting as well, to get a very different scenario. The lighting can be modified simply by switching from a daytime to a night-time HDRI.
I am not aware of any actual deep learning projects where better generalization was achieved with many groups of similar examples rather than a broader spectrum.
Yes, moving the camera more would be better. I guess I mentioned ‘a little bit’ just to make sure the camera stayed facing the objects of interest without additional scene logic. More than anything, I’m just trying to be realistic about getting as many training samples as possible out of each scene. These datasets need to be pretty huge to be of use.
Also, seeing as you wanted the resolution to be powers of 2, I’m wondering if you are using full strides (or single tiles)? In other words, if your window/model input is 32x32 and the image is 64x64, you’re only taking 4 training samples? Why not slide pixel by pixel and get 1089 training samples? By keeping the stride at 1, you can also accept images at any resolution.
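For the record, the patch count follows directly from the stride: a square image of side H with a window of side w and stride s gives ((H − w)/s + 1)² positions, so a 64x64 image with a 32x32 window yields 4 patches at stride 32 and 33² = 1089 at stride 1. A tiny sketch (the helper name is mine):

```python
def patch_count(image_side, window, stride):
    """Number of full window positions along one axis, squared."""
    assert image_side >= window
    per_axis = (image_side - window) // stride + 1
    return per_axis * per_axis

patch_count(64, 32, 32)  # -> 4 non-overlapping tiles
patch_count(64, 32, 1)   # -> 1089 densely overlapping patches
```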
Also which compression format do you want my render folders in?
I agree with you that it is very important to get the most out of every scene. When the camera position and the lighting of a scene are changed, the required number of samples per pixel changes as well. That’s why I believe it is better to do all that manually for now. I also want to avoid situations where people waste render time because of this.
It is multiples of 64 because currently a tile size of 64x64 is used. All the images are preprocessed: they are sliced into tiles of that size. The image is indeed split in the way you are mentioning. In an early version I tried the extreme approach where the tile is shifted by just one pixel. That was too aggressive; it did not work out, and it would not work with the current implementation for various reasons. But it is definitely worth a try to create the 64x64 tiles and shift not by 64 pixels but by 32. If this shows an improvement, further experiments can be made. Thanks for the idea!
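The half-tile overlap could be sketched as a simple NumPy preprocessing step, assuming images as H x W x C arrays (the function name is mine, not the project's):

```python
import numpy as np

def slice_tiles(image, tile=64, stride=32):
    """Cut an H x W x C image into tile x tile patches, sliding the
    window by `stride` pixels, so neighbouring tiles overlap by
    tile - stride pixels (32 here, i.e. half a tile)."""
    h, w = image.shape[:2]
    tiles = []
    for y in range(0, h - tile + 1, stride):
        for x in range(0, w - tile + 1, stride):
            tiles.append(image[y:y + tile, x:x + tile])
    return np.stack(tiles)

img = np.zeros((128, 128, 3), dtype=np.float32)
slice_tiles(img).shape  # (9, 64, 64, 3) -- versus 4 tiles at stride 64
```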
Every commonly used compression format is okay.
Thanks a lot, I really appreciate it!
Have you thought about using AWS (Amazon Web Services) or a similar service for this? CPU power is incredibly cheap, and setting up a rendering instance with Blender isn’t all that difficult.
Yes, I have considered it. If it was only about getting some renders done, this would be a perfect fit.
Besides rendering, it is also about getting a solid amount of blend files which are visually very diverse. For each of those, it has to be figured out how many samples per pixel are needed. I have worked with plenty of files which produce fireflies even when rendered for a remarkably long time. For those, it is necessary to figure out how to avoid them. This can be done by moving the camera, lowering the intensity of some very aggressive lights, or whatever else works. All this together takes an incredible amount of time.
As you know there are also several render farms for Blender and I believe some are using their own hardware. Once the results are close enough to being production ready, I will get in touch with them, because everything has to be rendered again. In Blender 2.8, there are going to be more passes for Cycles and they have to be rendered and the neural network needs to be trained with those as well. Right now, Blender 2.8 is simply not stable enough to use it for such a task.
I had a close look at your rendered images. There seem to be many fireflies or at least rendering artifacts. Can you see them too?
Oh yeah, there are massive artifacts; the strength on the emission plane is something like 450. I don’t know that any amount of samples will fix it. I had to run post filters on it.
Using filters is unfortunately not an option. I have used some files from BlendSwap which were not that extreme, but still comparable. I was able to solve it by lowering the emission strength and adding more emission planes. This certainly changes the style to a certain degree, but it was unavoidable in order to create a usable training example.
Yeah, I actually chose that noisy blend for a reason, which is that you never have any idea what kind of blend the user will be using it for, and you need all kinds of extreme cases for a general solution. For scenes that converge and clean up real nice, you don’t really need a denoiser anyway.
That level of noise is very common in Cycles interior scenes, which is what Photox’s scene seems to be. So I think the network should learn that kind of noise in some way.
Also, the emission plane “lightrim” is very strong; it seems to contribute noise (as Photox said).
Strong like Bull!!!
@Photox, this makes it a perfect file to find out whether the neural network was able to generalize enough to cover those kinds of cases.
To train the neural network, we have to feed it with two kinds of images, the noisy input and the noise free output. That’s why it is necessary to have the noise free end result.
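In code, a training pair is just the two renders plus an error measure between them. A minimal sketch with plain MSE as the objective (the project's actual loss function isn't stated here and may well differ):

```python
import numpy as np

def mse(prediction, clean_target):
    """Mean squared error between the network output and the
    noise-free reference render -- the simplest denoiser objective."""
    return float(np.mean((prediction - clean_target) ** 2))

# a trivial stand-in pair: a flat "clean" image and a biased "noisy" one
clean = np.ones((64, 64, 3), dtype=np.float32)
noisy = clean + 0.1
mse(noisy, clean)  # about 0.01
```

Training then means minimizing this error over many such noisy/clean tile pairs, which is exactly why the noise-free end result is required.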
@YAFU, you are absolutely right that it has to be able to deal with this kind of situation. But those examples are not suitable for training, because we would be telling the neural network to produce this kind of result, even though it is clearly not what we are looking for!
edit: You could just clamp a little and then jack up the sample count on this one; 10,000–20,000 would probably work.
Regarding noise in this kind of scene, I’m sure that by raising the sample count you would eventually get an image without noise. The problem is that this value is always very high for this type of scene, and it is a common problem in interior scenes with Cycles.
Regarding the fireflies, well, this is always a pain in the… Currently, Blender from master is configured by default with Clamp Indirect = 10 in new scenes. But apparently you cannot use Clamp values.
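For reference, the clamp and sample settings discussed above can be set from Blender’s Python API. This is only a config fragment illustrating the values mentioned in the thread (the default of 10 in master, and the 10,000+ sample suggestion), not the project’s actual setup:

```python
import bpy  # only available inside Blender

scene = bpy.context.scene
scene.cycles.sample_clamp_direct = 0.0     # 0 disables direct clamping
scene.cycles.sample_clamp_indirect = 10.0  # the master default mentioned above
scene.cycles.samples = 10000               # "jack up the sample numbers"
```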