E-Cycles - Faster CUDA rendering and better AI denoising

(mathieu) #55

Well, it depends on the processor you have. The builds only improve OSL rendering, which is not the case if you use a GPU. In my test, it didn’t make a big difference with or without CPU. The problem is, when the GPU becomes so fast, your CPU may take over part of the job and at first make it faster, but then the GPU has to wait for the last CPU tiles to finish rendering, which kills the whole render time.

And it consumes much more energy, creates more heat and noise. So I just leave it on GPU only.
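
To make the "GPU waits for the last CPU tiles" effect concrete, here is a small toy model. It is only a sketch, not how Cycles actually schedules tiles: the function `render_time`, the tile count and the per-tile timings are all made up for illustration, and every tile is assumed to cost the same.

```python
import heapq

def render_time(num_tiles, seconds_per_tile):
    """Greedy tile scheduling: each free device grabs the next unrendered tile.

    seconds_per_tile maps a device name to its (hypothetical) time per tile.
    Returns the time at which the last tile finishes.
    """
    # Heap of (time at which the device becomes free, device name).
    free_at = [(0.0, name) for name in seconds_per_tile]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(num_tiles):
        start, name = heapq.heappop(free_at)
        done = start + seconds_per_tile[name]
        finish = max(finish, done)
        heapq.heappush(free_at, (done, name))
    return finish

gpu_only = render_time(10, {"gpu": 1.0})
slow_cpu = render_time(10, {"gpu": 1.0, "cpu": 12.0})
fast_cpu = render_time(10, {"gpu": 1.0, "cpu": 4.0})
print(gpu_only, slow_cpu, fast_cpu)
```

In this toy setup, the slow CPU grabs a single tile and holds up the whole render until it is done, so GPU+CPU ends up slower than GPU alone, while a CPU that is fast enough relative to the GPU still helps.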

Already, with Vega64+1080Ti together (I took a patch from Brecht to get OpenCL and CUDA working together), and those speedups, it’s so damn fast that the BVH build time is often the biggest part of rendering.

I could get the CPU rendering 20% faster by making the builds with LLVM if customers ask.

(mathieu) #56

What do you mean? It is compatible with the denoiser of course, and also with the AO trick I added for 2.8 (available in the buildbots). So if you use AO (about 2x faster) and half the spp (also 2x faster) with the denoiser, it should be about 2x2x2 = 8x faster, let’s say 7 to be sure. I didn’t include those tricks in the table above, to make it clearer what the optimizations bring.
The denoising process itself is already fast enough, at least in my work.
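
The arithmetic behind that estimate can be written out as below. The dictionary keys are just illustrative labels, and the third factor is assumed here to be the build's own speedup, which the post does not spell out.

```python
from math import prod

# Independent speedups multiply. The first two factors are the "about 2x"
# figures from the post; the third is an assumption (the build's own speedup).
factors = {
    "ao_trick": 2.0,
    "half_spp_with_denoiser": 2.0,
    "assumed_build_speedup": 2.0,
}
combined = prod(factors.values())
print(f"combined speedup: {combined:.0f}x")
```

The factors only multiply like this if they are independent of each other, which is probably why the post rounds down to 7x "to be sure".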

(CarlG) #57

Not really. It’s listed as an unbiased renderer, and I’m guessing it’s because we can set it up to be visually unbiased and quite mathematically unbiased (Russian roulette is said to be mathematically biased but visually unbiased iirc, and is out of user control).

In practical terms, it would usually be visually biased from limited GI bounces, light threshold setting, AO hacks, MIS, light portals, caustic rays etc. Pretty much any setting we do to make it go faster will introduce slight bias, some more visual than others. Turning off caustics (without even fake compensation) is a sure way to make the image darker, because all glossy rays are terminated for lighting (everything has Fresnel, right? :slight_smile: - that’s the downside).

Considering the ridiculous amount of samples required for caustics to converge (no blur allowed I guess), using Blender in a very unbiased way appears near impossible, at least if physically plausible (Fresnel -> diffuse/glossy) materials are used. I guess you could do an unbiased Cornell box using diffuse-only materials.

So in theory, I think Blender can be very unbiased, but nobody would use it that way in practical use.


Comparison & Difference

(emu) #59

Another way to tell if an integrator converges is to subtract one image from the other and paint positive pixels in a different color than negative ones:
There are no caustics in this scene and the geometry is very simple, so this scene should be possible to render in an unbiased way. I don’t know about the render settings used so all I can say is that at least one of these two images is biased.
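
The comparison described above can be sketched roughly like this. It is a minimal, pure-Python illustration on grayscale images stored as nested lists; the function name `signed_difference` and the red/blue color choice are assumptions, not the tool that was actually used.

```python
def signed_difference(image_a, image_b, tolerance=0.0):
    """Color-code the per-pixel difference of two grayscale images.

    Images are lists of rows of floats. Pixels where image_a is brighter
    become red, pixels where image_b is brighter become blue, and pixels
    that differ by no more than `tolerance` stay black.
    """
    result = []
    for row_a, row_b in zip(image_a, image_b):
        out_row = []
        for a, b in zip(row_a, row_b):
            d = a - b
            if d > tolerance:
                out_row.append((d, 0.0, 0.0))    # positive: red
            elif d < -tolerance:
                out_row.append((0.0, 0.0, -d))   # negative: blue
            else:
                out_row.append((0.0, 0.0, 0.0))  # equal: black
        result.append(out_row)
    return result

render_a = [[0.5, 0.25], [0.125, 0.125]]
render_b = [[0.25, 0.75], [0.125, 0.125]]
diff = signed_difference(render_a, render_b)
print(diff)
```

If two renders differ only by noise, the result is a fine speckle of both colors that averages out. Large regions of a single color mean one image is systematically brighter there.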

I admit that both of the images are pretty and I could not tell the difference by looking at each of them. It seems that E-Cycles used somewhat shorter light paths, given that the direct light is brighter and that the back of the wall is darker.

// bookworm note: Multiple Importance Sampling is an unbiased technique (as long as it combines unbiased estimators). Russian roulette is a somewhat risky way to get rid of the bias from path length because any limit to path length makes the integrator biased.
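
A toy illustration of that note, not Cycles code: the "light transport" here is just a geometric series in which each bounce keeps a made-up fraction `ALBEDO` of the energy. A fixed path-length limit drops the tail of the series, while Russian roulette stops paths at random and reweights the survivors so the average stays correct. All names and numbers are assumptions chosen for the example.

```python
import random

ALBEDO = 0.6  # hypothetical fraction of energy kept per bounce

def truncated(max_bounces):
    """Fixed path-length limit: drops every bounce beyond max_bounces."""
    return sum(ALBEDO ** k for k in range(max_bounces + 1))

def russian_roulette(rng, survive_prob=0.7):
    """One random path: stop with probability 1 - survive_prob, otherwise
    continue and divide the throughput by survive_prob to compensate."""
    total, throughput = 0.0, 1.0
    while True:
        total += throughput
        if rng.random() >= survive_prob:
            return total
        throughput *= ALBEDO / survive_prob

exact = 1.0 / (1.0 - ALBEDO)  # limit of the geometric series
rng = random.Random(0)        # fixed seed so the run is repeatable
num_paths = 20000
rr_mean = sum(russian_roulette(rng) for _ in range(num_paths)) / num_paths

print(f"exact            {exact:.3f}")
print(f"3-bounce limit   {truncated(3):.3f}  (always too dark)")
print(f"Russian roulette {rr_mean:.3f}  (noisy, but right on average)")
```

The price of Russian roulette is extra noise, which is the "risky" part: a badly chosen survival probability can increase variance a lot even though the expected value stays correct.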

(Komposthaufen) #60

Here is a more complex scene.

This is pretty much a worst-case test for raytracing: lots of glass, caustics, reflections from caustics, volumetric materials and many small light sources.

Again, GPU only (CPU has no performance improvements over the normal Blender builds).

Windows 10
Nvidia GTX 980 with 4GB
900 Samples

Normal Cycles: Time: 40:04.81 Peak Memory: 643.69M

E-Cycles Time: 25:13.23 Peak Memory: 655.70M
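
For reference, the two timings above can be turned into a speedup figure as follows. The helper `to_seconds` is just an illustrative name, and it assumes the times are in `MM:SS.ss` form as printed.

```python
def to_seconds(timestamp):
    """Convert an 'MM:SS.ss' render time string to seconds."""
    minutes, seconds = timestamp.split(":")
    return int(minutes) * 60 + float(seconds)

cycles = to_seconds("40:04.81")
e_cycles = to_seconds("25:13.23")

speedup = cycles / e_cycles          # how many times faster
time_saved = 1 - e_cycles / cycles   # fraction of render time saved
print(f"{speedup:.2f}x faster, {time_saved:.0%} less render time")
```

With these numbers that works out to roughly 1.59x, or about 37% less render time.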

The speedup is amazing.
E-Cycles indeed looks a lot cleaner at the same sample count, especially on the glass.

Cycles is always biased. As soon as any render engine is tracing shadow rays, it is actively looking for light sources and therefore it’s biased. This doesn’t mean that this is bad in any way, it’s actually smart.

(mathieu) #61

Interesting guesses, but false :slight_smile: If I used shorter paths, this would indeed be biased. It’s what my AO patch that will be in final 2.8 does. The only way to know is to do the course :wink:
Cleaner, faster, what more could you ask for? Even faster! The new build for RTX brings up to 12% on top. @Komposthaufen, you are welcome to install it and report.

(Komposthaufen) #62

I downloaded the RTX build and tested the same scene, but it’s significantly slower on my GTX 980 (maybe it’s too old).

RTX E-Cycles Time: 30:25.40 Peak Memory: 655.69M

The result is nearly the same as with E-Cycles, only a few fireflies are different.

@bliblubli Is this build using OptiX?

(mathieu) #63

No OptiX, it’s just that Nvidia fixed CUDA 10, so it now supports older cards with Cycles. So I compiled with it. The 1080Ti is 6 to 12% faster with it. Sadly, it does the contrary on the older generation. I’ll continue to provide both then, 9.1 and 10.0 CUDA kernels.


I don’t want to prove anything.
For me, it’s as simple as the fact that, of a 24h cycle, the bright part is the day and the dark part is the night: a visual difference while the same values are used (especially with one and the same engine) generally means something is wrong.
The last example (ayreon2) shows it even more.

Anyone trying to sell the same “tech” to any other render engine and its developers should be ready for a war :wink:

(mathieu) #65

Do you mean it’s bad because you have less noise for the same sample count, so it changes the image? Try 2.78 and latest 2.78 and compare images of the same render; you can get the same differences due to changes in the Russian roulette by Brecht. Or do you think Cycles should keep its noise level and noise pattern forever so that those image comparisons stay black?


There’s nothing to think about, because nothing was really explained.
To me the difference in brightness is just too much. Simple as that.
E.g., there were similar cases in Corona development (sampler) that introduced somewhat inconsistent results (though far less prominent), yet most were considered bugs and got fixed.

(mathieu) #67

I worked with a company making renderings for Zaha Hadid Architects and they found the renderings done with those patches good enough. So at least up to that level, it pleases clients. Maybe there are higher standards.

And I spoke with one official Cycles dev. He has the code and there is interest in getting the tech I use here (which comes from a research paper and which accounts for the changed noise pattern) into master at some point. You can then discuss these points during review. As I said, everything in here will be submitted for review after a year.

(mathieu) #68

Happy new year everyone,

you can now get only one month of updates if you want to.
And people joining during the first week get the 30% off for the first month, but also the following ones if they stay: https://gumroad.com/l/vkTeQ/66cx403

(BloQi) #69


I bought to try it out and it is really cool, works great with a 1080Ti
I don’t know if it has been mentioned before (if so, I’m sorry), but are there no default addons with it? Like Node Wrangler, etc.


I copied the addons from the official buildbots, it works.
The speedup is indeed insane, but the coolest part is the auto tile size. I found it so cumbersome to benchmark every scene for the optimal size. Why not simply remove the UI part, @bliblubli? It would make some room.

(mathieu) #71

Thanks for reporting. I added the contrib addons; you can find the new build on the product page.

@janbauer ok, it will be available in the next release.

(cactus1138) #72

Is there a way to switch off this automatic tile size?
In my case, with this feature it renders way longer than the standard build on GPU+CPU with 16/16 tiles.

(mathieu) #73


Yes, it’s possible; it will be added in the next build. However, you should really check if CPU+GPU is faster than GPU alone. Only if your CPU is much faster than your GPU will it bring something (like a GTX 1050 on a Threadripper).

Can you please report your times on BMW with CPU+GPU and with GPU only? It will be 16x16 anyway on this scene. https://download.blender.org/demo/test/BMW27_2.blend.zip

(beep) #74

Is it possible to get OSX builds?