Cycles Development Updates

Lower is better (median render time):

Radeon RX 5600 XT: 172.6775
GeForce GTX 1080 Ti: 233.708
Radeon RX 5500 XT: 263.773
GeForce GTX 1070 Ti: 292.179

Be careful with two important things when choosing a card. In addition to render time, also take into account the amount of VRAM on each card version. And regarding OpenCL, the times I show are the average/median times across all scenes and all OSes. Apparently people are reporting that OpenCL does not work equally well in all scenes; I didn't take the trouble to compare each scene individually, which is possible with the Blender benchmark filters.

Edit:
I don't want to influence card purchase decisions just by showing render times. Currently AMD users are having a lot of problems:
https://blenderartists.org/t/blender-2-83-lts-and-amd-gpu/1233087

Also, I think that for OpenCL it is still not possible to use a precompiled kernel, so the kernel has to be compiled every time you open a scene (one of the reasons why I would not use an AMD GPU for Cycles).

If the rumours are correct and Big Navi is effectively two 5700 XTs with hardware ray tracing, AMD could be a very interesting choice. We can see that Navi's compute performance is good in some scenes, and with ray tracing we can probably expect a healthy boost, like OptiX gave Nvidia in Cycles.

Now that OptiX works with Pascal GPUs, I've noticed it compiling kernels when the scene is opened too. For comparison:

2080 Ti OptiX (CUDA) vs 5700 XT:
BMW: 17s (35s) vs 78s
Classroom: 72s (113s) vs 162s
Fishy Cat: 32s (72s) vs 116s
Koro: 55s (115s) vs 76s
Barbershop: 815s (490s) vs 302s
Pavillion: 93s (217s) vs 474s

OptiX is generally around 2x faster than CUDA, except for the Barbershop scene, where CUDA is significantly faster.

When the price of the 5700 XT is factored in, it stacks up extremely well against CUDA and thrashes the 2080 Ti in the Barbershop bench; I'd love to know why that is in this scene. However, Pavillion is extremely slow. Again, I'd love to know why there is such a huge disparity in the results.

The 5700 XT is approx £425 vs £1250–£1635 for the 2080 Ti in the UK. Looking at the game console tech, I have a feeling AMD have made a big step forward with RDNA2, and unless Nvidia have a rabbit to pull out of the hat, AMD could be very tempting even with the kernel compiling issues.

By Patrick Mours: the OptiX denoiser now works for viewport and final render with all features when CPU or CUDA devices are selected in the System settings, regardless of the features supported by the OptiX renderer (as long as you have a compatible Nvidia GPU).
https://developer.blender.org/rB1562c9f031538219da30404a64e2a187560e5e3c
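
For anyone who wants to try it from the Python console, here is a minimal sketch of setting this up. The property names below are from the 2.90 builds, so worth double-checking against your version:

```python
import bpy

scene = bpy.context.scene
scene.render.engine = 'CYCLES'
scene.cycles.device = 'CPU'  # render on the CPU (or CUDA)...

# ...while still denoising with OptiX, for the final render
scene.cycles.use_denoising = True
scene.cycles.denoiser = 'OPTIX'

# ...and for the viewport
scene.cycles.use_preview_denoising = True
scene.cycles.preview_denoiser = 'OPTIX'
```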


OptiX only has to build the kernel once for a particular Blender version + Nvidia driver combo. After that it’s loaded from the cache of last time you built it.


This is also a ridiculous kernel issue with LuxCore: compiling the kernels takes ages.

I cannot in good faith ask my students to do this.
All students are advised to only use Nvidia cards and to stick with Windows.

In my 3D photogrammetry app, Metashape, the AMD 5600 XT performs very well, much like the GTX 1070 Ti (I have both).

Did I read that right, that you can now run OptiX even with a CPU?

Blender 2.9 also has the Intel viewport denoiser, which can help a lot when setting up lights.
Compared to Enscape, Cycles is still not really that fast.

Seeing the viewport denoiser at work now, I'm sometimes curious about the usefulness of using Eevee for interior/archviz.

No, it's not like that. The OptiX AI denoiser always runs on the GPU, and you must have a compatible Nvidia GPU. The recent change is that you can use the OptiX denoiser regardless of the device you render with in Cycles (CPU, CPU+CUDA, or OptiX). And from tonight's builds, the OptiX denoiser will work even if you use Bevel or AO nodes with CPU or CUDA (nodes which are not currently supported by the OptiX renderer).

Edit:
I have edited the previous message to hopefully make this clearer.


Also, both OptiX acceleration and the OptiX denoiser now work on non-RTX cards. In some scenes CUDA is faster and in others OptiX is; the oldest card that can use it is the GTX 750.
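
If anyone wants to flip it from a script instead of the preferences UI, something like this should do it (paths as in the 2.9x Python API, as far as I know):

```python
import bpy

# Select OptiX instead of CUDA as the Cycles compute backend
prefs = bpy.context.preferences.addons['cycles'].preferences
prefs.compute_device_type = 'OPTIX'
prefs.get_devices()  # refresh the detected device list

# Enable every detected device and print what was found
for device in prefs.devices:
    device.use = True
    print(device.name, device.type, device.use)

bpy.context.scene.cycles.device = 'GPU'
```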

So why can't OpenCL do this? Is it a limitation of the API, or has caching just not been coded by the devs?

It can; LuxCore now does that for OpenCL and CUDA on first run. It could be done for Cycles as well, but it means every parameter has to be an adjustable variable in the kernel rather than being passed as a compile option. That is the change LuxCore had to make to remove per-render compilation (a sketch of the difference below).
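
To make that concrete, here is a minimal sketch of the two approaches using pyopencl, with a made-up USE_SUBSURFACE toggle rather than any real Cycles/LuxCore kernel code. In the first variant the setting is baked in as a compile option, so changing it forces a rebuild; in the second it is a kernel argument, so one cached binary covers both cases:

```python
import pyopencl as cl

ctx = cl.create_some_context()

# Variant 1: setting baked in at compile time. Toggling USE_SUBSURFACE
# produces a different binary, so it cannot be reused across scenes.
src_baked = """
__kernel void shade(__global float *out) {
    int i = get_global_id(0);
#ifdef USE_SUBSURFACE
    out[i] = 2.0f;  /* extra code path compiled in */
#else
    out[i] = 1.0f;
#endif
}
"""
prog_baked = cl.Program(ctx, src_baked).build(options=["-DUSE_SUBSURFACE"])

# Variant 2: the same setting as a runtime argument. Compiled once,
# cached, and only the argument changes between renders.
src_uniform = """
__kernel void shade(__global float *out, const int use_subsurface) {
    int i = get_global_id(0);
    out[i] = use_subsurface ? 2.0f : 1.0f;  /* branch at runtime */
}
"""
prog_uniform = cl.Program(ctx, src_uniform).build()
```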

Even on a GTX 1070 Ti, OptiX is pure madness to work with.
You see a lot of detail lost initially, but for exploring lighting, in interior renders for example, this is killer.


But didn't AMD even help with Cycles? LuxCore now also starting to look into CUDA is kind of great news too. It seems it is simply a more optimised technology than OpenCL.

There is also talk about LuxCore getting Metal API support. I am curious whether that could also happen for Cycles, since OpenCL on macOS is being phased out.

Switching to a 'mega kernel' did have a performance impact on simple scenes that use few materials/textures: the one-off compile takes significantly longer than the conditional compilation did. For complex scenes the performance difference is minimal, since most code paths are used anyway.

Not recompiling for each frame that has a material/texture change also saves time when rendering frames that require a lower sample count.

I hope that with the addition of ray tracing hardware in RDNA2, AMD's OpenCL 3.0 (essentially 1.2) drivers are improved at the driver level, not just by AMD assisting with coding the applications.

The problem I often see is the need to recompile the kernel, and that it takes a long time, making OpenCL a no-go for my students anyway.

This kinda sucks because LuxCore is seriously amazing now.

1 Like

Even on my 970 it's amazing (although to be honest I don't know if it's using the CPU or GPU :smiley: ). Nvidia have done an incredible job, and the fact that they went back and added support for old cards, plus all the other cool features they've added to the RTX cards since launch (RTX Voice and DLSS 2.0, for example), makes me feel much more likely to get another Nvidia card than an AMD one, unless AMD bring something pretty special to the table with RDNA 2.

Thanks for that.

I've never used Cycles with OpenCL, but I'm open to switching to AMD if Big Navi delivers. The ballache with kernel compilation has dampened my enthusiasm, though. Does the kernel need compiling every frame if parameters change?

If it's true that the new AMD cards will come with ray tracing support, it would be nice if the Blender and AMD devs worked together so that Cycles supports precompiled OpenCL kernels. This is a point that really matters and that many of us take into account when choosing a card for Cycles.
Anyway, I'm not sure this forum is the best way to make the request visible to AMD devs.


Eh, Big Navi isn't going up against the current Nvidia cards, it's going up against Ampere/RTX 3000, which is rumoured to have pretty substantial ray tracing improvements too.

I know exactly which GPUs Big Navi is going up against thanks. I have no idea how you could’ve interpreted my comment that way.

Hardware ray tracing can only speed up the ray intersection part of the path tracer, and most production scenes are still shading limited. So even if the RTX 3000 series RT hardware is many times faster than Turing's, the actual difference in render times is likely not going to be that significant.

That is actually interesting to read and be aware of.

Honestly, I think the biggest render time saver at the moment is simply denoising.

If you have a scene that is 50/50 intersection limited and shading limited, infinitely fast ray tracing hardware can only cut the render time by 50% at most; the scene still needs to be shaded. RT hardware can make a big difference when the software intersection engine is not as good as it could be. Redshift's was excellent, which gave it a huge advantage over other renderers; Cycles benefits greatly from OptiX, so the gap has closed.
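
That ceiling is just Amdahl's law applied to the intersection/shading split. A quick back-of-the-envelope sketch (the fractions are illustrative, not measurements):

```python
def render_speedup(intersect_fraction, rt_speedup):
    """Overall speedup when only ray intersection is accelerated."""
    remaining = (1 - intersect_fraction) + intersect_fraction / rt_speedup
    return 1 / remaining

# 50/50 scene with infinitely fast RT hardware: at most 2x faster,
# i.e. render time halves, because the shading half is untouched.
print(render_speedup(0.5, float("inf")))  # 2.0

# More realistic 4x faster intersection on the same scene:
print(render_speedup(0.5, 4.0))  # 1.6
```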

I'm talking about path tracing here; of course game-style ray tracing will benefit from as much RT hardware as possible.

Denoising tech and RT hardware have completely robbed Redshift of its unique selling point: its speed. E-Cycles is more than a match for Redshift, normal Cycles with the recent adaptive sampling and denoiser is now much closer to Redshift, and Cycles' node system is vastly more powerful than Redshift's.

There's much more to come from machine learning too, especially upscaling, where we could render at 1080p but have it upscaled to 4K looking visually indistinguishable from real 4K. Then there's checkerboard rendering, where only every other pixel is rendered and ML fills in the gaps; I understand the next-gen game consoles make heavy use of this technique for their ray tracing (see the sketch below).
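
To make the checkerboard idea concrete, here is a toy NumPy sketch; real implementations reconstruct the missing pixels with a trained network (and motion vectors), not the naive neighbour average used here:

```python
import numpy as np

h, w = 4, 8
frame = np.random.rand(h, w)  # stand-in for a fully rendered frame

# Checkerboard mask: only half the pixels get "rendered"
yy, xx = np.mgrid[0:h, 0:w]
rendered = (yy + xx) % 2 == 0
sparse = np.where(rendered, frame, 0.0)

# Toy reconstruction: fill each hole with the mean of its left/right
# neighbours (both are rendered pixels; np.roll wraps at the borders)
left = np.roll(sparse, 1, axis=1)
right = np.roll(sparse, -1, axis=1)
recon = np.where(rendered, sparse, 0.5 * (left + right))
```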

Machine Learning is perhaps more interesting than ray tracing hardware in the pursuit of reducing render times.
