Tested: Optix for GTX - The Good and the Bad

Optix for Nvidia GTX Video Cards is Here: The Good and the Bad

Optix has shown impressive acceleration in Cycles rendering (30-45% faster than CUDA in my testing), and in denoising, but these features were only available for Nvidia’s RTX cards with their ray tracing RT cores.

Blender 2.90 now includes Optix support for Nvidia GTX video cards (actually any Maxwell or newer card). In Blender this affects two things for GTX owners: viewport denoising and Cycles rendering.

Optix AI-Accelerated viewport denoising is the most apparent change. With very few samples, you can get a clean ray traced image (though with painting-like artifacts), even on an older video card. This makes working with ray traced scenes much more accessible. GTX and RTX owners alike will love this feature. Note that Optix AI-Accelerated denoising only works with Nvidia video cards. The Open Image Denoiser (Intel denoiser) is also coming for Blender 2.90 (viewport and render process) and will work with any video card and CPU.

The other area where GTX can now use Optix is Cycles rendering. In the Blender 2.90 alpha, Optix rendering is now enabled for GTX-series cards. When Optix uses the RT cores in an RTX card, the gains are huge: 30-45% faster in my testing. Without RT cores, I was curious to know whether using Optix for rendering would have any benefit for GTX users.

For this test I used a Zotac Nvidia GTX 1060 6GB video card on an Intel H110 chipset motherboard with a Celeron G3920 CPU and 16GB of RAM running Ubuntu 19.10.

Setup

  • Asrock H110 BTC+ Pro motherboard
  • Intel Celeron G3920 dual core CPU
  • 16GB DDR4 dual channel RAM
  • Zotac Nvidia GeForce GTX 1060 6GB (PCI-e x16)
  • 800W Raidmax Gold PSU
  • Ubuntu 19.10 x64 with Nvidia GeForce drivers 440.xx
  • Blender 2.90 alpha (06/06/2020 build)

Method

I tested the Classroom and Junk Shop demo scenes at 50 samples per pixel and 500 samples with CUDA or Optix rendering. Denoising was off, and all other settings were the default of the demo files.
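The test matrix above (two scenes, two sample counts, two backends) could be scripted roughly as follows. This is only a sketch, not the procedure used for these results: the `.blend` file names are placeholders, and it assumes Blender's background mode with the Cycles `--cycles-device` argument and a `--python-expr` override for the sample count.

```python
# Sketch: build the command lines for a CUDA-vs-Optix benchmark matrix.
# The scene file names below are hypothetical placeholders.
import shlex
from itertools import product

SCENES = ["classroom.blend", "junk_shop.blend"]  # placeholder file names
SAMPLES = [50, 500]
DEVICES = ["CUDA", "OPTIX"]


def render_command(scene, samples, device, frame=1):
    """Return the argument list for one background render."""
    set_samples = f"import bpy; bpy.context.scene.cycles.samples = {samples}"
    return [
        "blender", "-b", scene,
        "--python-expr", set_samples,  # placed before -f so it applies to the render
        "-f", str(frame),
        "--", "--cycles-device", device,
    ]


commands = [
    render_command(scene, samples, device)
    for scene, samples, device in product(SCENES, SAMPLES, DEVICES)
]

for cmd in commands:
    print(shlex.join(cmd))
```

Each printed line can be run in a shell and timed. Running every combination twice and keeping the second time avoids counting one-off kernel compilation.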

Caveats

  • As always, the scenes you typically work with will be different from the demo scenes, so you may see different results in your work.
  • This was tested with Blender 2.90 alpha. Comparing CUDA rendering times between 2.83 and 2.90 alpha did not show any significant difference in render times, but as 2.90 is in alpha there could be changes by the time it is released.

Results

  • Without RT cores, GTX cards do not benefit from any acceleration over CUDA when rendering with Optix
  • Optix rendering with GTX was generally slower than CUDA rendering, in the range of 5-10% depending on the scene
  • Only in the 50 samples Junk Shop scene test was Optix faster, and then only by 5% over CUDA
  • The number of samples had almost no impact on the results

Thoughts

The addition of Nvidia’s AI-Denoising in the viewport is a very useful addition. It works well on GTX cards, and is a great feature to bring to Blender users.

Optix for Cycles rendering, however, probably should not have been enabled for GTX. In most cases, it makes renders slower. The performance hit is small, so there isn’t a big risk to GTX users who choose Optix thinking that it will accelerate their renders, but there’s no benefit either. Perhaps there are some specialized situations where Optix rendering on GTX would make sense, but these benchmarks do not show that.

As usual, if there are more things you’d like me to test, let me know in the comments below.

Hi.

Are you sure about this? Cycles with GPU Only also uses the CPU. On my graphical CPU usage monitor, with GPU rendering (CUDA or OptiX) in the viewport, CPU usage appears to be very similar with and without OptiX viewport denoising.

On computers with multiple GPUs, for example a mix of GTX and RTX cards, I suppose it is possible to select them all under the OptiX item in Preferences > System (although I’m not sure that this is still implemented; I only have a single GTX card and I can’t verify it).

Is that repeatable or just margin of error? Considering that the others were worse, I suppose it must not be margin of error, but I’m still curious.

I have a GTX 980 and an RTX 2070, so I can test using Optix with both. I am limited with the 980 as it’s also handling my displays, so I lose speed and VRAM. I wasn’t able to render the Junk Shop as it requires more VRAM than my 980 has. I was able to render the Classroom with these results…

Just the 980:

CUDA - 07:55.70
Optix - 09:28.27

Just the 2070:

CUDA - 04:43.32
Optix - 02:40.38

Both cards:

CUDA - 03:04.57
Optix - 02:09.78

I did notice that it takes a long time to load the Optix kernel into the cards, probably because it’s being compiled. I ran those tests twice and took the second time to remove the time it takes to process and load the kernel.
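To put the Classroom timings above on the same percentage scale as the original post, the change in render time can be computed directly from the reported values. A minimal sketch (the helper names are made up for illustration; the times are the ones listed above):

```python
def to_seconds(stamp):
    """Convert a 'MM:SS.ff' render time to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + float(seconds)


def optix_change_percent(cuda, optix):
    """Render-time change of Optix relative to CUDA; negative means faster."""
    c, o = to_seconds(cuda), to_seconds(optix)
    return (o - c) / c * 100


# (CUDA time, Optix time) as reported for the Classroom scene
results = {
    "GTX 980": ("07:55.70", "09:28.27"),
    "RTX 2070": ("04:43.32", "02:40.38"),
    "Both cards": ("03:04.57", "02:09.78"),
}

for name, (cuda, optix) in results.items():
    print(f"{name}: {optix_change_percent(cuda, optix):+.1f}%")
```

With these numbers, Optix takes about 19% longer than CUDA on the GTX 980, about 43% less time on the RTX 2070, and about 30% less time with both cards together.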

My system is running Linux Mint 19.3, Intel i7 5820k at 3.8 GHz, and 32 GB of memory. Blender version blender-2.90.0-281319653e5b-linux64.

I’m not 100% sure about the Optix denoiser using the CPU. I’ve asked on the Nvidia dev forums for some clarification on how the fallback works. My guess is that if the GPU does not have RT or Tensor cores, it would fall back to the CPU. I’m not sure how well optimized the stream processors are for learning in this case. It could very well be that it goes RTX->GTX->CPU.

And, yes, just like in CUDA, you can select a mix of RTX and GTX in Optix when choosing rendering devices. The Blender code in 2.90 doesn’t really differentiate between RTX and GTX, it now just looks to see if it’s Maxwell or newer.

I did run it a couple of times and got the same result within a second or so. I’m not sure why that particular scene at that spp would be slightly faster, but in some early testing sebastian_k on the Blender dev forums ran a test and got, “1 GTX 1080 on CUDA: 7:47 min and 1 GTX 1080 on OPTIX: 7:34”. That’s a difference of 2.8%.

Out of curiosity I decided to try another scene, this one the apples tutorial scene from CGBoost. I tried it at 50 samples on a 1060 6GB:
CUDA: 4m30s
Optix: 4m41s

One other thing that I have not tested is changing the tile sizes. In testing these scenes it seems like 128x128 was generally optimal and I wanted to keep it apples to apples as much as possible so I used the same tile size for CUDA and Optix. Some people have suggested that larger tile sizes like 512x512 with Optix might be faster. I tried that with the Apples scene. It rendered in 4m43s, so it did not appear to make a difference.

But is this actually possible with an AMD graphics card?

I use a GeForce GTX 1660 graphics card and am running Blender v2.90 Alpha (2020 June 09) under Windows 10 64-bit.
The Nvidia video driver version is 432.00.

In Blender [ Preferences / System ], when I select Optix, the following message is displayed:
No compatible GPUs found for path tracing. Cycles will render on the CPU

Does this mean that this graphics card cannot use Optix?

Hi.

It should work in Blender 2.90. Download the latest available drivers from the Nvidia site, and preferably choose the clean install option in the installer.

No.
But there are plans to make it possible to use OIDN on the CPU for viewport denoising. Much slower, but at least an alternative for those who are not Nvidia users.

Thanks for sharing this. I wouldn’t have expected much performance gain in Cycles, but it’s great we’ll be able to use it for viewport AI denoising. Looking forward to testing on my own scenes when I get a chance.

I updated the article to say that you need an Nvidia GPU for Optix denoise. I had read something about it working on CPU and thought it worked like CUDA’s heterogeneous compute mode, but that was wrong. According to the Nvidia devs on the dev forum Optix denoise will not work on CPU. It will only work on Nvidia GPUs. If Optix cannot find Tensor cores, it will just use the stream processors.

On Windows you need driver version 435.80 or newer. On Linux it is 440.59, I believe. Try upgrading your drivers.
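A quick way to check an installed driver against those minimums is to compare the version numbers part by part rather than as decimals. The sketch below uses the two minimum versions quoted above as assumptions (the Linux figure was given as "I believe"); the function names are made up for illustration. On most systems the installed version can be read with `nvidia-smi --query-gpu=driver_version --format=csv,noheader`.

```python
# Minimum driver versions as quoted in this thread; treat them as assumptions.
MIN_DRIVER = {"windows": (435, 80), "linux": (440, 59)}


def parse_version(text):
    """Turn '446.14' into (446, 14) so versions compare part by part."""
    return tuple(int(part) for part in text.strip().split("."))


def driver_ok(installed, platform):
    """True if the installed driver meets the quoted minimum for the platform."""
    return parse_version(installed) >= MIN_DRIVER[platform]


print(driver_ok("432.00", "windows"))  # the driver from the question above
print(driver_ok("446.14", "windows"))  # a newer driver
```

Comparing tuples instead of floats matters for versions like 440.100, which is newer than 440.59 but would compare as smaller if read as a decimal number.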

Unfortunately it will not work with AMD cards. Hardware ray tracing is coming with AMD’s RDNA 2 GPUs later this year, and with it there will likely be a similar solution. Meanwhile, the Open Image Denoiser (OIDN) is being added to 2.90 but is not yet in the alpha. It will work on any system because it’s CPU-based. For a lot of people it produces higher quality as well. The downside is that it is slower than Optix. A fast CPU will still be slower than a GTX 1080, for instance.

I have a 5500 XT!
Is it good for rendering with Blender?

Got the latest video driver (version 446.14) and it works for Blender v2.90.
But it seems only a small portion of the GPU capacity was used (about 5% ~ 10%) and the CPU is running in single thread.

Not sure about that one. Did you restart after the install? I assume that you’ve set the Cycles render devices to Optix in the System Preferences in Blender? Also note that it may take a few minutes for the Optix kernels to compile before it starts rendering. Are you seeing low GPU usage even when the tiles are being rendered and you can see them in the render window?

The 5500 XT is about the same performance as a GTX 1660. It’s not as fast as RTX cards with their RT cores, but yours is a good card for the price, especially if you got the 8GB version. It will probably be 10x faster than your CPU. The only downside is that you cannot use Optix denoiser in the viewport, but soon you can use the Open Image Denoiser with your CPU. If you want more speed, you can upgrade to a new card in September or October when new cards come out. AMD’s next generation cards (RDNA2) will be much faster and have hardware ray tracing.

Also, you can always use SheepIt, which is a free renderfarm, if you have a project that is too slow on your card.

How does SheepIt work?

what’s this?

That’s not true - according to my own testing at least.

If you are rendering using tiles (which is necessary for larger scenes or high resolutions on cards with lower GPU memory), Optix can give some very significant speed increases over CUDA, especially when coupled with adaptive sampling.

I did find that the benefits are scene specific though - so in some cases CUDA might be better, whereas in others Optix might give an advantage - depending on the number of tiles. I found that Optix was always comparable or faster than CUDA when rendering with between 12 and 48 tiles.
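For reference, the number of tiles follows from the render resolution and the tile size, with partial tiles at the edges counting as whole tiles. A minimal sketch, assuming a 1920x1080 render (the resolution is an assumption, not something stated in this thread):

```python
import math


def tile_count(width, height, tile_w, tile_h):
    """Number of tiles needed to cover the frame, counting partial edge tiles."""
    return math.ceil(width / tile_w) * math.ceil(height / tile_h)


# Assumed 1920x1080 frame with the tile sizes discussed in this thread
for size in (128, 256, 512):
    print(f"{size}x{size}: {tile_count(1920, 1080, size, size)} tiles")
```

At that resolution, 128x128 tiles give 135 tiles, while 256x256 and 512x512 give 40 and 12, which fall inside the 12 to 48 range mentioned above.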

But now I’m trying GPU rendering with my AMD graphics card and it only uses 5% of the GPU.