Cycles Nvidia Ultimate Benchmark Spreadsheet

Hey guys… I was curious about the performance of recent cards on blender, in comparison to the old and famous gtx580s that were so great on cycles. This sheet only contains Nvidia Cards starting from 440s, and including Quadros.

To have a clear picture I used the old benchmark spreadsheet that is still being used after 5 years and is the most complete source of information about various card models, cpus, OSs. I made an average of up to 16 results (with different configurations) for every card, so that we could have a general notion of how fast it is to render the scene and placed the results in a graph to be better visualized.
The only card that is not there is the (now) recent 1080Ti, but I asked another blender user to benchmark it :).
There are probably some unrealistic results mixed in there, especially with cards that had only a single test made (so no average there).
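For anyone who wants to redo the averaging, here is a minimal sketch of how it could be done per card, assuming the sheet is exported as simple (card, seconds) rows. The numbers below are made-up placeholders, not values from the spreadsheet, and `summarize` is just an illustrative helper name. It also reports the median and flags cards with only a single result.

```python
from collections import defaultdict
from statistics import mean, median

# Placeholder rows (card, render time in seconds). These are NOT real
# spreadsheet values, they only illustrate the shape of the data.
rows = [
    ("GTX 580", 42.0),
    ("GTX 580", 45.0),
    ("GTX 580", 48.0),
    ("GTX 970", 50.0),
]

def summarize(rows):
    """Group render times by card and compute count, mean and median."""
    by_card = defaultdict(list)
    for card, seconds in rows:
        by_card[card].append(seconds)
    return {
        card: {
            "n": len(times),
            "mean": mean(times),
            "median": median(times),
            # A single result is not an average at all, so flag it.
            "single_result": len(times) == 1,
        }
        for card, times in by_card.items()
    }

summary = summarize(rows)
for card, s in summary.items():
    note = " (single result, treat with caution)" if s["single_result"] else ""
    print(f"{card}: n={s['n']} mean={s['mean']:.1f}s median={s['median']:.1f}s{note}")
```

Showing the sample count next to each average makes it easier to see which bars in the graph rest on one submission only.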

This is the spreadsheet on gdocs:

(not sure if the image will be visible, open it in a new tab with scroll-click)


Some curiosities:
-580s were launched by Nvidia in 2010. Now it’s 2017 and they still have better performance than even Titans and 1080s.
-Someone tested a rig with 6 gtx680s. And they still didn’t outperform a 4x580s SLI.
-Quadros in general apparently suck for CUDA rendering. They started getting better with K5000 and upwards, but you can buy a TitanZ for half the price and get around 12% faster renders.
-460s and 470s performed better than a lot of recent cards like 970s, 750Tis, etc.

Hi, this was a lot of work for sure. :slight_smile:
The problem here is the results of the older cards are mostly done with older Blender versions.
These older versions had fewer features and therefore they were a lot faster.
My benchmark with only 2.73 showed this > https://blenderartists.org/forum/attachment.php?attachmentid=370114&d=1427653408
Anyway, the GTX 580s are still fast cards, even if they sound like a helicopter. :slight_smile:

Cheers, mib

Totally true… disabling volumetrics, hair, baking and cmj makes the gtx580s kickass cards… just with limited ram.

We still do this to make the 1080s even faster :wink:

Just remember, that the 5xx series and below are end of life for CUDA toolkit support… expect compatibility for them to disappear from blender when the next toolkit drops.

Hi, thanks for the feedback!
I’m not so sure about your argument though… First, we would have to assume that Cycles got slower with time, which as far as I know it didn’t; on the contrary, we just had a performance update on it. It got new features, true, but you forget that the file used on that spreadsheet is only basic meshes and basic shaders, glossy and diffuse, that have been present ever since Cycles was first programmed. There is no SSS, volumetrics, etc., things that would slow the render down.
Or are you saying that even on those basic shaders the Cycles renderer still got slower since Blender 2.6x? This would be a massive regression… things are supposed to be optimized and get faster, not slower.

And another thing, like I said, to get those render values I made an AVERAGE of up to 16 different render times, that among other things (like processor, OS and memory), use different Blender versions. For example, for the gtx580 these are some of the results that I averaged:

4x580 EVGA - Blender 2.70
2x580 gainward - Blender 2.75a
580 Asus - Blender 2.62
580 Asus - Blender 2.68
580 Asus - Blender 2.69
580 Club3D - Blender 2.66
580 EVGA - Blender 2.67
580 EVGA - Blender 2.63
580 - Blender 2.68
580 MSI - Blender 2.77
580 - Blender 2.78

And for the Titan, Titan X, and Z, these were the blender versions:

4xTitan Black - Blender 2.76
4xTitan - Blender 2.70
2xTitan - Blender 2.66
TitanX - Blender 2.74
Titan - Blender 2.67
Titan Black - Blender 2.73
Titan Black - Blender 2.74
Titan - Blender 2.68
Titan - Blender 2.71
Titan - Blender 2.75a
2xTitan - Blender 2.69
2xTitanZ - Blender 2.71
2xTitan - Blender 2.68
TitanX - Blender 2.71

So as you can see, there is a plethora of different versions, and both old and new cards have a fair share of old and new Blender versions, which gives us a general picture of how much better one performs than the other across systems and versions. Of course the best would be a static bench with all the Nvidia cards and all the Blender versions to be consistent (even though it wouldn’t then represent different hardware advantages on different cards), but since that is impossible, we do what we can with what we have :smiley:

But with that being said, your benchmark is super important, I’d go as far as trying to add those results to the old spreadsheet to have even more data there.

Do you own all these cards or did you rely on other users?

Cycles did get slower over time. The kernel used to be much smaller, which by itself makes a difference. Some changes also caused performance regressions, e.g. intersection is now better (“watertight”), but slightly slower (iirc). Some optimizations here and there (many of which only affected CPU) did not outweigh this.

You can either trust several people telling you this, or you can perform your own tests, all versions of Blender are available here.

Secondly, unless you increase the default tile sizes, newer GPUs will take a massive performance hit. There often is no telling from the spreadsheet what settings have been used.

580s were launched by Nvidia in 2010. Now it’s 2017 and they still have better performance than even Titans and 1080s.

This is not true. The 680 was a regression from the 580, but the 780 already was more than 50% faster than the 580. The 1080 and Titans are much faster still.

Again, you must perform the tests with sufficiently large tile sizes (256x256+). In earlier versions, tile sizes were calculated differently, so the problem didn’t exist.
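As a rough sketch of what that means in practice: in the 2.7x Python API the tile size lives in `scene.render.tile_x` / `scene.render.tile_y`, and the Cycles device in `scene.cycles.device`. The helper name `use_large_gpu_tiles` and the stand-in scene object below are my own assumptions, so the snippet can run outside Blender; inside Blender you would pass `bpy.context.scene` instead.

```python
from types import SimpleNamespace

def use_large_gpu_tiles(scene, size=256):
    """Switch a 2.7x-style scene to GPU rendering with square tiles."""
    scene.cycles.device = 'GPU'
    scene.render.tile_x = size
    scene.render.tile_y = size
    return scene

# Stand-in for bpy.context.scene so the sketch runs without Blender.
# The starting values here are arbitrary placeholders.
fake_scene = SimpleNamespace(
    render=SimpleNamespace(tile_x=64, tile_y=64),
    cycles=SimpleNamespace(device='CPU'),
)

use_large_gpu_tiles(fake_scene)
print(fake_scene.render.tile_x, fake_scene.render.tile_y, fake_scene.cycles.device)
```

Note that the compute device itself (CUDA/OpenCL) still has to be enabled in the user preferences; this only covers the per-scene settings.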

There was a known bug in Blender with Titan and Ti cards in some of those versions though - a bug which resulted in massively increased render times. Some of the benchmark threads discuss this.

The bug means that any average that includes those blender versions will be unfairly skewed.
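To illustrate the skew, here is a small sketch with made-up numbers. Which Blender versions were actually affected is not stated here, so the `AFFECTED` set is purely a placeholder assumption, as are the render times.

```python
from statistics import mean, median

# Placeholder (blender_version, seconds) results for one card.
# The 240 s entry stands in for a run hit by the bug.
results = [("2.70", 60.0), ("2.71", 240.0), ("2.73", 62.0), ("2.75a", 58.0)]

# Assumption for illustration only: pretend 2.71 was an affected version.
AFFECTED = {"2.71"}

def mean_excluding(results, affected):
    """Average only the runs from versions not in the affected set."""
    kept = [seconds for version, seconds in results if version not in affected]
    return mean(kept)

all_times = [seconds for _, seconds in results]
mean_all = mean(all_times)                        # pulled up by the outlier
median_all = median(all_times)                    # much less sensitive to it
mean_clean = mean_excluding(results, AFFECTED)    # outlier removed

print(f"mean (all): {mean_all:.1f}s")
print(f"median (all): {median_all:.1f}s")
print(f"mean (affected versions excluded): {mean_clean:.1f}s")
```

With these placeholder values a single bad run nearly doubles the plain mean, while the median and the filtered mean stay close to the typical time.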

Here’s some (incomplete) evidence for Cycles getting faster over time on CPUs and GPUs:


https://brechtvl.github.io/cycles-ci/

Comparing GPU models is difficult though, you need much more carefully controlled benchmarking to get good data there.

The CPU is also important. A weak CPU becomes a bottleneck for a strong GPU during rendering. All I know is that the stock gtx980ti paired with stock i7 6600k (I use it at work) is ca. 2.5 times faster than my factory OC-ed gtx580 3gb working with OC-ed i5 2500k (4200 MHz).

I might suggest making the Titans clearer. We’ve now got the original Titan X (Maxwell), a Titan X (Pascal) and the newest, a Titan Xp.
I’d go with something like:
Titan Xm
Titan X
Titan Xp

Frikken Nvidia…

Are you sure Cycles is getting faster? I’ve tested a very simple test scene (just 2 suzanne models), comparing an old version of Cycles from 2011 (v2.60.1) with a more recent version (v2.78) on a Geforce GTX 1080 and the latest Cycles is significantly slower which was a bit disappointing:

  • 3.5 seconds in v2.60.1
  • 6.8 seconds in v2.78

The test scene can be downloaded from here

Are new features causing the slowdown?

Look, it doesn’t matter if Cycles is getting faster or slower. If you are comparing different cards running in different machines, on different OS’s, etc, you’re going to get wildly different and very unscientific results. This is not a definitive test by any means. You need to have all the cards tested in the same environment, on the same CPU, same memory, same MB, same OS, same drivers, same Blender version, same, same, same. You’d be much better off borrowing those cards and testing them in your machine against the newer cards. Seriously, it’s interesting and amusing to see these results but take them with a huge grain of salt.

Or not… I’m sure you guys can sell your 1080 ti’s for a trunk load of 580’s and then you’ll know for sure. And then you can say to everyone, “Boy, they sure don’t make’m like they used to!”

Oh and what was the scene that they used to test? The BMW scene has had multiple iterations over the years. Single car, small tile size, all the way up to two cars with squared sampling. I’m telling you, this is a mess.

What makes the 580 fast with Cycles then? More CUDA cores, even compared to the new generation?

And I’m wondering, would an upgrade from a 970 to a 1080 Ti bring 3 times more power in Cycles on average?

@weee
Your best bet is to look around at all the comparisons, benchmarks and other sites to get a “feel” for the % gain.
e.g. http://gpu.userbenchmark.com/Compare/Nvidia-GTX-970-vs-Nvidia-GTX-1080-Ti/2577vs3918

Remember that older NVidia cards also have less memory, lower supported resolutions (e.g. some can only do 2k) and lose out on 10 series improvements in latest NVidia drivers/sdk’s.

It would be nice if Blender had a benchmark built in, as people love to benchmark Blender releases. It could save your results for comparison against newer versions or after you upgrade some hardware. The results table should be live, just like 3DMark etc., and include render settings, tile sizes, driver version, OS, CPU/GPU etc., and allow you to pull up any old result and apply (if possible) the same settings to your current Blender version. Without this, everybody is running amok with different versions of blends and extra tweaks they forgot to mention to beat the card above it.
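A rough sketch of what one stored result could look like, just to make the idea concrete. None of this exists in Blender; `BenchResult`, its field names and `comparable` are hypothetical, and all values are placeholders.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class BenchResult:
    """One benchmark submission with the settings needed to reproduce it."""
    gpu: str
    cpu: str
    os_name: str
    driver: str
    blender_version: str
    tile_size: int
    samples: int
    seconds: float

def comparable(a, b, ignore=("gpu", "seconds")):
    """True if two results differ only in the fields listed in `ignore`."""
    da, db = asdict(a), asdict(b)
    return all(da[key] == db[key] for key in da if key not in ignore)

# Placeholder submissions (made-up values, not measurements).
base = dict(cpu="CPU A", os_name="OS A", driver="driver A",
            tile_size=256, samples=200)
r1 = BenchResult(gpu="GTX 580", blender_version="2.78", seconds=100.0, **base)
r2 = BenchResult(gpu="GTX 1080", blender_version="2.78", seconds=80.0, **base)
r3 = BenchResult(gpu="GTX 1080", blender_version="2.62", seconds=70.0, **base)

print(comparable(r1, r2))  # same environment, only the card differs
print(comparable(r1, r3))  # different Blender version, not a fair comparison
```

The point of the design is that two cards only get ranked against each other when everything except the card matches.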

It’s not clear to me what you are comparing, since 2.60.1 does not support the GTX 1080?

This scene renders in 0.67s on an AMD Radeon RX 480 here, with latest master.

Forgot to mention that I tested 2.60.1 in OpenCL mode on the GTX 1080, CUDA didn’t work. The time for v2.78 was in GPU Compute mode.

This scene renders in 0.67s on an AMD Radeon RX 480 here, with latest master.

That sounds more like the times I was expecting. I must be doing something wrong (probably picked the wrong device), will do another test with the latest master tomorrow.

I’ve done the test again, making sure I had the GTX 1080 enabled in both versions.

Still getting roughly the same results (timings exclude BVH building):

  • Cycles v2.60.1 (Oct 2011, from http://graphicall.org/641): 3.35 seconds for 200 samples (using OpenCL)

  • Cycles v2.78: 6.69 seconds for 200 samples (using GPU Compute) (twice as slow as the old version)
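For reference, the arithmetic behind "twice as slow", using the two timings above (the variable names are just for illustration). Keep in mind the two runs used different backends, so this is only the raw ratio.

```python
# Timings quoted above: seconds for 200 samples in each version.
old_seconds = 3.35   # v2.60.1, OpenCL
new_seconds = 6.69   # v2.78, GPU Compute
samples = 200

ratio = new_seconds / old_seconds
old_ms_per_sample = old_seconds / samples * 1000
new_ms_per_sample = new_seconds / samples * 1000

print(f"slowdown factor: {ratio:.2f}x")
print(f"v2.60.1: {old_ms_per_sample:.2f} ms per sample")
print(f"v2.78:   {new_ms_per_sample:.2f} ms per sample")
```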



test scene

If you had a GTX 1080 it would not have worked with the first build of Cycles. Back then only sm_1 and sm_2 were supported, not sm_5.

EDIT: just saw it was OpenCL, not CUDA. You will need to use OpenCL in both versions for a direct comparison, not CUDA vs OpenCL. Not to mention, OpenCL right at the start was literally just raytracing & AO.

I will test with OpenCL in both next week, although I would assume using OpenCL on a 1080 will actually be even slower than CUDA.

Hopefully someone with a Fermi card (GTX 480 or 580) can test the scene in both versions using CUDA for a better comparison.