HELP > Rendertimes > multi-GPU-testsetup & frustrating GPU/CPU-Load

Dear blenderartists community!

I’ve got some issues with Blender’s predicted render times. Generally speaking, when it comes to CUDA rendering, you might think: more cores > less time. But in my case the predicted render time never corresponds to the actual render time. The actual time always turns out to be five to six times higher than the prediction (like: prediction 12 mins > actual 1:02 hrs).

I noticed that at the beginning of a render the GPU(s) are under really heavy load (like both at 99-100%), but at some point during the render Cycles drops that load, and from then on Blender is no longer able to meet the predicted render time at all. It feels like it takes forever > like: "Remaining: 00:03:20hrs" will actually be fifteen to twenty minutes remaining.
At the start of the render process a big part of the image (many render buckets) is finished quickly. From that breaking point on, an equivalent part of the image takes veeeeeery long to render, where before it took a fraction of that time, and the GPU load even falls to 0% (see images below). The CPU keeps rendering instead.

Generally, I tried several tile sizes to find the “best” one to go with (64x64, 128x128, 256x256) and found that 128x128 gives my system the best trade-off between CPU and GPU. I haven’t yet tried any rectangular tiles like 128x256 or 256x512px though.
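
For reference, here is a rough sketch of how many tiles each of these sizes produces at the 1920x1200px resolution from the render settings in this post. The helper name tile_count is made up for illustration, and it assumes the image is simply cut into a regular grid with partial tiles at the right and bottom edges.

```python
import math


def tile_count(width, height, tile_w, tile_h):
    """Number of tiles needed to cover the image, counting partial edge tiles."""
    return math.ceil(width / tile_w) * math.ceil(height / tile_h)


# Resolution taken from the render settings in this post.
WIDTH, HEIGHT = 1920, 1200

# Tile sizes mentioned above: three tested ones plus the two untested rectangular ones.
for tile_w, tile_h in [(64, 64), (128, 128), (256, 256), (128, 256), (256, 512)]:
    tiles = tile_count(WIDTH, HEIGHT, tile_w, tile_h)
    print(f"{tile_w}x{tile_h}: {tiles} tiles")
```

Fewer, larger tiles mean fewer chunks of work to hand out across the devices, which may matter near the end of a render when only a handful of tiles are left.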

I set up a little test to compare render times across different GPU combinations:

Setup_1) Titan Xp (actual rendertime: 01:07:08:85hrs)
Setup_2) Titan Xp & Geforce GTX 980Ti (rendertime: 01:02:22:44hrs)
Setup_3) Titan Xp & Quadro P4000 (rendertime: 01:01:12:26hrs)

What I realized after running the tests on the following scene was that the render time didn’t scale with the number of GPUs/CUDA cores I put into the system. Compared to Setup_1 (Titan Xp alone), the actual render time barely changed with Setup_2/3, which is not the improvement I was expecting. > If it’s about CUDA cores only, then why is the combo with a P4000 a little faster than the one with the 980Ti?
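
To put numbers on that, here is a small sketch that compares the measured speedup with the increase in total CUDA cores. It uses the three render times above and the core counts from the GPU list in this post. It assumes the last field of each time is hundredths of a second; parse_render_time is just an illustrative helper name.

```python
def parse_render_time(text):
    """Convert 'HH:MM:SS:FF' to seconds, assuming FF is hundredths of a second."""
    hours, minutes, seconds, hundredths = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds + hundredths / 100


# (label, measured render time, total CUDA cores), all taken from this post.
setups = [
    ("Setup_1 Titan Xp", "01:07:08:85", 3840),
    ("Setup_2 Titan Xp + 980Ti", "01:02:22:44", 3840 + 2816),
    ("Setup_3 Titan Xp + P4000", "01:01:12:26", 3840 + 1792),
]

baseline_time = parse_render_time(setups[0][1])
baseline_cores = setups[0][2]

for label, time_text, cores in setups:
    seconds = parse_render_time(time_text)
    measured_speedup = baseline_time / seconds  # how much faster than Setup_1
    core_ratio = cores / baseline_cores         # how many more CUDA cores than Setup_1
    print(f"{label}: {seconds:.2f}s, speedup {measured_speedup:.2f}x, "
          f"core ratio {core_ratio:.2f}x")
```

With these inputs the second GPU adds roughly 47% to 73% more cores but less than 10% speedup, so raw core count alone clearly does not explain the render times.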

Render-settings:
2048 samples
TileSize: 128x128px
1920x1200px
64passes (like half-GI?)
AO

and all the other jazz you can find in the image below:

If it’s useful, I’m running on the following system:

  • ASUS ROG Strix x99 MoBo
  • i7 5960X nonOC (Dark Rock Pro / Dual-Fan-Cooling)
  • 64GB (4x 16GB) Corsair Vengeance LPX DDR4-3000 DIMM (CL15-17-17-35)

GPUs:

  • Titan_Xp (3840 CUDA Pascal 12GB) [PCIe-Slot1_x16] (FoundersEd. stock-cooler)
  • EVGA Geforce GTX 980Ti ACX 2.0 SC+ (2816 CUDA Mxwl - 6GB) (PCIe-Slot2_x16)
  • Nvidia Quadro P4000 (1792 CUDA – Pascal 8GB) [PCIe-Slot2_x16]

Nvidia WHQL-Driver: v432.00

Problems that there possibly might be:

  • Is it a user/render-settings issue?
  • Might it have to do with cooling/heat problems? > the system is actually a “non-water-cooled build”
  • Pascal vs. Maxwell-Arch. > not working well together?
  • Driver issues? > For running Geforce/Titan cards together with Quadro cards I only know of one driver combination, WHQL v432.00, so it’s the only one I have installed, and I haven’t yet had the time to try other driver setups.
  • PBR-Materials? > stretching rendertimes

Do you know of any bottlenecks, or do you guys see any fundamental render-setting errors in this setup?

Help will be much appreciated!!!
Thanks in advance,
maveric

anyone, please help.

Try to approach it more systematically.
Create a test render, not too complicated but complex enough to show differences.
Then run it with different numbers of cores, CPU only, GPU only, etc.
Note the render times and the CPU/GPU temperatures.
Put it all into a spreadsheet to get a good overview.
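
If you would rather script the spreadsheet part, something like this minimal sketch writes the results as CSV, which any spreadsheet program can open. The column names and the write_results helper are made up for illustration. The render times are the three from the first post converted to seconds (assuming the last field is hundredths of a second), and the temperature columns are left empty because no temperatures were recorded yet.

```python
import csv
import io

# Hypothetical column layout for the test log.
FIELDS = ["setup", "render_time_s", "cpu_temp_c", "gpu_temp_c"]


def write_results(rows, fileobj):
    """Write one CSV row per test run, with a header line first."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Render times from the first post; temperatures still to be measured.
rows = [
    {"setup": "Titan Xp", "render_time_s": 4028.85, "cpu_temp_c": "", "gpu_temp_c": ""},
    {"setup": "Titan Xp + 980Ti", "render_time_s": 3742.44, "cpu_temp_c": "", "gpu_temp_c": ""},
    {"setup": "Titan Xp + P4000", "render_time_s": 3672.26, "cpu_temp_c": "", "gpu_temp_c": ""},
]

# Written to an in-memory buffer here; pass an open file instead to save it to disk.
buffer = io.StringIO()
write_results(rows, buffer)
print(buffer.getvalue())
```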

You actually have no problem whatsoever.
The predicted render times have no real meaning because the engine does not know what load will come with future tiles (in order to know, it would have to render a prepass, which Cycles doesn’t do).
There simply is nothing wrong on your end.
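
To illustrate why such an estimate can be so far off, here is a toy sketch of a naive linear extrapolation. This is not Cycles’ actual estimator, just the general idea of guessing the remaining time from the average cost of the tiles finished so far. The tile costs are hypothetical numbers: cheap tiles first, expensive ones later.

```python
def naive_remaining(elapsed, tiles_done, tiles_total):
    """Linear guess: assume every remaining tile costs the average of the finished ones."""
    return elapsed / tiles_done * (tiles_total - tiles_done)


# Hypothetical per-tile costs in seconds: 50 cheap tiles, then 50 expensive ones.
tile_costs = [1.0] * 50 + [10.0] * 50

elapsed = 0.0
for done, cost in enumerate(tile_costs, start=1):
    elapsed += cost
    if done in (25, 50, 75):
        predicted = naive_remaining(elapsed, done, len(tile_costs))
        actual = sum(tile_costs[done:])  # true cost of the tiles not yet rendered
        print(f"after {done} tiles: predicted {predicted:.0f}s left, actually {actual:.0f}s left")
```

In this made-up case the guess after 25 tiles is 75s while 525s of work are actually left, simply because the estimator cannot see that the heavy tiles are still to come.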

As for the utilisation part:

You are rendering cubes against empty space. That’s why the load varies.

Just build a cube around it as a test and see the load rise.
But still, the rendertime estimation is nothing but a guess and can be ignored.

For the hardware itself: Not sure why you have a Quadro, but if you were to sell it (and the 980Ti) you could most likely get a single card with 11 or 12GB of memory, which would be as fast as both together and produce way less heat.

THANKS for your hint! > I’m currently thinking about getting a Titan RTX in exchange for all the GPUs.

Big thanks for your replies, ideas and help @anaho & @mahol!!!