Thanks to another thread asking about combined GPU+CPU rendering; it prompted me to do some testing to see if I could reproduce the issue where our member didn’t see any performance improvement when mixing CPU+GPU.
It seems there is a GPU tile performance drop.
I expected GPU tile settings to no longer make much difference in performance, but it seems there is a regression, at least for OpenCL in 2.8.
Can anyone else test and post their GPU-only results for various tile settings?
So if you can, run BMW 2.0 (the two-car version) and post your OS / Blender version / render path (OpenCL or CUDA). If you’re not sure which path you’re on, the sketch below will print it.
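A minimal sketch, assuming Blender 2.80’s Python API; paste it into Blender’s Python console to get the version, OS and compute device type from the Cycles add-on preferences:

```python
import bpy
import platform

# Cycles add-on preferences hold the compute device setting
prefs = bpy.context.preferences.addons['cycles'].preferences

print("Blender:", bpy.app.version_string)
print("OS:", platform.platform())
print("Render path:", prefs.compute_device_type)  # 'NONE', 'CUDA' or 'OPENCL'
```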
UPDATE: Definitely an issue with OpenCL.
Either Blender 2.8 or the AMD drivers … still investigating.
The screenshot below shows single then dual GPU usage. Usage is high with a single GPU and lower with two GPUs; usage drops by nearly half for each device. Also confirmed via GPU-Z.
Current test setup: Windows 10 / Blender 2.8 / OpenCL
Please check above: it “SHOULD” work, but it doesn’t, indicating a potential bug in Blender. Hence the request for others to test and report whether this is a Blender 2.8 issue or an OpenCL-only issue.
As you can see in my table, I tried various tile settings, and 32x32 is worse than just my two Vegas at 128x128, which should NOT be the case.
The Vegas at 32x32 take over a minute, while at 128x128 they take just 26 seconds. Adding the CPU to the mix is worse at all settings.
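For reference, this is roughly how the hybrid setup is enabled; a minimal sketch assuming Blender 2.80’s Python API, equivalent to ticking the devices under Preferences > System > Cycles Render Devices:

```python
import bpy

# Cycles add-on preferences (Preferences > System > Cycles Render Devices)
prefs = bpy.context.preferences.addons['cycles'].preferences
prefs.compute_device_type = 'OPENCL'   # or 'CUDA'
prefs.get_devices()                    # refresh the device list

for dev in prefs.devices:
    # enable every OpenCL GPU plus the CPU for the hybrid test;
    # for the GPU-only runs, drop 'CPU' from this set
    dev.use = dev.type in {'OPENCL', 'CPU'}
    print(dev.name, dev.type, "enabled" if dev.use else "disabled")

# the scene itself must also be set to GPU compute
bpy.context.scene.cycles.device = 'GPU'
```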
29 seconds is a good result. You can’t measure performance with light scenes. If you want to measure performance, use heavy scenes and take long renders, like 5, 10 or 30 minutes, because Cycles first calculates nodes, vertices, etc. on the CPU and only after that begins rendering.
I can get roughly 2x faster renders with Blender 2.80, but I can’t measure those values with light scenes.
I just gave the BMW benchmark some quick runs.
5960X at 4.2GHz CPU benchmark ~3 1/2 minutes (I didn’t keep notes on that)
GTX 1070 256x256: 1’ 48"
128x128: 1’ 33"
64x64: 1’ 29"
32x32: 1’ 28"
16x16: 1’ 29"
CPU+GPU 32x32: 1’ 10"
Thanks. So you have the expected results: actual render improvements on the GPU with smaller and smaller tiles, and better still when the CPU is added.
OK, I’ll try downgrading my AMD drivers and see if that is the cause. I suspect it is, as the new RX 5700 XT had OpenCL issues with the last drivers and now they’ve “patched it”.
I have tested a couple of demo scenes with a 3700X and Radeon VII and compared them with the results mentioned below:
They were a little faster in my tests, though very close.
But regarding hybrid rendering, it doesn’t seem to work in every scene, especially with OpenCL.
My results:
BMW
GPU: 74s
GPU+CPU: 72s
CPU: 173s
Classroom
GPU: 80s
GPU+CPU: 122s
CPU: 255s
Pavillon
GPU: 194s
GPU+CPU: 127s
CPU: 270s
Maybe we should report this issue. I am not totally sure.
You cannot measure render performance with short render times like these, because Cycles needs preparation time before it starts rendering, and those values will always mislead you. For example, if you get 72 seconds, maybe 30 seconds were used for preparation; on another render maybe 40 seconds were, who knows? These preparation times are variable and mostly use the CPU. You must use heavy scenes and take long renders for true results (minimum 5, 10, 20 minutes, etc.).
Can’t you see the estimated preparation time? It’s certainly not more than 4-5 seconds for the demo scenes. I understand your concern, but I got similar results in some of my own projects as well.
First, short render times still give plenty of information and permit many retests. So the BMW two-car scene is sufficient, as I have already confirmed that dual GPU is broken.
Now the main thing is to test at various tile settings.
For my part, I tested as follows:
Started a first run but ignored the initial result, as the kernel required compilation time. All subsequent renders (while Blender is not restarted) do not recompile the kernel.
Single and multi GPU at 256x256, 128x128, 64x64, 32x32 and 16x16 tile settings.
Expectation was that render times should be similar.
Reality: noticeable slowdowns.
This has a direct impact on the CPU+GPU render mix.
So you can FULLY get sufficient measures of render performance with these scenes.
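If anyone wants to repeat the same sweep, here is a minimal sketch assuming Blender 2.80’s Python API, doing the throw-away kernel-compile render first and then timing each tile size:

```python
import bpy
import time

scene = bpy.context.scene
scene.cycles.device = 'GPU'

# throw-away render so kernel compilation doesn't skew the first measurement
bpy.ops.render.render()

for size in (256, 128, 64, 32, 16):
    scene.render.tile_x = size
    scene.render.tile_y = size
    start = time.time()
    bpy.ops.render.render()
    print("%dx%d tiles: %.1f s" % (size, size, time.time() - start))
```

Run it in background mode against the BMW file, e.g. blender -b <bmw_scene>.blend -P tile_sweep.py (both file names are just placeholders).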
If I get more confirmation I’ll be raising a ticket this week.
So this is a good example of correct scaling, as Birdnamnam had (using CUDA), vs what I’m seeing (using OpenCL).
@egementncr, can you test just BMW at the 5 settings below on your Radeon VII? GPU-only testing. I just need confirmation of whether you get worse and worse scaling like me, or whether you have correct scaling.
This does not show bad scaling. These are two different graphics cards: different architectures, different capabilities, different chipsets. What is Vega’s 16x16 texture processing capability? Or what is the CPU and GPU data bus capability? Vega uses HBM2, GeForce uses GDDR5 graphics memory. They are different architectures, not the same.
A few months back, the scaling on my Vega was nearly identical to the GTX 1070 listed above.
Blender’s internal testing showed Vega on par with the GTX 1080
You are not providing anything of value to the post and are stating incorrect facts, which potentially indicates a troll. If you are unsure about what you are writing, I recommend you do not write it, or else I’ll flag you to the admin.
2.79 builds after the “C” release have had CPU+GPU rendering available since late 2017, and it worked perfectly, well before 2.8 was released.
And the 2.8 Alpha/Beta, when I tested it, also scaled perfectly.
This indicates that some optimizations were done “recently” in Blender 2.8, where a bug was introduced that is causing significant performance REGRESSIONS for OpenCL compute rendering.