Hi, I am currently working on my thesis on 3D render farms and doing a lot of research in every direction. This came up during my research into threads and their speedup impact.
When it comes to threads, Blender follows Amdahl's Law pretty closely. The first 35% of the threads deliver something like an 80-85% boost; the last 30% of the threads only boost your render by 8-10%.
So if you have a CPU that natively supports 32 threads, the last 15 threads add comparatively little to your render speedup.
That means that for everyone who works and renders on the same machine, decreasing the thread count by 1-2 will not hurt your render and still lets you work fluently through your day (depending on your logical processors).
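For reference, Amdahl's Law predicts exactly this kind of diminishing return. A minimal Python sketch; the parallel fraction p = 0.98 is an illustrative assumption, not a measured value:

```python
# Amdahl's Law: speedup(n) = 1 / ((1 - p) + p / n),
# where p is the parallel fraction of the workload and n is the thread count.

def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Theoretical speedup for a given parallel fraction and thread count."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / threads)

p = 0.98  # assumed parallel fraction, for illustration only
for n in (1, 8, 16, 24, 32):
    print(f"{n:2d} threads -> {amdahl_speedup(p, n):.2f}x")
```

With these numbers, going from 1 to 16 threads gains more than 11x, while doubling again from 16 to 32 threads only adds about 7.4x more: the last half of the threads buys noticeably less than the first half.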
Ah yes, I ran a series of tests on different machines: an HP Compaq 8300 Elite Ultra-slim Desktop, an HP Z420 Workstation and an HP Z840 Workstation, with an i5, an i7 and an E5 respectively.
I let them crunch through a bunch of scenes and settings. The individual results differed, but the graphs had the same shape.
The Z420 of course showed a lot better resolution, because of its 32 natively possible threads.
I'm fairly sure your observation has very little to do with Amdahl's Law and almost everything to do with the fact that hardware threads don't necessarily map 1:1 to actual hardware resources.
Many Intel CPUs employ Hyper-Threading to pull in maybe 10% extra performance by duplicating some parts of the CPU pipeline. AMD's Bulldozer "modules" share one FPU between two "cores". In either case, the actual floating-point throughput stays the same whether you use the extra threads or not.
You will find that for something like path tracing, the serial part of the program is vanishingly small, and it scales extremely well to multiple "actual" CPU cores, with memory pressure being the next bottleneck in line.
More generally: hyperthreading and the like add little to your render speedup. For maximum performance, you should use at least as many threads as you have "physical" or "real" cores; how many that is depends on the CPU.
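As a rough rule of thumb, that minimum can be sketched in Python. The SMT factor of 2 here is an assumption (true for typical Intel Hyper-Threading CPUs, not for every processor); for exact physical-core counts you would query the OS or a library like psutil:

```python
import os

def min_render_threads(logical_threads: int, smt_factor: int = 2) -> int:
    """Estimate the physical core count from the logical thread count,
    assuming a fixed SMT factor (2 is typical for Hyper-Threading).
    This is the minimum thread count you'd want for full FP throughput."""
    return max(1, logical_threads // smt_factor)

logical = os.cpu_count() or 1  # logical threads, hyperthreads included
print(f"logical: {logical}, use at least: {min_render_threads(logical)} threads")
```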
There are a limited number of single-threaded operations… mainly image loading, file saving, BVH building and compositing steps… usually only taking a percent or so of the actual render time.
As many others have pointed out, it is hyperthreading that does not help much with raytracing (last time I looked into it, it gave about a 5% speed boost), since raytracing as an algorithm is extremely well suited to multithreading.
If you want to test this: limit Blender to only 16 threads per instance, load two instances up, render an animation and record how long it takes (number of hours). Then set Blender to use 32 threads per instance, render, and record how long that takes. Finally, render off just one Blender instance with 16 threads. They should all take approximately the same time, maybe a difference of 5-10%, but nothing drastic.
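If you want to script that comparison, here is a sketch using Blender's command-line flags (`-b` for background rendering, `-t` for a fixed thread count, `-a` to render the full animation); the scene path is a placeholder:

```python
import subprocess
import time

def blender_cmd(blend_file: str, threads: int) -> list:
    # -b: run headless, -t: fixed thread count, -a: render the animation
    return ["blender", "-b", blend_file, "-t", str(threads), "-a"]

def timed_render(blend_file: str, threads: int) -> float:
    """Render the animation and return wall-clock seconds."""
    start = time.monotonic()
    subprocess.run(blender_cmd(blend_file, threads), check=True)
    return time.monotonic() - start

# Placeholder scene path; compare e.g. 16 vs 32 threads on a 16-core/32-thread CPU:
# t16 = timed_render("scene.blend", 16)
# t32 = timed_render("scene.blend", 32)
# print(f"16 threads: {t16:.0f}s, 32 threads: {t32:.0f}s")
```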
If you are doing a thesis on this, you may want to research hyperthreading and how it impacts different applications more deeply before making assumptions.
They have a dual-core ARM chip as well as a multicore RISC chip. The ARM chips are like what's in your phones/tablets… they are not x86, but still have decent performance and functionality… the RISC cores are used for very specialised calculations… not sure how much that would improve performance, or whether it would be worthwhile porting Cycles over to it.
You can't use CUDA cores to render Blender Internal, right? (Wouldn't that be really, really fast?)
Nope… BI is not written in CUDA… It is not a matter of just running it through an interpreter… it would need a rewrite to make it possible… and then people would get frustrated with the code spaghetti that BI is, say "let's just start over with a clean architecture and a modern code base, expandable for the future", and you would end up with something like Cycles from 2 years ago.
My current scene renders on CPU with hyperthreading in 6:51:09, and with hyperthreading disabled in the BIOS in 9:13:51. I calculate that as: no hyperthreading is 35% slower than with hyperthreading, or, reversing the calculation, hyperthreading is 26% faster than no hyperthreading.
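Converting those timings to seconds and redoing the percentages confirms the numbers (taking the hyperthreading-off timing as 9 h 13 min 51 s, since a seconds field above 59 is not a valid time):

```python
def to_seconds(h: int, m: int, s: int) -> int:
    return h * 3600 + m * 60 + s

with_ht = to_seconds(6, 51, 9)      # 24669 s
without_ht = to_seconds(9, 13, 51)  # 33231 s, reading the posted :91 as :51

pct_slower = (without_ht / with_ht - 1) * 100   # no HT relative to HT
pct_faster = (1 - with_ht / without_ht) * 100   # HT relative to no HT
print(f"no HT is {pct_slower:.0f}% slower; HT is {pct_faster:.0f}% faster")
# -> no HT is 35% slower; HT is 26% faster
```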
Memory bandwidth is only an issue when both of the paired hyperthreads request access to memory at the same time. Since each tile takes a different amount of time, these collisions are reduced.
While 8 actual cores would definitely be faster than 4 hyperthreaded cores, there is still a significant boost from using hyperthreading.