Intel's new Haswell chips: Optimizing Blender itself to become ever more important

Because I stumbled on this.

I can’t help but think that the days of speeding up Blender via Moore’s law may be drawing to a close. That could always change, given that process sizes will keep shrinking a bit more over the next several years, but unless Intel shifts its focus toward more cores and faster transistor, chip, and material designs rather than a more powerful onboard GPU, there may not be much hope for a major new leap in speed until technologies like graphene transistors, carbon nanotubes, and optical computing get off the ground and into products. Granted, this is just initial information and it hasn’t been thoroughly tested yet, but I wouldn’t be surprised if Intel’s figures aren’t far off the mark.

So it looks to me like the primary speed increases in software over the next few years will depend more on optimization techniques than on the number of transistors on the die, especially given that Haswell is another major architectural change (the ‘Tock’ part of Intel’s release cycle) yet the expected speed increase is a relatively small 10 percent or maybe a little more. Right now the big lever for speeding up software seems to be newer, faster instruction sets, and optimizing Blender that way will mean slowly dropping older sets over time, much like Ton’s decision to drop the original SSE sets.
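To make the instruction-set point concrete, here’s a minimal sketch (not actual Blender code) of what targeting a newer set can look like in C++: the same array addition with an AVX path guarded by a compile-time check, plus a plain scalar path for builds without it. The function name and the assumption that n is a multiple of 8 are just for illustration.

```cpp
// Minimal sketch: the same vector add written for an AVX-enabled build
// versus a plain scalar build. Raising the minimum supported instruction
// set means more code can take the fast path unconditionally.
#include <cstddef>

#if defined(__AVX__)
  #include <immintrin.h>
#endif

// Adds two float arrays of length n (n assumed to be a multiple of 8 for brevity).
void add_arrays(const float *a, const float *b, float *out, std::size_t n)
{
#if defined(__AVX__)
    // AVX path: process 8 floats per iteration.
    for (std::size_t i = 0; i < n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }
#else
    // Scalar path for builds without AVX.
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
#endif
}
```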

Thoughts?

I agree about using more instructions, but not about dropping the old ones so easily. There should be a fallback mode for the older ones (a rough sketch of what that could look like follows the list below).

  • Add support for newer instructions.
  • Optimize for both the AMD and Intel instruction sets.
  • Add OpenMP and OpenCL support to every tool and simulation where possible.
  • Use OpenCL to run on the GPU as an option and/or a fallback mode.
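As promised above, here is a rough sketch of the fallback idea using GCC/Clang-specific builtins and attributes. It is not how Blender’s build is actually organized, just one way a “fallback mode” can work: build the same kernel twice and pick the fastest supported variant at runtime.

```cpp
// Rough sketch of a runtime fallback (GCC/Clang only): the same kernel is
// compiled twice and the best supported variant is chosen once at startup.
#include <cstddef>

// Scalar baseline: runs on every x86-64 CPU.
static void scale_scalar(float *data, std::size_t n, float s)
{
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= s;
    }
}

// Same loop, but the target attribute lets the compiler emit AVX2 for this function.
__attribute__((target("avx2")))
static void scale_avx2(float *data, std::size_t n, float s)
{
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= s;
    }
}

using ScaleFn = void (*)(float *, std::size_t, float);

// Runtime dispatch: newer CPUs take the AVX2 path, older ones fall back.
ScaleFn select_scale_kernel()
{
    if (__builtin_cpu_supports("avx2")) {
        return scale_avx2;
    }
    return scale_scalar;
}
```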

Maybe in a year or two a group of sponsored developers could do it. If Ton agreed, we could start a donation drive for a “Run Blender Run!” project.

Single-core performance has been increasing only marginally for years; that’s not really news. Intel also offers 10-core server chips, with 15-core parts on the horizon, they just don’t make a consumer product like that.
Only some areas of Blender can be parallelized, but more importantly, Blender still runs most of its tasks in a blocking manner, which can make it a pain to work with.
The problem is that concurrency is hard, especially in C/C++. There’s a reason a lot of major applications suffer from this issue.
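As a generic illustration of the blocking point (plain standard C++, not Blender’s own job system), a long task can be launched on another thread and polled so the caller stays responsive; the simulate_frames function here is just a stand-in for a bake or simulation step.

```cpp
// Generic sketch: run a long task off the calling thread and poll it
// instead of blocking on the result.
#include <chrono>
#include <future>
#include <iostream>
#include <thread>

// Hypothetical long-running task standing in for a bake or simulation step.
int simulate_frames(int count)
{
    std::this_thread::sleep_for(std::chrono::milliseconds(50 * count));
    return count;
}

int main()
{
    // std::launch::async forces a separate thread rather than deferred execution.
    std::future<int> job = std::async(std::launch::async, simulate_frames, 10);

    // The caller keeps working: poll instead of blocking on the result.
    while (job.wait_for(std::chrono::milliseconds(16)) != std::future_status::ready) {
        std::cout << "UI still responsive...\n";
    }
    std::cout << "Simulated " << job.get() << " frames\n";
    return 0;
}
```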
As far as newer SSE/AVX instructions go, unless the compiler can use them automatically, they’re probably not worth the effort. They aren’t useful in that many cases anyway.
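On the “compiler can automatically use them” case, a plain loop with no intrinsics is often enough, provided the build enables the newer set. The flags in the comment are just an example invocation, not Blender’s actual build settings.

```cpp
// Sketch of the "let the compiler do it" case: a plain loop that GCC/Clang
// can auto-vectorize when a newer instruction set is enabled, e.g.
//   g++ -O3 -mavx2 loop.cpp
#include <cstddef>

void saturate(float *pixels, std::size_t n, float limit)
{
    // No intrinsics, no pragmas: with optimization and AVX enabled, the
    // compiler is free to emit 256-bit loads/stores for this loop.
    for (std::size_t i = 0; i < n; ++i) {
        pixels[i] = pixels[i] > limit ? limit : pixels[i];
    }
}
```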

And I’ve heard it helps a lot with a few things. Rendering, simulation, rigging and animation speed will all increase (sure, maybe not to the marketing figures of 2-2.5 times, but still), and the newer AVX2 instructions will increase speed even more. And this is just AVX; maybe other instructions will be useful as well.
Code that uses OpenMP typically gains around 30-50% in speed, unless the Blender devs hand-tune everything perfectly, which is doubtful since some tasks right now don’t use all the cores to the max. Then we have OpenCL, which has made some software around 10-30% faster.
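For reference, this is roughly what the OpenMP case looks like in C++ (a generic sketch, not taken from Blender’s sources): one pragma spreads an independent per-element loop across all cores.

```cpp
// Rough OpenMP sketch: parallelize an independent per-pixel loop.
// Build with e.g.  g++ -O2 -fopenmp omp_demo.cpp
#include <cstddef>
#include <vector>

void brighten(std::vector<float> &pixels, float gain)
{
    // Each iteration is independent, so OpenMP can split the range across threads.
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < static_cast<std::ptrdiff_t>(pixels.size()); ++i) {
        pixels[i] *= gain;
    }
}
```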
And what about automatic handling of instructions by compilers? How can they use something if it’s not in the code? All compilers support new instructions sooner or later. GCC, for example, generally has support for new instructions well before Intel/AMD bring their chips to market. Even Microsoft’s Visual Studio supports them relatively quickly.