less recompile for split kernel

Hi,
The biggest problem with the OpenCL/split kernel right now is its frequent and long compile times. This patch is meant to reduce the number of conditional features and nodes to a minimum, while keeping performance at the same level. Master has 4 node groups, 4 node features, and 13 kernel features, which gives 4 * 2^17 = 524,288, i.e. more than half a million possible combinations. That is already a great improvement over 2.69, which had 64 possible closure numbers on top of that :smiley:
Anyway, this patch reduces this to a point where you should rarely see a recompile.
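
To make the combination count for master concrete, here is a minimal sketch of the arithmetic. It assumes each node feature and kernel feature is an independent on/off flag and that exactly one of the node groups is selected per build; the function and constant names are made up for illustration and are not taken from the Cycles source.

```python
# Figures quoted in the post for master.
NODE_GROUPS = 4       # number of node groups
NODE_FEATURES = 4     # node features, assumed to be on/off flags
KERNEL_FEATURES = 13  # kernel features, assumed to be on/off flags


def kernel_variants(node_groups, node_features, kernel_features):
    """Upper bound on distinct kernel builds (hypothetical helper).

    Assumes every feature is an independent boolean and one node
    group is picked per build.
    """
    return node_groups * 2 ** (node_features + kernel_features)


master_variants = kernel_variants(NODE_GROUPS, NODE_FEATURES, KERNEL_FEATURES)
print(master_variants)  # 4 * 2**17
```

Every feature moved into all kernels removes one factor of 2 from this bound, which is why compiling a handful of features unconditionally cuts the number of possible recompiles so sharply.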

Transparent shadows, the principled shader, denoising, camera motion blur, and 2 bump-related node features are now in all kernels, so you can use them smoothly in your work.
The number of node groups was reduced to 2, which means that adding nodes to your tree should trigger at most one recompile.
You can test a build of current master with fewer recompiles and faster rendering for the split kernel here:
File-Upload.net - lesscompile.7z
Here is the impact on different scenes from the benchmark pack, with some tweaks to compensate for the very slight slowdowns caused by more features being compiled into all the kernels:


The patch is here: https://developer.blender.org/D2939. Note that it’s very much a WIP; review will start later if you are happy with the changes.

If losing some performance is acceptable to you, I could reduce the number of recompiles even more.

It’s hard to tell from your bar graphs, but it looks like a maximum performance drop of 5%? I’d be more than willing to lose that by default if it meant compiles happened less often. Many of my renders take 10 minutes with a compile of about a minute. That equates to 10% perf/time lost.
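
As a rough check on that figure, here is a small sketch using the numbers from this post (a 10-minute render and a compile of about 1 minute). The helper name is hypothetical, and the compile time is only an approximation.

```python
def compile_overhead(render_minutes, compile_minutes):
    """Return compile time relative to render time and as a share of
    the total wall-clock time (hypothetical helper)."""
    relative_to_render = compile_minutes / render_minutes
    share_of_total = compile_minutes / (render_minutes + compile_minutes)
    return relative_to_render, share_of_total


# Numbers from the post: 10-minute render, roughly 1-minute compile.
relative, share = compile_overhead(10.0, 1.0)
print(f"{relative:.1%} on top of the render, {share:.1%} of total time")
```

Either way of counting puts the compile cost above the roughly 5% worst-case slowdown read off the bar graphs.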

The current build in post 1 is faster than master actually. If enough people ask for a smoother workflow, I’ll do a new build.

Tested on NVIDIA GPUs (GTX 1060 & qm5000) and Intel CPUs (2x X5650).

After the first compile it felt smoother (sorry, I didn’t record times to compare with the official build).

I also ran into a couple of issues:

  • denoising using both GPUs always crashes (the larger the tile, the sooner the crash)
  • the Intel X5650 CPU (OpenCL 1.2) produces artifacts with transparency/alpha/shadow catcher (no OpenCL issues in Lux, RPR, Indigo…)

I’m pretty sure those bugs are also in master. Could you run the same test with a buildbot build? If they show up there too, please report them on the tracker at developer.blender.org. If they only happen with my build, I’ll have a look, but I didn’t touch the denoising code at all in this case; I only made it available in all kernels.