Intel Embree Path Tracer/Ray Tracer Source - Why Not Use It with Blender?

Hi guys, I’ve recently become very interested in path tracer/ray tracer setups (realtime and non-realtime) and came across the Intel Embree engine. It is open source, even for commercial projects, and after a few tests it is stupid fast even on the CPU.

Link to description and source code VS source code:

Does anyone else think this would be a good addition to Blender?

As far as I remember, Cycles already uses some BVH-related code ported over from Embree. Maybe in the future some more stuff can be re-used, but unfortunately render engines aren’t just “plug-and-play” :wink:

Ahhh, it just seems so quick compared to Cycles. I’m trying to come up with a new way to render with ray tracers: a pre-pass evaluates the scene’s shadow volumes and turns them into primitives, rays sample only the shadow edges, and the samples at the edge points are then interpolated to cover large surface areas with a gradient shadow. The hope is that only about 10% of the usual rays need to be calculated, but it’s still a work in progress.

It seems quick because it is not fully featured… Cycles has full HDRI lighting, texture support, motion blur, hair support (now), etc.

I’m not bashing Cycles; I love the work that’s been done. I’m just thinking another renderer for Blender can’t be a bad thing. As far as I know, the August release had HDRI, motion blur, depth of field and a physical camera model. I’m not sure about texture support, as I’ve only played with it for a couple of hours, but I have code I wrote a while back for texture support in my first raytracer that will work just fine if not. Variety is the spice of life.

If you want another raytracing engine to be integrated in Blender, I think SmallLuxGPU is the only logical candidate. It is incredibly fast, and development is going at an amazing pace right now. It is already designed to work well with Blender. And the best part: no CUDA! :smiley:
Embree would also be good, but in my opinion it is not as feature-complete as Cycles or SLG; it is just fast.

Depends if you’re on the giving or receiving end :wink: Some developers might disagree.
Besides that, the new Tesla K20 offers a lot of benefits with CUDA, like 32 MPI tasks, or that it can spawn its own new kernels, thus allowing recursion.

However, it’s AMD’s big chance to get a foot in the door of professional studios again if they play it smart. All they have to do is polish their OpenCL drivers and push the VRAM on their cards into double digits.

While the current Quadros leave the latest FirePro dead in the water in viewport performance, the GPGPU sector tells a different story.
For double precision, the professional cards from Nvidia win the race each and every time, but that’s only needed in scientific calculations we CG artists don’t care about.

In single precision for instance, in LuxMark 2.0, a Tesla K20 for 3300 Euro with 5GB VRAM scores 238 samples/sec.
A FirePro9000 for 3100 Euro with 6GB VRAM scores 1914 samples/sec.

Xeon Phi is interesting too, especially for developers. If you run it in offload mode, you just need a #pragma offload target(mic) directive and the compiler does the rest to feed the coprocessor.
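To give an idea of what the offload directive mentioned above looks like in practice, here is a minimal sketch. The function name `scale` and its parameters are made up for illustration, and the `inout(... : length(...))` clause reflects my understanding of Intel's offload extensions, so treat it as an assumption rather than a reference. Compilers that don't know the pragma simply ignore it and run the loop on the host.

```cpp
#include <cstddef>

// Hypothetical helper: multiply every element of `data` by `factor`.
// With Intel's compiler, the block after the pragma is meant to run on the
// coprocessor. The inout/length clause (assumed syntax) describes the buffer
// that has to be copied over and back. Other compilers skip the pragma.
void scale(float* data, std::size_t n, float factor) {
    #pragma offload target(mic) inout(data : length(n))
    {
        for (std::size_t i = 0; i < n; ++i) {
            data[i] *= factor;
        }
    }
}
```

So "the compiler does the rest" holds for the code generation, but for pointer data you would still expect to tell it how much memory to transfer.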

It remains to be seen what’ll happen in the near future. The problem currently is that the hardware manufacturers are cooking faster than the developers can eat.
One developer might bet on CUDA, another on OpenCL (which has already become as religious as Direct3D vs. OpenGL), spend one or two years developing something, and find out at the end that it was a waste of time because Xeon Phi is now the ring to rule them all. And while they’re reading up on that, the next manufacturer builds the AwesomeTrace9000 card…

No tech guru here, but the more logical option imo would be taking other pieces of code from Embree and adding them to Cycles where needed, as already done by Brecht btw. Chaos Group’s V-Ray has also gained speed boosts from using parts of it.

I can’t see any particular benefit in having another path tracer for Blender that wouldn’t be as well integrated as Cycles.

All things being equal, no. However, development resources are very limited, so actually it would be horrible, because it would take time away from stronger and more useful features. Especially this one, which is so similar to Cycles anyway.

I don’t know much about coding etc, but why does a program start running slower on a certain task if more features are added to it? It has puzzled me why newer versions of Cycles with more features are slower compared to older ones on the same simple scene…

Because the program has to check at runtime to see if the code is being used. Anything that’s not commented out in the code still has to be interpreted at some point.

Adding features to the Cycles code inevitably leads to added branches. Particularly on GPUs, branches are very, very slow, which might also explain (to some degree) why Cycles runs slower on GPUs if the hair code is added.
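A toy model of why branches hurt on GPUs, as a rough sketch only: threads run in lockstep groups, and if the threads in a group disagree on a branch, the group has to walk through both sides one after the other. The group size and the two costs below are made-up numbers, and `warp_cost` is a hypothetical helper, not anything taken from Cycles.

```cpp
#include <array>
#include <cstddef>

// Assumed numbers, for illustration only.
constexpr std::size_t kWarpSize = 32;  // threads that execute in lockstep
constexpr int kCostSurface = 10;       // made-up cost of the surface path
constexpr int kCostHair = 40;          // made-up cost of the hair path

// Cost for one lockstep group running `if (is_hair) hair(); else surface();`.
// Every path taken by at least one thread is executed by the whole group,
// with the threads that did not take it masked off and waiting.
int warp_cost(const std::array<bool, kWarpSize>& is_hair) {
    bool any_hair = false;
    bool any_surface = false;
    for (bool h : is_hair) {
        if (h) {
            any_hair = true;
        } else {
            any_surface = true;
        }
    }
    return (any_hair ? kCostHair : 0) + (any_surface ? kCostSurface : 0);
}
```

In this model a group where everyone hits a surface pays 10, but a single hair hit among the 32 threads raises the cost for the whole group to 50.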

Because the program has to check at runtime to see if the code is being used. Anything that’s not commented out in the code still has to be interpreted at some point.

That isn’t true. If a particular section of the code is never reached, it will never be executed. However, at runtime you sometimes have to check which section of the code to jump to. If that is done using branches, it may involve checking more and more conditions the more features you add. The alternative would be to pass the section itself as an argument, by using function pointers. However, function pointers are not generally available on GPUs (OpenCL C, for example, does not allow them), and on the CPU, function pointers are unlikely to be faster than branches.
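For anyone who wants to see the two options side by side, here is a minimal CPU-side sketch. The feature names and the `shade_*` functions are invented for the example and have nothing to do with the actual Cycles code.

```cpp
#include <cassert>

// Hypothetical features, for illustration only.
enum class Shader { Diffuse, Glossy, Hair };

float shade_diffuse(float x) { return x * 0.5f; }
float shade_glossy(float x)  { return x * x; }
float shade_hair(float x)    { return x + 1.0f; }

// Option 1: branches. Each new feature adds another case to test at runtime,
// but code for cases that are never selected is never executed.
float shade_branch(Shader s, float x) {
    switch (s) {
        case Shader::Diffuse: return shade_diffuse(x);
        case Shader::Glossy:  return shade_glossy(x);
        case Shader::Hair:    return shade_hair(x);
    }
    return 0.0f;  // not reached for valid enum values
}

// Option 2: function pointer. The caller passes the section to run, so there
// is nothing to test here, at the price of an indirect call.
using ShadeFn = float (*)(float);

float shade_pointer(ShadeFn fn, float x) {
    assert(fn != nullptr);
    return fn(x);
}
```

Calling `shade_branch(Shader::Glossy, 3.0f)` and `shade_pointer(shade_glossy, 3.0f)` gives the same result; the only difference is how the code to run gets selected.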