OSL only runs on the CPU, so Brecht designed a second shading backend that also works on the GPU, because Cycles was intended to run on both CPU and GPU. This second backend got virtually all the attention during Cycles development, so the OSL backend was left in something of an orphaned state.
(Thomas Dinges)
So can this (“second backend…”) be used to script your own shaders for Cycles GPU rendering? What is it? Am I getting this wrong?
If you’re interested in writing shaders, you can use OSL. Of course, SVM was designed to be so flexible and modular that writing shaders is only really necessary for more advanced effects.
SVM, the Shader Virtual Machine, is what you’re using when you’re stringing nodes together in the node editor. Only SVM runs on both GPU and CPU; OSL is CPU-only. You can’t script your own SVM nodes.
My only interest in that “second” approach was in writing shaders, or more precisely procedural textures, that work for GPU rendering as well. Guess I’ll have to wait and practise OSL in the meantime.
Right now the major limitation of SVM and nodes is that you cannot express loops. Things that are easy in OSL, GLSL, or any other shader language are very hard, or next to impossible, with nodes and the SVM. Unwinding loops manually with nodes is a pain, and generally uses up all the SVM stack space for anything more than a few iterations.
Until the SVM gets loop constructs, I would consider it feature incomplete. It is unfortunate considering the speed boost you get from GPU for procedurals.
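To make the manual unrolling problem concrete, here is a rough Python sketch. It is not Cycles code: the opcode names (NOISE, SCALE, ADD), the sine function standing in for noise, and the rule that every node gets a fresh slot are all assumptions of this toy. A real node compiler may reuse slots, so treat the numbers as an illustration of the trend only.

```python
import math

def unroll_fbm(octaves):
    """Emit a flat, loop-free instruction list for a fractal sum.

    Hypothetical toy format: (opcode, dst_slot, src_slot, operand).
    Every intermediate value gets its own slot (assumption: no slot reuse).
    """
    program = []
    next_slot = 1                      # slot 0 holds the input coordinate
    acc = None
    for i in range(octaves):           # this loop runs at "graph build" time
        freq, amp = 2.0 ** i, 0.5 ** i
        n = next_slot; next_slot += 1
        program.append(("NOISE", n, 0, freq))   # n = noise(x * freq)
        s = next_slot; next_slot += 1
        program.append(("SCALE", s, n, amp))    # s = n * amp
        if acc is None:
            acc = s
        else:
            t = next_slot; next_slot += 1
            program.append(("ADD", t, acc, s))  # t = acc + s
            acc = t
    return program, acc, next_slot

def run(program, x, num_slots):
    """Execute the flat program; math.sin is a stand-in for a noise lookup."""
    stack = [0.0] * num_slots
    stack[0] = x
    for op, dst, a, b in program:
        if op == "NOISE":
            stack[dst] = math.sin(stack[a] * b)
        elif op == "SCALE":
            stack[dst] = stack[a] * b
        elif op == "ADD":
            stack[dst] = stack[a] + stack[b]    # here b is a slot index
    return stack

def fbm_loop(x, octaves):
    """The same computation written with an ordinary loop."""
    return sum(math.sin(x * 2.0 ** i) * 0.5 ** i for i in range(octaves))

program, out_slot, slots_used = unroll_fbm(8)
print(len(program), "instructions,", slots_used, "slots for 8 octaves")
```

In this toy, every extra iteration costs three more instructions and three more slots, whereas the looped version stays the same size no matter how many octaves you ask for.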
I’d say try implementing it yourself and see how it goes. GPUs don’t really “like” loops, especially if their size isn’t known at compile time. The compiler will try to unroll them, and that can cause your code to get quite big. I wouldn’t say loops are “easy” in GLSL either; you can easily run into some limits there as well.
Once you have your loop construct, what are you going to do with it? Chances are, you will want to run another SVM (sub)graph in it. So you have an SVM evaluation function (that the compiler will try to inline), with an interpreter loop (that it is going to try and unroll), which now can contain calls to another SVM evaluation function that contains another loop… you’re asking for the maximum amount of trouble. You’d also run into a lot of interpreter overhead and have high divergence, which is bad news for performance.
Maybe it’s actually not that bad on the newest (NVIDIA?) GPUs, but from what I read Cycles barely compiles even on CUDA right now.
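The nesting and divergence concerns above can be sketched with a toy interpreter in Python. Everything here is hypothetical: the LOOP opcode does not exist in SVM, the instruction format is invented, and the lockstep model (a group of lanes finishes only when its slowest lane does) is a simplified assumption about how GPU threads execute together, not a measurement.

```python
def svm_eval(program, x, counter):
    """Toy interpreter loop; a LOOP instruction re-enters svm_eval."""
    for instr in program:
        counter[0] += 1                  # count one interpreter dispatch
        op = instr[0]
        if op == "ADD":
            x = x + instr[1]
        elif op == "MUL":
            x = x * instr[1]
        elif op == "LOOP":               # hypothetical opcode
            _, count, body = instr
            for _ in range(count):
                x = svm_eval(body, x, counter)   # nested evaluation
        else:
            raise ValueError(op)
    return x

def dispatches(loop_count):
    """Number of instruction dispatches for one lane's shader run."""
    body = [("MUL", 0.5), ("ADD", 1.0)]
    program = [("ADD", 0.0), ("LOOP", loop_count, body)]
    counter = [0]
    svm_eval(program, 1.0, counter)
    return counter[0]

def lockstep_utilisation(loop_counts):
    """Fraction of useful work if all lanes wait for the slowest one."""
    work = [dispatches(n) for n in loop_counts]
    return sum(work) / (max(work) * len(work))

print(lockstep_utilisation([4, 4, 4, 4]))   # same count in every lane
print(lockstep_utilisation([1, 2, 8, 3]))   # counts differ per lane
```

When every lane runs the same number of iterations nothing is wasted in this model, but once the counts differ per pixel the lanes with short loops sit idle while the longest one finishes, on top of the dispatch overhead of the nested interpreter call.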