Advice for a test OpenSubdiv integration

Hi Blender folks,

I was thinking of experimenting with Blender coding over the holidays and could use advice on the best path. I’ve been a tools engineer at Pixar for 16 years but have never worked in Blender. I have it building and running on my Mac laptop, which was nice and easy. The goal would be a first prototype OpenSubdiv integration, nothing too fancy, just getting the OpenSubdiv code running.

One option would be to write a new subdivision modifier or add an option to the existing one. Likely this would be CPU only, so not high performance, but useful. Would folks recommend extending the existing modifier or writing a new one? Any experts in that area with advice?

Another option would be GPU-accelerated drawing. This is likely harder, but with more payoff. It looks like Cycles has subd drawing already; I know Brecht has done some prototyping there. There is also the game engine code. Is there another site with OpenGL drawing code for non-Cycles drawing? Any pointers to the best place to start?



Hi Dirk, I think there has been some work done to integrate OpenSubdiv into Blender.

Best to ask Brecht van Lommel what the state of this is (if I recall correctly he had something working already). Brecht sometimes works on stuff privately before releasing; if so, maybe now is a good time to move development into the open to allow collaboration.


I did a quick integration in Cycles with just the CPU code, and I’ve pushed that code to our main repository now behind some #ifdefs.

But more interesting, of course, is using OpenSubdiv in the subdivision surface modifier in Blender itself, because there the GPU acceleration is actually important; for Cycles it’s mostly useful to get consistent render results with creases and such. Sergey Sharybin actually did some work on this, plugging the OpenSubdiv CPU code into the modifier, but it’s very early work. I have a patch from him from July 28th, which I updated slightly to apply to the latest Git master:

I’m not sure if this is the very latest version of that code, I’ll point him to this thread to find out.

Hello everyone,

I indeed do have a slightly updated patch with some fixes. The main change is that it was “kind of” moved into CCGSubSurf, which means the same code will be used for the multires modifier as well.

Please gimme a few hours to rebase that patch on top of current Git, and then I’ll also publish it.

Fantastic, thanks Brecht, Sergey, and Campbell! It’s encouraging that so much heavy lifting has happened already. I’m a newbie at Blender so it’ll take a bit to get up to speed. It sounds like the best thing is to work with Sergey on the modifier code. Go Blender.

Published latest changes:

Beware: it is highly WIP code!

Hi Dirk, good to see you’ve got the Blender bug. Can you clarify what you mean about OpenGL drawing code? If you’re looking for an OpenGL rendering engine to use, I started on one a while back that I’d be willing to let be used for Blender’s advancement. It’s far from feature complete (halfway through trying to implement physically based lights and GLSL shader models), but I’ve been more focused on a new idea about mixing voxel DAG path tracing with OpenSubdiv Catmull-Clark patches for final frame rendering and shading eval. If you check out my YouTube page I have a video from a couple of months back of dev on the engine, called NexusGL (code name only at the mo): .

Let me know if that’s what you need. Also, I mailed you about a month back on your YouTube page about some ideas I’m working on, with questions regarding OpenSubdiv. Could you have a look? Cheers, J


@Blurymind, don’t know whether to take that as a positive or a negative. Not a plug: I started the engine with my own financial goals in mind (I’m poor), but considering the engine was to be 90% open for free, with certain ties for commercial releases (say 5-10% profit on projects), and with a Blender custom build as the level and dev tool for end users, I started thinking maybe I should just make it free for Blender use. Now that Mantle is being released, it just means I have to move there if I want to really build something for the future (no one can ignore moving from 5-10K draw calls to 90-100K; it also means I can get in at the start as an indie before the big studio guys take it all). Blender’s game engine is redundant, which is half the reason I started the project. Now it looks like, for PC platforms at least, OpenGL and DX are just holding things back. Unless I get my wish with further work and Mantle can interop with OpenGL (don’t care about DX), then things get even more interesting.

Pretty sure he meant a site within Blender’s source where OpenGL drawing is implemented.

3DLuve > way to shamelessly plug your engine :smiley:

dirkPixar > We are extremely happy to hear that work is being done towards getting OpenSubdiv into b3d!
Thank you!!! This is truly a gift to the artist community.

Where can I donate? :smiley: I was waiting for this for so long.

Any updates?

I’ve learned a lot about Xcode, cmake, git, and patching diffs in the last few days. Sergey’s code is just about compiling, hopefully later in the week I’ll have something more intelligent to say. My first Mac development :slight_smile:

Newbie question: Is it possible for the subsurf modifier to do the subdivision step in CUDA or OpenGL compute and pass the result to the imaging code? As Brecht mentioned, that would be the biggest improvement for users. The approach to development would be different if we thought the best solution here was multithreaded C++ refinement that passed the results down to game engine drawing code, versus a solution where OpenSubdiv compute GPU buffers could be passed directly. How does game engine drawing code receive the output of modifiers?

For Presto (our internal animation system) we did a first OpenSubdiv integration that did CUDA refinement to subdivide N times and drew the result as a mesh of quads in OpenGL. Later we implemented patch drawing with tessellation shaders as a second pass, combined with a composable GLSL system written by Takahito. We could follow the same two-step deployment here if it’s easier.


You can use CUDA and OpenGL from the SS modifier in theory, but I would not recommend doing this, mainly because object evaluation is going to be multithreaded in the master branch pretty soon, and accessing CUDA/OpenGL from multiple threads is not something you actually want.

The way to go would be to tweak CCGDerivedMesh in a way that it only performs CPU-side subdivision (or copies stuff from GPU to CPU) if there are more modifiers on top of SS, and to change its draw callbacks so they use the GPU for subdivision. This fits into Blender’s evaluation and drawing pipeline pretty well.
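Roughly this kind of dispatch, just to sketch the idea (the names below are made up for illustration, not actual Blender API):

```cpp
// Hypothetical sketch of the dispatch described above: the derived mesh
// only pays for a CPU-side subdivision when a later modifier actually
// needs the refined geometry; pure drawing stays on the GPU.
enum class SubsurfPath { GpuDraw, CpuRefine };

// 'modifiersAfterSubsurf' - are there modifiers stacked on top of SS?
// 'gpuAvailable'          - can we run the OpenSubdiv GPU kernels?
SubsurfPath choose_subsurf_path(bool modifiersAfterSubsurf, bool gpuAvailable)
{
    // Later modifiers consume CPU-side vertex data, so we must refine
    // on the CPU (or copy the GPU result back) in that case.
    if (modifiersAfterSubsurf || !gpuAvailable)
        return SubsurfPath::CpuRefine;
    // Otherwise keep everything on the GPU and subdivide in the draw
    // callbacks, as suggested for CCGDerivedMesh.
    return SubsurfPath::GpuDraw;
}
```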

There’s one tricky thing about this though: CCGDerivedMesh is also used for sculpting, and there it’s basically the SS-ed mesh with a displacement map applied on top of it. This might be tricky to support with GPU-side SS code.

If you don’t feel like looking into this code, you might work on the following:

  • Support non-closed manifolds in OpenSubdiv, which wasn’t supported in OSD at the time of SIGGRAPH this year, and without which it’s rather unacceptable to replace the old code in Blender.
  • Support UV subdivision (perhaps this could be done on the Blender side, at least for CPU-side evaluation; no idea about the GPU side yet).
  • Loose edge support. If I’m not mistaken they’re also not supported, and that would be an annoying regression.


After a bit of CMake wrestling I’m running your code, Sergey. A few thoughts:

It looks like we’re calling openSubdiv_createEvaluationDescr frequently with the current modifier. The creation of the hbr/far meshes, vertex buffers, etc. is pretty expensive. For deforming meshes I imagine the subsurf modifier “fires” late and things like deformers fire earlier. Are there easy ways for modifiers to construct heavy caches, like the topology-based ones, infrequently, and in the common case just refine deformed point positions?

I was thinking of using some new classes in OpenSubdiv/osdutil, in particular topology/mesh/refiner/uniformEvaluator in:

In our deformers we would do something like create a uniformEvaluator in a cached computation and, per frame, push in coarse mesh points and pull out refined point positions. There it’s important to have static topology to avoid reconstructing OpenSubdiv objects.

Ideally this would be factored so that the hbr etc mesh construction happens once, and the
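In pseudo-C++ the pattern would look roughly like this (stand-in types for illustration, not the real osdutil classes):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch of the cache-then-refine pattern: the expensive
// topology setup happens once, and per-frame evaluation only moves
// point data. These are stand-in types, not the real OsdUtil classes.
struct CachedEvaluator {
    std::size_t topologyHash = 0;  // identifies the coarse topology
    bool built = false;            // heavy hbr/far structures ready?

    // Expensive: construct refinement tables for this topology.
    void build(std::size_t hash) { topologyHash = hash; built = true; }

    // Cheap, per frame: refine deformed coarse point positions.
    // (Real refinement weights the coarse points by subdivision
    // stencils; here we just copy to keep the sketch self-contained.)
    std::vector<float> refine(const std::vector<float> &coarsePoints) const {
        return coarsePoints;
    }
};

// Rebuild only when the topology changed; otherwise reuse the cache.
void ensure_built(CachedEvaluator &ev, std::size_t hash) {
    if (!ev.built || ev.topologyHash != hash)
        ev.build(hash);
}
```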

Dirk, just a heads-up, and you may already be aware, but #blendercoders on IRC is a much more efficient way to get input and establish a good back-and-forth with the Blender developers than BlenderArtists is.

Also, it looks like some of your previous post got cut off there.

It is possible to cache stuff for sure. In fact we already have something like this in the current CCGSubsurf implementation. But you would need to be careful adding caches; they might easily become a PITA with the upcoming multi-threaded dependency graph.

Also, I would design the new subsurf modifier in a way that makes it easy to add caching in the future, but mainly focus on getting the core stuff working first.

And one more thing. Communication in the artists’ forum is not really efficient; we don’t monitor it much. You’d better drop into our ML or IRC room.

Thanks guys. I have time over the holiday break to dig into this and have a changed implementation working. I’ll check with the developers on the mailing list or in the IRC room to get detailed comments and a review of the first-pass diff. The changes are to use the new osdutil API to simplify the Blender-side code, and to use tessFaces on the derived mesh in the modifier. It does not yet cache as Sergey mentioned, so it is still super slow. Next steps are to implement topology-based caching like in CCGSubsurf and multithreaded evaluation for speed, then to see if I can get UV subdivision working.

However, I’m starting to think that the modifier stack isn’t the best place for subdivision to happen. No matter how fast I can make the OpenSubdiv part, we still need to construct a new DerivedMesh on each modifier call, which is not ideal for speed. In Pixar’s Presto animation system, each model’s base mesh is posed by ~2,000 deformers, then in the imaging system we use CUDA to do the subdivision step on the GPU as part of the draw. I think the equivalent place would be in source/blender/blenkernel/intern/cdderivedmesh.c, right where the glDrawArrays calls are made. For Blender we’d probably want to use OpenCL or OpenGL compute for best Mac compatibility (I’m doing Blender development on a MacBook Pro), and fall back to multithreaded C++ compute if those aren’t available.
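That fallback chain could be expressed as a simple priority pick; again, the names here are illustrative only, not real Blender or OpenSubdiv API:

```cpp
#include <string>

// Illustrative only: pick the best available compute backend for the
// subdivision step at draw time, falling back to multithreaded CPU
// evaluation when no GPU compute path is available.
std::string pick_subdiv_backend(bool hasOpenCL, bool hasGLCompute)
{
    if (hasOpenCL)    return "opencl";       // best Mac compatibility
    if (hasGLCompute) return "glcompute";    // OpenGL compute shaders
    return "cpu-threads";                    // always-available fallback
}
```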

Hope to see you in IRC and the mailing list, would love to watch you guys tear this problem apart! It would be nice to keep this thread updated as well though, obviously we can’t all be monitoring IRC all of the time.

Can’t the new be used? It seems like a perfect place to open a new task and have the diff + comments together…

EDIT: Oh, there is one already… :