Thanks for your reply. The subdiv branch hasn’t been updated since June 19, so I’m wondering if Kévin is still on it. These optimizations have actually been announced since Blender 2.80 (July 2019), so I’m used to taking developers’ estimated dates with a grain of salt. It could be Blender 3.0, 3.1…
I tried it earlier, but it did not lead to a noticeable speedup in my test cases. And if UVs and shading are not supported yet, that makes it quite useless for real projects at the moment. Will wait for the completion…
I’m well aware of its tumultuous history. If you read my first post in this thread, it should be pretty obvious that I was very surprised to see Kevin work on it.
My questions were so I could set my expectations correctly. And I did.
There is a reason I just went for the positive route and decided to post a fun walk cycle and the improvements I noticed. It does not mean I am naïve about development or the history of Blender and OpenSubdiv. I just don’t feel like beating a dead horse and would rather keep this thread positive.
Am I surprised nothing happened for close to a month on this branch? Not really. I’m subscribed to this thread. I’ll get notified when something happens. Little else you can do.
I spent most of the last few weeks fixing various issues reported here and improving the implementation’s architecture. I’ll update the builds in the coming days, once some of the remaining draw issues in edit mode are fixed.
As for the branch, I am using various local branches to do the work, where I am free to edit the history, and use a rebase workflow. I tend to then bulk update the branch, although I haven’t pushed in a while. Development is active, don’t worry about it.
This surprises me btw. I can only think of two cases where this could be happening for you. Either the subdiv modifier was not last in the stack (it has to be) or your video card is actually not modern enough to support OSD GPU acceleration.
Both cases should be easy enough to check. I am curious about your findings.
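The first of those two checks can be sketched in a few lines of Python. This is only an illustration: `subsurf_is_last` is a hypothetical helper, not part of Blender, and it works on a plain list of modifier type names so it can be run outside Blender.

```python
def subsurf_is_last(modifier_types):
    """Return True if the last modifier in the stack is a Subdivision Surface.

    `modifier_types` is a list of modifier type identifiers, ordered from the
    top of the stack to the bottom (hypothetical input for this sketch).
    """
    return bool(modifier_types) and modifier_types[-1] == "SUBSURF"


# Hypothetical stacks, listed top to bottom as in the modifier panel.
print(subsurf_is_last(["ARMATURE", "SUBSURF"]))  # subdivision comes last
print(subsurf_is_last(["SUBSURF", "ARMATURE"]))  # armature comes last
```

Inside Blender, the list could be built from the active object with `[m.type for m in bpy.context.object.modifiers]`.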
The Subdiv modifier is last in the stack of course.
GTX 1080 (I would assume it to be modern enough)
File testing:
As said in a previous post, most files containing full rigged characters crashed on opening. Those that did not crash did not lead to any speed up unfortunately.
I’ve set up a new, simple dedicated test file. Basically, a simple subdivided box (11k tris) skinned to a plain chain of bones, with a Subdiv modifier at level 2. And there were interesting results this time.
Blender 2.93: 20 fps
Blender 3.0: 23 fps
Subdiv branch (Subdiv level 2): 36 fps
Subdiv branch (Subdiv level 4): 28 fps
Versus Blender 2.79:
No OSD: 17 fps
OSD (GLSL Compute, Subdiv level 2): +60 fps
OSD (GLSL Compute, Subdiv level 4): +60 fps
So this is definitely promising, but not as good as Blender 2.79 performance yet. Will wait and see.
Glad you got to see some improvement at least. Although I am curious what seems to be the limiting factor when it comes to framerate with your current rig(s). Is the armature/constraint setup so taxing that it is what limits the frame rate?
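To put the numbers above side by side, here is a small sketch that converts the reported frame rates into frame times and into a speedup relative to Blender 3.0. The helper names are made up for illustration, and the two 2.79 OSD entries are left out because they were only reported as “+60 fps” rather than as an exact figure.

```python
# Frame rates (fps) as reported in the post above.
results = {
    "Blender 2.79 (no OSD)": 17,
    "Blender 2.93": 20,
    "Blender 3.0": 23,
    "Subdiv branch (level 2)": 36,
    "Subdiv branch (level 4)": 28,
}


def frame_time_ms(fps):
    """Time spent on one frame, in milliseconds."""
    return 1000.0 / fps


def speedup(fps, baseline_fps):
    """How many times faster `fps` is than `baseline_fps`."""
    return fps / baseline_fps


baseline = results["Blender 3.0"]
for name, fps in results.items():
    print(f"{name}: {frame_time_ms(fps):.1f} ms/frame, "
          f"{speedup(fps, baseline):.2f}x vs Blender 3.0")
```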
I am also quite shocked how much more performant the 2.79 implementation seems to be. Of course there can be many reasons for it (it could not support the full feature set of OSD for example?) but the enormity of the gap really surprised me.
Kevin talked about one “obvious” performance step that still needed to be taken but if I’ve interpreted it correctly that had to do with edit mode only.
I would be curious to hear what Kevin thinks about this performance gap. Obviously the current implementation is WIP, but I did not have the impression there was room to “double” the performance. Perhaps I was wrong?
Note that I benchmarked it with a subdivided cube (~11k triangles) deformed by 4 bones, and it’s only 13 fps above the official Blender version.
Rigged characters may be a lot more complex (50k triangles or more), with more modifiers, hundreds of bones, shading/textures, shape keys and other objects in the scene decreasing performance. Benchmarking such a situation is still pending (currently it leads to instant crashes) and the subdiv branch is WIP. Just to say, don’t take it for granted yet.
Funnily enough, “Use Limit Surface” has nothing to do with optimization, it’s just an OpenSubdiv setting. It’s interesting that there is a difference at all. And I would expect it to be faster with this setting on, as OSD generates less patches in this mode (although there might be a difference between bicubic and bilinear patches).
Not sure what you mean with “less patches”, as far as I understand, this setting places vertices at the exact position on the surface, as if there were infinite levels of subdivision. This probably adds more calculations/precision in the loop, leading to decreased performance. You can see that lowering the Quality setting from 3 (default) to 1 already improves performance. I would assume that if Limit Surface is disabled, it is somewhat similar to Quality set to 0, skipping some of the math.
I think your interpretation of what limit surface does is correct.
If you want to get a close approximation of the rendered result (i.e. a “high” subdivision level) in the viewport turning on limit surface allows you to achieve this without having to raise the viewport level too much.
Obviously both turning on limit surface and raising the viewport level have a performance penalty. I don’t know how that balances out, and it’s probably subjective what a “pleasing” result would be.
I am guessing that when OpenSubdiv gets proper GPU acceleration, it’s probably more efficient to simply raise the viewport level instead of turning on limit surface. In my experience the difference between the limit surface and subdiv level 3 is really minute.
With that said this is just my interpretation based on experimentation and reading a bit about it here and there.
Disclaimer: I’m the one who added “Use limit surface”
Not sure what you mean with “less patches”, as far as I understand, this setting places vertices at the exact position on the surface, as if there were infinite levels of subdivision.
You are correct, and the limit surface is represented by a set of parametric, bicubic patches that Blender uses to calculate final vertex positions. What’s less known is that when limit surface is disabled, what Blender does is exactly the same, but instead of bicubic patches OSD returns bilinear, flat patches corresponding to polygons of the subdivided mesh.
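As a rough illustration of the two patch types mentioned here, this sketch evaluates a point on a bilinear patch (4 corner points) and on a uniform bicubic B-spline patch (4×4 control points). It is not OpenSubdiv’s code; the function names and the control grid are made up, and the uniform B-spline basis is assumed as the bicubic form.

```python
def bilinear(corners, u, v):
    """Evaluate a flat bilinear patch from its 4 corners (p00, p10, p01, p11)."""
    p00, p10, p01, p11 = corners
    return tuple(
        (1 - u) * (1 - v) * a + u * (1 - v) * b + (1 - u) * v * c + u * v * d
        for a, b, c, d in zip(p00, p10, p01, p11)
    )


def bspline_weights(t):
    """Uniform cubic B-spline basis weights at parameter t in [0, 1]."""
    return (
        (1 - t) ** 3 / 6,
        (3 * t**3 - 6 * t**2 + 4) / 6,
        (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6,
        t**3 / 6,
    )


def bicubic(grid, u, v):
    """Evaluate a bicubic B-spline patch; grid[j][i] is a 3D control point."""
    wu, wv = bspline_weights(u), bspline_weights(v)
    return tuple(
        sum(wv[j] * wu[i] * grid[j][i][k] for j in range(4) for i in range(4))
        for k in range(3)
    )


# Hypothetical flat 4x4 control grid lying in the z = 0 plane.
grid = [[(float(i), float(j), 0.0) for i in range(4)] for j in range(4)]
print(bicubic(grid, 0.5, 0.5))
print(bilinear(((0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)), 0.5, 0.5))
```

The bilinear case needs 4 weighted points per evaluation and the bicubic case 16, which is one reason the cost per patch can differ even when the patch count goes the other way.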
What I meant by less patches is that in limit surface mode you will have (roughly) one patch for every quad in the input mesh, but when this is disabled you will get 2^n patches per quad.
It’s not obvious which one should be faster, because all the difference is in opensubdiv code, not blender’s.
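The patch counts described above can be tabulated with a short sketch. It assumes uniform refinement of an all-quad mesh, where each level splits every quad into four, so a quad edge ends up with 2^n segments and the quad with 4^n sub-faces. The post quotes 2^n, so both figures are printed. The function names and the 1000-quad mesh are hypothetical.

```python
def limit_surface_patches(num_quads):
    """Roughly one bicubic patch per input quad, as described in the post."""
    return num_quads


def segments_per_edge(level):
    """Each refinement level splits every quad edge in two."""
    return 2 ** level


def refined_faces_per_quad(level):
    """Each refinement level splits every quad into four (assumption)."""
    return 4 ** level


num_quads = 1000  # hypothetical input mesh
for level in range(1, 5):
    flat = num_quads * refined_faces_per_quad(level)
    print(f"level {level}: {segments_per_edge(level)} segments per edge, "
          f"{flat} flat patches vs about {limit_surface_patches(num_quads)} bicubic patches")
```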
Alright, you know what you’re talking about then. Thanks for the details, interesting!
In this case it seems there is no direct correlation between the number of patches returned and performance. Maybe someone experienced in the OSD source code could explain this (I’ll pass).
Awesome Kevin! Thanks for the new build. Time to commence the testing. First experiments regretfully show that crashes upon opening files still occur. If I take the Settlers characters, all of them crash this build instantly.
I can open these files in the latest 3.0 alpha build without a problem, so it looks like it’s definitely related to the OpenSubdiv build. I will try to replicate it with a minimal setup so I can provide a proper report.
This appears to be a different crash than the ones I fixed. This one is in the dependency graph, while the others were in the GPU code. Let me try to quickly fix it and make new builds.
(Assuming you are talking about the Red Nelb walk cycle file you posted a few days ago.)