Blender Edit Mode Performance

Not to be a pessimist, but every so often we get a few weeks where everyone in this thread suddenly goes crazy because some big changes seem about to finally happen. Something improves a lot, but the next time I come back we’re still comparing with 2.79 :disappointed:

Indeed, it is a rendering thing, so the modeling team is not looking into it.

I still have a bit of work to do, but here is a teaser:

Apart from rendering, I am also working on vertex creasing (patch in review), and face holes (as seen in the video) which are faces that are deleted after the subdivision is done.

As for performance, it is a bit faster than the CPU for the first 4 levels of subdivision and slower after that; however, there are some easy improvements that can be made.

Man, keep pushing and bring GPU OpenSubdiv back to Blender :+1:

Yeah, since 2.80 was a huge regression in that regard. And even after this project is done it’s still going to be like that. Remember, at the start of the project the devs had no expectation that it would make a difference for more complex scenes, e.g. with modifiers. And then there is undo…
Only after all those regressions are resolved will we be able to finally clap wholeheartedly without thinking “2.79 was still better at X” and start the real fun: comparing performance with other DCCs :grinning:

Hey, so happy to see someone is working on it!
A bit worried by your statements though:

As for performance, it is a bit faster than the CPU for the first 4 levels of subdivision and slower after that; however, there are some easy improvements that can be made.

That doesn’t sound very good… In Blender 2.79, OpenSubdiv on the GPU is about 5 times faster than on the CPU when using armature modifiers (rigged characters). Will it reach similar performance, or is this totally out of scope?

Don’t worry, it is just the beginning :slight_smile:. It took me a while to get there. For better performance, I need to revisit some implementation details, which I have to do anyway for the code to be acceptable for inclusion in a release. The big one is that I am somewhat forced to recreate the subdivision data from scratch every time, otherwise there is no update in edit mode. So better detection of what was modified during edits, with some caching, will definitely improve performance.
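To make the caching idea above concrete, here is a minimal sketch. It is not code from the branch: `SubdivCache`, `topology_key`, and `build_count` are hypothetical names, and `_build_tables` is only a stand-in for the expensive step of creating subdivision data. The assumption is that connectivity changes far less often than vertex positions while editing, so the expensive data is rebuilt only when a hash of the topology changes.

```python
import hashlib
from typing import List, Optional, Sequence, Tuple

Face = Tuple[int, ...]
Vec3 = Tuple[float, float, float]


def topology_key(num_verts: int, faces: Sequence[Face]) -> str:
    """Hash only the connectivity, never the vertex positions."""
    h = hashlib.sha1()
    h.update(str(num_verts).encode())
    for face in faces:
        h.update(b"|" + ",".join(map(str, face)).encode())
    return h.hexdigest()


class SubdivCache:
    """Hypothetical cache: rebuild topology data only when connectivity changes."""

    def __init__(self) -> None:
        self._key: Optional[str] = None
        self._tables: Optional[dict] = None
        self.build_count = 0  # number of expensive rebuilds so far

    def _build_tables(self, num_verts: int, faces: Sequence[Face]) -> dict:
        # Stand-in for the expensive step (assumption, not the real API).
        self.build_count += 1
        return {"num_verts": num_verts, "num_faces": len(faces)}

    def update(self, positions: List[Vec3], faces: Sequence[Face]):
        key = topology_key(len(positions), faces)
        if key != self._key:
            # Topology changed (or first call): rebuild from scratch.
            self._tables = self._build_tables(len(positions), faces)
            self._key = key
        # Positions are refreshed on every edit; the tables are reused.
        return self._tables, list(positions)


quad = [(0, 1, 2, 3)]
pts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
cache = SubdivCache()
cache.update(pts, quad)
pts[0] = (0.0, 0.0, 0.5)  # grab a vertex: topology is unchanged
cache.update(pts, quad)
print(cache.build_count)  # 1
```

Hashing every face on each update is itself linear in mesh size, so a real implementation would more likely rely on dirty flags set by the editing tools; the hash just keeps the sketch self-contained.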

I also intend to publish my branch once I have solved some of the rendering and performance issues, and maybe open a thread here for testing and feedback. Stay tuned.

Why does it get slower after 4 levels? I’m interested in any technical details.

Thanks for these details! Looking forward to testing it, congrats on all the work done so far. You have the full support of the community, some of us have been waiting eagerly for this feature for many years :slight_smile:
OSD performance is currently an important bottleneck compared to other software. Best.

I guess it is slower even before 4 levels, but still fast enough to be real time.

The main reason, I think (apart from poor caching), is that there seems to be too much data transfer to the GPU. OpenSubdiv asks us to allocate a GPU buffer big enough for the coarse and subdivided mesh data, which is OK, but a similar buffer is also allocated on the host side and copied entirely to the GPU. Ideally, we should allocate only a buffer big enough for the coarse mesh on the host and copy that to the device (but still allocate a buffer large enough for coarse + refined on the GPU).
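A rough back-of-the-envelope sketch of why that matters, under stated assumptions: the mesh is all quads, the device buffer holds the coarse vertices plus the vertices of every refined level, and each vertex is three 32-bit floats (12 bytes). The function names are made up for illustration and are not OpenSubdiv or Blender API.

```python
def refine_counts(verts: int, edges: int, faces: int):
    """Vertex/edge/face counts after one Catmull-Clark step on an all-quad mesh."""
    # One new vertex per old vertex, edge and face; each edge splits in two and
    # each quad adds four inner edges; each quad becomes four quads.
    return verts + edges + faces, 2 * edges + 4 * faces, 4 * faces


def upload_sizes(verts: int, edges: int, faces: int, levels: int,
                 bytes_per_vert: int = 12):
    """Return (coarse_only_bytes, coarse_plus_refined_bytes) for position data."""
    total_verts = verts
    v, e, f = verts, edges, faces
    for _ in range(levels):
        v, e, f = refine_counts(v, e, f)
        total_verts += v  # assumption: every level's vertices live in the buffer
    return verts * bytes_per_vert, total_verts * bytes_per_vert


# Default cube (8 verts, 12 edges, 6 faces) at 2 subdivision levels.
coarse, full = upload_sizes(8, 12, 6, levels=2)
print(coarse, full)  # 96 1584
```

Under these assumptions, copying the whole host-side buffer moves roughly 16 times more data than copying only the coarse vertices at level 2, and the gap widens with each additional level.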

The main issue is that OpenSubdiv gets quite slow once you get to a detailed asset that may also be deformed by an armature or by other means. We shouldn’t settle for things feeling ‘interactive enough’; the ultimate goal should be deformation and editing of a complex mesh at 60 frames per second (even if we have to attach custom subdivision code to Blender’s OSD implementation).

I know the devs can make it happen if they choose, let’s get it done.

For how many polys? Does real time mean 24 fps?

I didn’t say that the current performance was the final one. I was just keeping people in the loop on the current status of the code. So please don’t draw any conclusions (I know you mean well).

Poly count doesn’t really matter at this moment. And yes, real time means 24 fps here.

The devs are starting to multithread edit mode now, with two commits that each deliver small but noticeable speedups (on top of the previous ones).
https://developer.blender.org/rBe4c6da29b2297cbf331bb3ac891959dbcc00ee73
https://developer.blender.org/rB6e999e08ab87712a9baca33ab691af2b83762b7e

The second commit only applies to meshes with a single material, but we have plenty of ways to mask now for single-material workflows.


EDIT: Just committed (by Campbell)
https://developer.blender.org/rBd8b8b4d7e297b5dceddeba3a60e71e13372484da

It should be noted that this is a 6.5x speedup for a specific step in the code, the actual user-visible speedup is probably going to be a bit less.
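The gap between a per-step speedup and what the user sees is the usual Amdahl's law arithmetic. A small sketch follows; the fraction of frame time spent in the optimized step is not stated in the thread, so the values used below are purely hypothetical.

```python
def overall_speedup(fraction: float, step_speedup: float) -> float:
    """Amdahl's law: whole-frame speedup when only `fraction` of the time is sped up."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return 1.0 / ((1.0 - fraction) + fraction / step_speedup)


# Hypothetical shares of frame time taken by the step that got 6.5x faster.
for share in (0.25, 0.5, 0.9):
    print(f"{share:.0%} of frame time -> {overall_speedup(share, 6.5):.2f}x overall")
```

If the step accounted for half of the frame time, for example, the frame as a whole would only get about 1.7x faster.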

Some performance results with my own build of the latest master…
2.79b, 2.93.0: blender release download
master: build of commit 20ece8736f160442bc545bd0e1b822c05ee184de

Linux Mint 20
Intel i7-6700K 4.00GHz x 4
NVidia Geforce RTX 2060 SUPER
input: Wacom graphics tablet

Default cube subdiv’d in edit mode with 100 cuts then 2 cuts:
550,856 verts
1,101,708 tris

edit mode / vertex edit

config and results (fps)

| test | verts selected | proportional | operation | 2.79b (fps) | 2.93.0 (fps) | master (fps) |
|---|---|---|---|---|---|---|
| #1 | 1 | off | grab | 4.55 | 4.55 | 9.70 |
| #2 | 13,771* | off | grab | 4.55 | 4.55 | 8.45 |
| #3 | 1 | on** | grab | 4.30 | 4.35 | 8.15 |
| #4 | 13,771* | on** | grab | 4.10*** | 4.35 | 5.25 |
| #5 | 1 | on** | rotate | 3.90 | 4.00 | 7.25 |

*select random: ratio=0.025 (2.93, master) or 2.4929% (2.79b), seed=1

**size=1.0, falloff shape=smooth

*** 2.79b stalled at start of grab for about 30 seconds, then settled to 4.1fps


Somewhat surprised; I thought newer Blender was still playing catch-up with 2.79 in this area?

More optimizations in threading, but this time the users who benefit the most are those still on CPUs with few cores (i.e. quad cores or otherwise older and/or low-end processors). The more cores you have, the less likely you are to notice the boost.
https://developer.blender.org/rB2330cec2c6a7632459c21f51723497e349a042bf
https://developer.blender.org/rB0eb9351296dbed5e7ac10ca56132d5e51e5f388d

Nice tests, maybe next time try the same operations on a more irregular high poly count mesh, something like the dragon that keeps popping up on this thread, or a displaced large terrain, or something else, to check normal calculation speedups.

Considering the cube example, I can subdivide the default cube to the point where the mesh itself takes gigabytes of RAM (yes, just for the cube) and it looks solid black in the viewport from the vertex drawing. In current builds Blender can easily handle the selection of one vertex and (slowly) moving said vertex in response to user input.

It is a huge step up from 2.80’s edit mode (where, with a reasonably complex mesh, you could not even get it to run without having to kill the Blender process). Combined with Kevin’s work, the big weak points of Blender will shift to particle/hair editing and UV mapping (with the former on target to be rewritten anyway and the latter seeing improvements now through GSoC).

What I was looking for is not how many verts you can put on the default cube (they all still face only 1 of 6 possible normal directions), but rather a mesh with an irregular shape (like the dragon in the previous posts, etc.) where pretty much every vertex has a different normal direction. The idea was to test how much the optimizations related to normal calculations help, like these:

https://developer.blender.org/rBc2fa36999ff25ce2d011971a460d7efa11705e57
https://developer.blender.org/rB496045fc30f72be8d2ca32394ed233266f043152

Now, this IS getting interesting!

I just tested the latest 3 alpha build, and compared the same model with a wing part that consists of 1.3 million faces.

In 2.93 moving a few thousand faces around: 1.41 fps
In 3 alpha: a whopping 3.77fps!

I compare this with C4D v23: 1.5fps

Wow. Just wow!

Same object, same selection, in Blender 2.79: 0.39fps

That is a huge difference - and outperforms Blender 2.79 by a factor of ten in this case.

I am so familiar and used to 2.8/9 now, that I had forgotten about 2.79’s absolutely abysmal edit mode performance across the board. Even simply orbiting the view is a test of patience.

Also, selecting the entire object in edit mode and moving all 1.3 million faces:

Blender 2.79: 0.39fps
Blender 2.93: 1.38fps
Cinema4D v23: 1 frame per ~11 seconds
Blender 3 alpha: 2.40fps

Anyway, I was hoping for a two-fold performance increase with regard to raw mesh editing, and the numbers show more than a three times performance increase for smaller edits (3.3 times faster on my machine!).

For large-scale mesh editing, performance is still 1.7 times faster!
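For anyone who wants to redo the arithmetic, here is a small sketch that computes the ratios directly from the fps figures posted above. The dictionary layout and the `ratio` helper are just for illustration; the numbers are the ones quoted in this post.

```python
# fps figures as posted above (1.3 million face wing part)
fps = {
    "a few thousand faces": {"2.79": 0.39, "2.93": 1.41, "3 alpha": 3.77},
    "all 1.3M faces": {"2.79": 0.39, "2.93": 1.38, "3 alpha": 2.40},
}


def ratio(test: str, new: str, old: str) -> float:
    """How many times faster `new` is than `old` for the given test."""
    return fps[test][new] / fps[test][old]


for test in fps:
    for old in ("2.93", "2.79"):
        print(f"{test}: 3 alpha vs {old}: {ratio(test, '3 alpha', old):.2f}x")
```

From these figures the small-edit ratio against 2.93 comes out at about 2.67x and the full-mesh ratio at about 1.74x, so the 3.3x quoted above may come from a different run than the 1.41 and 3.77 fps values.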

Interestingly enough selecting and moving a single face reduced performance a tiny bit to around 3.2fps in v3 alpha.

I’d say that performance in edit mode is already very much improved!!!

My system specs: AMD 3900, 64 GB, GTX 1080

I wonder if the performance would improve further with a newer 3080 card. Still trying to get my hands on one :frowning:

PS I can’t explain why some users insist that 2.79 has better mesh editing performance compared to the current 2.93 version. It is simply not true, it seems.

I agree with @tomjk - this seems more like a myth. Or perhaps a different test file was used?

Anyway, things are looking up - for me the current improvement in edit mode performance is already more than I had hoped for.
