By rendering I mean both viewport visualization and final renders. Cycles requires a subdivision surface modifier to be the last in the stack for adaptive subdivision, which is why I mentioned it.
I think it’s pretty much a limitation of the OSD algorithms.
AFAIK it’s the same in Maya: GPU acceleration only works when subdiv is used as the last modifier (it’s a mesh setting in Maya).
Not sure about Max, I only have an old 2018 3dsMax version for testing, which doesn’t work great with OSD (only wireframe display).
EDIT: Same behavior in Max 2021: if GPU is enabled, only the OSD modifier result is displayed, without applying the modifiers that follow it in the stack. I guess once data is passed to the GPU, it’s not possible to send it back to the CPU (other modifiers in the stack are CPU only). This is why a subdivision setting at the mesh level is necessary, like Maya does.
Thanks a lot @KWD for implementing this!
Is this planned for Blender 3.0?
Yes (at least for the subdivision, the settings as part of the mesh datablock might be for a later version).
I can not wait to see it in master
I have waited for it 2 years since 2.80 ))
Kévin, I am glad you are working on it.
BMesh: use threading to count total selection.
During selections the total selection is refreshed at the end. This
process was done single threaded. This patch will do a range threaded
approach when the bmesh lookup tables are available.
Master: 0.043612s
Threaded: 0.017964s
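To make the "range threaded" idea concrete, here is a minimal Python sketch. It is not the BMesh code: the function names and the flat list of selection flags are assumptions made for illustration. The point is only the shape of the approach, where indexed access to the elements lets the index range be split into chunks that are counted independently and then summed. In CPython the GIL means this particular sketch will not reproduce the speedup quoted above.

```python
from concurrent.futures import ThreadPoolExecutor


def count_selected_range(flags, start, stop):
    """Count the selected elements in flags[start:stop]."""
    return sum(1 for i in range(start, stop) if flags[i])


def count_selected_threaded(flags, num_workers=4):
    """Split the index range into chunks, count each chunk in a worker, sum the results."""
    n = len(flags)
    if n == 0:
        return 0
    chunk = (n + num_workers - 1) // num_workers  # ceiling division
    ranges = [(start, min(start + chunk, n)) for start in range(0, n, chunk)]
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_counts = pool.map(lambda r: count_selected_range(flags, *r), ranges)
        return sum(partial_counts)


# Hypothetical selection flags: every third element is selected.
flags = [i % 3 == 0 for i in range(10)]
total = count_selected_threaded(flags)
```

Each worker only reads its own slice and returns a partial count, so no locking is needed until the final sum.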
They seem to be chasing even fractions of a second of performance.
Given the chance, I don’t think the devs are going to neglect performance.
Also, 2 new performance commits to master
DrawManager: Cache material offsets
Draw Cache: avoid recalculating ‘poly_normals’
To be fair, many of them are like this. There are not a lot of commits that got us more than a 1 FPS increase; most of them give something like 0.2-0.4 FPS. It’s just that sometimes it’s 0.2 FPS from 3.1 to 3.3, and sometimes it’s 0.2 FPS from 15.0 to 15.2.
Performance for this kind of operation is more important after the recent post on the developers blog. If you want interactivity then selection should feel instant.
The number given is in seconds, or rather milliseconds, not FPS, but I do get your point …
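A quick worked example of why the same 0.2 FPS gain means very different things at different frame rates, and how the commit's timings relate to that. This is just arithmetic on the numbers quoted above; the helper names are made up for the illustration.

```python
def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


def saved_ms(fps_before, fps_after):
    """Milliseconds saved per frame when going from fps_before to fps_after."""
    return frame_time_ms(fps_before) - frame_time_ms(fps_after)


# The same +0.2 FPS at two different starting points.
saved_at_low_fps = saved_ms(3.1, 3.3)     # roughly 19.5 ms per frame
saved_at_high_fps = saved_ms(15.0, 15.2)  # roughly 0.9 ms per frame

# The selection-count timings from the commit message, in seconds.
master_s, threaded_s = 0.043612, 0.017964
commit_saved_ms = (master_s - threaded_s) * 1000.0  # roughly 25.6 ms
commit_speedup = master_s / threaded_s              # roughly 2.4x
```

Comparing time saved per operation, rather than FPS deltas, keeps gains at different frame rates on the same scale.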
I still think it’d make a lot more sense to keep it in the modifier and simply display a warning if the modifier is not the last one, instead of going this schizophrenic way of cluttering the mesh datablock UI with one special snowflake modifier which users may or may not need.
Warnings in modifiers are not something new anyway.
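A minimal sketch of the check being suggested, assuming the modifier stack is available as an ordered list of type strings. The function name and the warning text are hypothetical, and this does not use Blender's Python API.

```python
def subdiv_gpu_warning(modifier_types):
    """Return a warning string if a subdivision modifier exists but is not last.

    modifier_types is an ordered list of modifier type names, first to last.
    Returns None when there is nothing to warn about.
    """
    if "SUBSURF" not in modifier_types:
        return None  # no subdivision modifier, nothing to warn about
    if modifier_types[-1] == "SUBSURF":
        return None  # subdivision is last, GPU evaluation would be possible
    # Hypothetical message text, for illustration only.
    return "Subdivision is not the last modifier; GPU subdivision is disabled"


ok_stack = ["MIRROR", "SUBSURF"]
bad_stack = ["SUBSURF", "ARMATURE"]
```

With this shape the setting stays on the modifier and the user only sees the warning when the stack order prevents acceleration.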
If cycles X moves adaptive subdivision out of experimental mode and streamlines how it works (like octane) then we won’t need to add a subdiv modifier to get adaptive subdivision.
Latest test :
400 000 poly model with subdiv modifier
Blender 2.75 beta with OpenSubdiv mode that works in edit mode and GLSL mode : Subdiv level 1 : 5.3 fps. Subdiv level 2 : 5 fps
Blender 3.0 alpha : Subdiv level 1 : 0.9 fps. Subdiv level 2 : 0.4 fps
I hope that Blender 3 performance will reach that of this old 2.75 beta build
I’d hope at least 25 fps/40ms frame time for a ~500k vertices model.
Hi all, I opened a thread to talk about GPU OpenSubdiv acceleration (and a couple extra features), if you want to try it out. I was forced to rewrite the implementation, so there is one expensive computation that still is done on the CPU, however it is cached and only reevaluated when the topology changes (e.g. adding or removing an edge loop) so only Edit Mode is really affected, Pose Mode should be fine.
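The caching behaviour described here (recompute only when the topology changes, not when vertices move) can be sketched roughly as follows. This is not the actual implementation: the class, the use of face index tuples as the topology key, and the stand-in computation are all assumptions for illustration.

```python
class TopologyCache:
    """Cache an expensive result, keyed on mesh topology only."""

    def __init__(self, compute):
        self._compute = compute  # expensive function of the topology
        self._key = None
        self._value = None
        self.evaluations = 0     # how many times compute actually ran

    def get(self, faces):
        # The key depends only on connectivity (vertex indices per face),
        # so moving vertices does not invalidate the cache.
        key = tuple(tuple(face) for face in faces)
        if key != self._key:
            self._value = self._compute(faces)
            self._key = key
            self.evaluations += 1
        return self._value


# Stand-in for the expensive CPU step: here it just counts face corners.
cache = TopologyCache(lambda faces: sum(len(face) for face in faces))

faces = [(0, 1, 2, 3)]
cache.get(faces)                  # first call: computed
cache.get(faces)                  # vertices moved, same topology: cached
faces = faces + [(2, 3, 4, 5)]    # topology change, e.g. a new face
cache.get(faces)                  # recomputed
```

This matches the remark that Pose Mode should be fine: deformation changes positions but leaves the key unchanged.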
Unlike in my previous post in this thread, the subdivision settings are back in the modifier. Putting them in the mesh datablock will be done later.
Great! Can we not mention subdiv ever again in this thread from now on?
None of the examples I’ve posted have been with a subdiv modifier active, and performance has still been showstopping levels of bad (comparing with Max that we use at work).
Who wrote that?
As this topic is actively being worked on, it makes a lot of sense to disentangle the different aspects which are the cause of the performance issues. This makes it a lot easier to understand what is going on, rather than having everything in one bucket and trying to guess the context.
Nevertheless, that doesn’t mean subdivision can’t be mentioned in this thread anymore. I am sure there are plenty of reasons why that would make sense. For the sake of clarity, it makes sense though to use the other thread when it is clearly about subdivision.
Depsgraph: support flushing parameters without a full COW update
Depsgraph: remove redundant mesh data duplication in edit-mode
Testing shows a significant overall speedup when transforming:
~1.5x with a subdivided cube 1.5 million vertices.
~3.0x with the spring mesh (edit-mode with modifiers disabled, duplicated 10x to drop performance).
Important to note for future patches:
Currently only mesh data blocks are supported,
other data-blocks can be added individually.
Armatures might be able to skip a full COW copy too.
Updated previous post with test data from latest improvements in master. Also added comparison between latest and previous tested master (rightmost column).
I think changing proportional radius size felt a bit snappier too, but didn’t test for it.
Thanks for testing in such detail.
The numbers on this patch are huge
https://developer.blender.org/D11599
During a mesh transformation in edit mode (Move, Rotate…), not the entire draw cache needs to be recreated.
But this occurs because DEG_id_tag_update(tc->obedit->data, ID_RECALC_GEOMETRY) chains a call to BKE_object_data_batch_cache_dirty_tag, which tags the entire batch cache to be redone.
This patch proposes not to tag everything as dirty, but only what participates in the deformation of the geometry.
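The difference between tagging everything and tagging only what deformation touches can be sketched with bit flags. The flag names and the cache class below are hypothetical and do not correspond to Blender's actual draw cache flags; the sketch only shows the selective-invalidation pattern.

```python
from enum import IntFlag


class BatchDirty(IntFlag):
    """Hypothetical dirty flags for parts of a draw cache."""
    NONE = 0
    POSITIONS = 1
    NORMALS = 2
    UVS = 4
    INDICES = 8
    ALL = 15


# Assumption: a transform in edit mode only changes positions and normals.
DEFORM = BatchDirty.POSITIONS | BatchDirty.NORMALS


class BatchCache:
    def __init__(self):
        self.dirty = BatchDirty.NONE
        self.rebuilt = []  # names of the parts rebuilt by the last update

    def tag(self, flags):
        """Mark parts of the cache as needing a rebuild."""
        self.dirty |= flags

    def update(self):
        """Rebuild only the parts that were tagged, then clear the tags."""
        self.rebuilt = []
        for part in (BatchDirty.POSITIONS, BatchDirty.NORMALS,
                     BatchDirty.UVS, BatchDirty.INDICES):
            if self.dirty & part:
                self.rebuilt.append(part.name)
        self.dirty = BatchDirty.NONE


cache = BatchCache()
cache.tag(DEFORM)   # instead of cache.tag(BatchDirty.ALL)
cache.update()
```

Tagging `BatchDirty.ALL` would rebuild all four parts on every transform step, which is the cost the patch is trying to avoid.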
Depsgraph Changes:
Currently, the graph can be compared to this simplified one:
Where DEG_id_tag_update(id, ID_RECALC_GEOMETRY) triggers geom_eval_mesh or geom_eval_obj_init.
And DEG_id_tag_update(id, ID_RECALC_SELECT) triggers update_select or update_select_obj.
This patch proposes to separate “tag_dirty” from “geom_eval” and create a separate node for it (update_all in the image below).
In addition, it adds an update_deform node.
The graph becomes something like this:
Benchmarking:
| | master: | patch: |
|---|---|---|
| large_mesh_editing: | Average: 16.727632 FPS | Average: 26.424897 FPS |
| | rdata 9ms iter 26ms (frame 60ms) | rdata 0ms iter 19ms (frame 38ms) |
| large_mesh_editing_ledge: | Average: 17.761902 FPS | Average: 28.070558 FPS |
| | rdata 9ms iter 24ms (frame 56ms) | rdata 0ms iter 18ms (frame 36ms) |
| looptris_test: | Average: 5.537827 FPS | Average: 5.456050 FPS |
| | rdata 11ms iter 26ms (frame 169ms) | rdata 11ms iter 28ms (frame 172ms) |
| subdiv_mesh_cage_and_final: | Average: 2.095824 FPS | Average: 2.140402 FPS |
| | rdata 7ms iter 21ms (frame 242ms) | rdata 0ms iter 20ms (frame 237ms) |
| | rdata 7ms iter 22ms (frame 233ms) | rdata 0ms iter 21ms (frame 227ms) |
| subdiv_mesh_final_only: | Average: 6.626541 FPS | Average: 7.974115 FPS |
| | rdata 3ms iter 13ms (frame 145ms) | rdata 0ms iter 10ms (frame 122ms) |
| subdiv_mesh_final_only_ledge: | Average: 6.590914 FPS | Average: 7.978224 FPS |
| | rdata 3ms iter 13ms (frame 143ms) | rdata 0ms iter 10ms (frame 121ms) |
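For reading the benchmark, the relative gain per test can be computed from the average FPS columns. The snippet below only restates the averages quoted above; the dictionary layout and helper name are made up for the illustration.

```python
# (master FPS, patch FPS) per benchmark, copied from the averages above.
results = {
    "large_mesh_editing": (16.727632, 26.424897),
    "large_mesh_editing_ledge": (17.761902, 28.070558),
    "looptris_test": (5.537827, 5.456050),
    "subdiv_mesh_cage_and_final": (2.095824, 2.140402),
    "subdiv_mesh_final_only": (6.626541, 7.974115),
    "subdiv_mesh_final_only_ledge": (6.590914, 7.978224),
}


def speedup(master_fps, patch_fps):
    """Ratio of patched to master frame rate; above 1.0 means faster."""
    return patch_fps / master_fps


speedups = {name: speedup(m, p) for name, (m, p) in results.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

The two large_mesh_editing cases come out at close to 1.6x, the subdiv "final only" cases at about 1.2x, and looptris_test is essentially unchanged (slightly below 1.0x).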