[TEST] Bmesh performance tests

mano-wii · August 4, 2015, 10:26am

This script executes a command repeatedly to calculate the duration.
Logically, is to be expected that the calculation of the length of an edge would take more than just get its coordinates.
But that’s not what happens. See:


print('-----------------------------------------------')
import bpy
import bmesh
import time


me = bpy.context.object.data
bm = bmesh.from_edit_mesh(me)
test = 1000000


time_start = time.time()
for i in range(test):
    a = bm.edges[0].calc_length()
print("built-in method Finished in %.4f sec" % (time.time() - time_start))


time_start = time.time()
for i in range(test):
    listv = bm.edges[0].verts
    v1 = listv[0].co
    v2 = listv[1].co
print("call binding method in %.4f sec" % (time.time() - time_start))

The result is:

-----------------------------------------------
built-in method Finished in 0.7535 sec
call binding method in 1.6501 sec

Note that just take the coordinates of the edge takes more than twice as long to calculate its length … sqrt((v2-v1).x2 + (v2-v1).y2 + (v2-v1).z**2)

Is this normal?

CoDEmanX · August 4, 2015, 10:56am

well, you call calc_length() on the edge directly, whereas getting its coordinates means to dereference Edge.verts. I’m not sure how costly that is.

BTW: you should do this for better comparability:

v1, v2 = bm.edges[0].verts

mano-wii · August 4, 2015, 11:17am

Ok, I edited, but had no difference
Would this slowness could be considered a bug?


print('-----------------------------------------------')
import bpy
import bmesh
import time

me = bpy.context.object.data
bm = bmesh.from_edit_mesh(me)
test = 1000000

time_start = time.time()
for i in range(test):
    a = bm.edges[0].calc_length()
print("built-in method Finished in %.4f sec" % (time.time() - time_start))

time_start = time.time()
for i in range(test):
    v1, v2 = bm.edges[0].verts
    v1 = v1.co
    v2 = v2.co
print("call binding method in %.4f sec" % (time.time() - time_start))

RickyBlender · August 4, 2015, 12:15pm

did a test and time I get is

~built-in method Finished in 0.3020 sec
call binding method in 0.6850 sec

and I don’t have a big machine!

now why do you want to know how long it takes to read an edge length 1 millions times ?

note : this
v1, v2 = bm.edges[0].verts
could be put outside the loop !
does not really need to read the same data over

thanks
happy bl

mano-wii · August 4, 2015, 1:23pm

The first script is just a demonstration. The application extends the objects with many different edges.

In my case I have these two functions:


def intersect_bound_box_edge_edge(a1, a2, b1, b2, precision = 0.0001):
    bbx1 = sorted([a1[0], a2[0]])
    bbx2 = sorted([b1[0], b2[0]])
    if ((bbx1[0] - precision) &lt;= bbx2[1]) and (bbx1[1] &gt;= (bbx2[0] - precision)):
        bby1 = sorted([a1[1], a2[1]])
        bby2 = sorted([b1[1], b2[1]])
        if ((bby1[0] - precision) &lt;= bby2[1]) and (bby1[1] &gt;= (bby2[0] - precision)):
            bbz1 = sorted([a1[2], a2[2]])
            bbz2 = sorted([b1[2], b2[2]])
            if ((bbz1[0] - precision) &lt;= bbz2[1]) and (bbz1[1] &gt;= (bbz2[0] - precision)):
                return True
            else:
                return False
        else:
            return False
    else:
        return False


def list_BVH(edges1, edges2, precision = 0.0001):
    tmp_set1 = set()
    tmp_set2 = set()
    for ed1 in edges1:
        for ed2 in edges2:
            if ed1 != ed2:
                a1 = ed1.verts[0]
                a2 = ed1.verts[1]
                b1 = ed2.verts[0]
                b2 = ed2.verts[1]
                intersect = intersect_bound_box_edge_edge(a1.co, a2.co, b1.co, b2.co, precision = precision)
                if intersect:
                    print('so')
                    tmp_set1.add(ed1)
                    tmp_set2.add(ed2)


    return tmp_set1, tmp_set2

If one of the lists in function “list BVH” has more than 250,000 edges, the execution of this script will last about 13 seconds on my computer. With over 90% of the time spent on calls elements in the modules mathutils and bmesh.

This is very boring. Though seldom we need to perform these functions in a mesh so dense.

If the same function was performed directly in C, no doubt it would be instantaneous.

RickyBlender · August 4, 2015, 3:14pm

just curious what is the bound box thing?

with 175 00 edges that is a big object

did u try to make a copy of all the edges first then do your script ?

note:
I did a speed test with python on my little machine
and I can do like 4 millions FLOPS in a second

doing like 1 millions edge in a second might be at the limit of what PC can do !

happy bl

mano-wii · August 4, 2015, 3:53pm

This function tests if the bounding volume of a segment intersects the bounding volume of another segment.
It can also be represented as follows:


def intersect_bound_box_edge_edge(ed1, ed2):
    a1 = (ed1.verts[0].co)
    a2 = (ed1.verts[1].co)
    b1 = (ed2.verts[0].co)
    b2 = (ed2.verts[1].co)
    
    bb1 = sorted([a1.x, a2.x]), sorted([a1.y, a2.y]), sorted([a1.z, a2.z])
    bb2 = sorted([b1.x, b2.x]), sorted([b1.y, b2.y]), sorted([b1.z, b2.z])
    
    return  (bb1[0][0] &lt;= bb2[0][1]) and (bb1[0][1] &gt;= bb2[0][0]) and\
                (bb1[1][0] &lt;= bb2[1][1]) and (bb1[1][1] &gt;= bb2[1][0]) and\
                (bb1[2][0] &lt;= bb2[2][1]) and (bb1[2][1] &gt;= bb2[2][0])

Thus, there are less edges to test intersection.
In theory it should speed things up.

I can try, but do not understand how this can help. One thing I noticed is that Python is slow to mathematics. And an effective binding of C has to be done with Cython.

CoDEmanX · August 4, 2015, 4:06pm

Is standard API an option? It’s definitely faster than bmesh module.

RickyBlender · August 4, 2015, 4:10pm

bound box is not the object bound box here !
it is another volume which is not like a cube

another way which could be faster
try to access numpy !

that might be faster but not certain
make a test and see what it gives

but if you can do it with cpython it could be faster
or another way might be to use the Cudapython
but sorry I cannot test that !

also do you have a big machine
cause my times was like 2 X faster then your times?

happy bl

mano-wii · August 4, 2015, 4:19pm

I tested the standard API. It is as slow as bmesh

The disadvantage of Cython is that you must compile on different operating systems. I’m still learning how to mess with Cython.

Could you reproduce the Bound Box code with Numpy? I only know the basics.

CoDEmanX · August 4, 2015, 9:53pm

Standard API is quite a lot slower in my test actually…
Whether bmesh is wrapped or not doesn’t seem to matter.

The difference in time for calculating edge length and retrieving vert coords could also be the overhead of creating a Python object from the coordinates (Vector) and exposing it from C to Python, whereas calc_length() returns a primitive and thus doesn’t need to create a complex Python object.

Bayesian · August 4, 2015, 11:01pm

In Python, every time you get an attribute like my_object.attribute, quite a bit of code has to be run. In pure Python there is the getattr and property machinery. In Blender it can be actually worse: The C API may decide to copy stuff, and/or allocate memory for temporary structures to return as “attribute”. Therefore, you have to treat the dot operator as expensive operation, and it may be the same with the square brackets. The latter can also involve a lot of non-obvious work, if it is used on a Blender object.

What is worse, whenever this is done in a loop, the allocated objects accumulate, and the garbage collector eventually has to get rid of them. So you may want to turn off the GC temporarily.

I can’t recommend Cython to plugin developers currently. It’s a great tool, and Blender would benefit greatly from using it, but it’s not going to happen until the core developers adopt it, and they probably would have to adopt Conda as a packaging tool, also. In any case, Cython will not help you a lot if you access Python API/Blender API functions, but the way to go would be to access Blender’s internal memory structures directly.

On the other hand, maybe it is time to create a numpy-based Mesh API…

Bayesian · August 4, 2015, 11:13pm

To illustrate:

print('-----------------------------------------------')
import bpy
import bmesh
import time




me = bpy.context.object.data
bm = bmesh.from_edit_mesh(me)
test = 1000000




time_start = time.time()
for i in range(test):
    a = bm.edges[0].calc_length()
print("built-in method Finished in %.4f sec" % (time.time() - time_start))


time_start = time.time()
edges = bm.edges
for i in range(test):
    a = edges[0].calc_length()
print("built-in method Finished in %.4f sec" % (time.time() - time_start))




time_start = time.time()
edge = bm.edges[0]
for i in range(test):
    a = edge.calc_length()
print("built-in method Finished in %.4f sec" % (time.time() - time_start))




time_start = time.time()
for i in range(test):
    listv = bm.edges[0].verts
    v1 = listv[0].co
    v2 = listv[1].co
print("call binding method in %.4f sec" % (time.time() - time_start))


time_start = time.time()
edge = bm.edges[0]
for i in range(test):
    listv = edge.verts
    v1 = listv[0].co
    v2 = listv[1].co
print("call binding method in %.4f sec" % (time.time() - time_start))


time_start = time.time()
edge = bm.edges[0]
listv = edge.verts
for i in range(test):
    v1 = listv[0].co
    v2 = listv[1].co
print("call binding method in %.4f sec" % (time.time() - time_start))

In Blender, calling the C API is often costing you performance, rather than improving it. Much of the confusion and slowness is caused by creating these vertex and “.co” objects…

Of course, in real code you’ll have to use a different edge on each iteration, the code above is just to illustrate.

mano-wii · August 5, 2015, 5:45am

@CoDEmanX,
if I understand well, the problem of slowness is not only in calling an object in C. But there when we reference a C object as a python object that is soon thrown into garbage.

I decided to do some tests making less reference to the object C and using list comprehensions:


print('-----------------------------------------------')
import bpy
import bmesh
import time


me = bpy.context.object.data
bm = bmesh.from_edit_mesh(me)


time_start = time.time()
list1 = []
for e in bm.edges:
    v1, v2 = e.verts
    list1.append((v1.co, v2.co))
print("Method1 Finished in %.4f sec" % (time.time() - time_start))


time_start = time.time()
list2 = [(v1.co, v2.co) for v1, v2 in [e.verts for e in bm.edges]]
print("Method2 Finished in %.4f sec" % (time.time() - time_start))


print(list1 == list2)

The result is promising with high Polly objects.


------------------------------
Method1 Finished in 0.6875 sec
Method2 Finished in 0.5653 sec
True

But the big difference is in the script execution shown before


def intersect_bound_box_edge_edge(a1, a2, b1, b2, precision = 0.0001):
    bbx1 = sorted([a1[0], a2[0]])
    bbx2 = sorted([b1[0], b2[0]])
    if ((bbx1[0] - precision) &lt;= bbx2[1]) and (bbx1[1] &gt;= (bbx2[0] - precision)):
        bby1 = sorted([a1[1], a2[1]])
        bby2 = sorted([b1[1], b2[1]])
        if ((bby1[0] - precision) &lt;= bby2[1]) and (bby1[1] &gt;= (bby2[0] - precision)):
            bbz1 = sorted([a1[2], a2[2]])
            bbz2 = sorted([b1[2], b2[2]])
            if ((bbz1[0] - precision) &lt;= bbz2[1]) and (bbz1[1] &gt;= (bbz2[0] - precision)):
                return True
            else:
                return False
        else:
            return False
    else:
        return False


def list_BVH1(edges1, edges2, precision = 0.0001):
    tmp_set1 = set()
    tmp_set2 = set()
    for ed1 in edges1:
        for ed2 in edges2:
            if ed1 != ed2:
                a1, a2 = ed1.verts
                b1, b2 = ed2.verts
                intersect = intersect_bound_box_edge_edge(a1.co, a2.co, b1.co, b2.co, precision = precision)
                if intersect:
                    tmp_set1.add(ed1)
                    tmp_set2.add(ed2)


    return tmp_set1, tmp_set2


def list_BVH2(edges1, edges2, precision = 0.0001):
    aco = [(v1.co,v2.co) for v1,v2 in [e.verts for e in edges1]]
    bco = [(v1.co,v2.co) for v1,v2 in [e.verts for e in edges2]]
    tmp_set1 = set()
    tmp_set2 = set()
    for i1, ed1 in enumerate(aco):
        for i2, ed2 in enumerate(bco):
            if ed1 != ed2:
                a1, a2 = ed1
                b1, b2 = ed2


                a1x, a2x = a1.x, a2.x
                b1x, b2x = b1.x, b2.x
                bbx1 = (a1x, a2x) if a1x &lt; a2x else (a2x, a1x)
                bbx2 = (b1x, b2x) if b1x &lt; b2x else (b2x, b1x)
                if ((bbx1[0] - precision) &lt;= bbx2[1]) and (bbx1[1] &gt;= (bbx2[0] - precision)):
                    a1y, a2y = a1.y, a2.y
                    b1y, b2y = b1.y, b2.y
                    bby1 = (a1y, a2y) if a1y &lt; a2y else (a2y, a1y)
                    bby2 = (b1y, b2y) if b1y &lt; b2y else (b2y, b1y)
                    if ((bby1[0] - precision) &lt;= bby2[1]) and (bby1[1] &gt;= (bby2[0] - precision)):
                        a1z, a2z = a1.z, a2.z
                        b1z, b2z = b1.z, b2.z
                        bbz1 = (a1z, a2z) if a1z &lt; a2z else (a2z, a1z)
                        bbz2 = (b1z, b2z) if b1z &lt; b2z else (b2z, b1z)
                        if ((bbz1[0] - precision) &lt;= bbz2[1]) and (bbz1[1] &gt;= (bbz2[0] - precision)):
                            tmp_set1.add(edges1[i1])
                            tmp_set2.add(edges2[i2])


    return tmp_set1, tmp_set2


import bpy
import bmesh
import time


me = bpy.context.object.data
bm = bmesh.from_edit_mesh(me)


a = bm.edges
b = list(set([x for y in [v2.link_edges for v2 in [v for v in bm.faces.active.verts]] for x in y]))


print('--------------------------------')
time_start = time.time()
result1 = list_BVH1(a,b)
print("list_BVH1 Finished in %.2f sec" % (time.time() - time_start))


time_start = time.time()
result2 = list_BVH2(a,b)
print("list_BVH2 Finished in %.2f sec" % (time.time() - time_start))


print(result1 == result2)

This last script presented encouraging results :


--------------------------------
list_BVH1 Finished in 18.44 sec
list_BVH2 Finished in 5.76 sec
True

I believe I can further accelerate if I improve the list comprehensions. Maybe if I execute the whole function inside a list comprehensions the result could be instantaneous.

ideasman42 · August 5, 2015, 6:07am

Python has its limits for these kinds of operations.

We recently added mathutils.bvhtree, A BVH-tree implimented in C,
Currently its used for faces only ~ (triangles, ngons for eg).

http://developer.blender.org/diffusion/B/browse/master/source/blender/python/mathutils/mathutils_bvhtree.c

http://www.blender.org/api/blender_python_api_2_75_3/mathutils.bvhtree.html

It has an overlap function, which I think is what you’re after?

We could extend it for more general use (where you can just put whatever data you like into each node).

mano-wii · August 5, 2015, 7:50am

BVH Tree really was a nice addition. The overlap function instantly detected faces that overlap in a mesh with more than 128,000 faces.

This was the code I used:


print('----------------------------------------------')
import bpy
import bmesh
from mathutils import bvhtree


me = bpy.context.object.data
bm = bmesh.from_edit_mesh(me)
face = bm.faces.active
linked_faces = list(set([x for y in [e2.link_faces for e2 in [e for e in face.edges]] for x in y]))
verts = list(set([v for l in [f.verts for f in linked_faces] for v in l]))
#print(verts)
polygons = []
for face in linked_faces:
    index = [verts.index(v) for v in face.verts]
    polygons.append(index)


verts = [v.co for v in verts]


bvh1 = bvhtree.BVHTree.FromPolygons(verts, polygons, epsilon=0.0)


bvh2 = bvhtree.BVHTree.FromBMesh(bm)


overlap = bvh1.overlap(bvh2)
print(overlap)
for f1, f2 in overlap:
    bm.faces[f2].select = True

However it just select the same faces that I used to test the overlay.
I did not understand how this “epsilon” works, I thought that increasing this value I would increase the amount of detected faces.

I would like very much if was added support for Vertex and Edges. thanks @ideasman42

CoDEmanX · August 5, 2015, 8:04am

What if you do this?

for f1, f2 in overlap:
    bm.faces[f2].select = True
bm.select_flush(True)

Or this?

for f1, f2 in overlap:
    bm.faces[f2].select_set(True)

mano-wii · August 5, 2015, 8:15am

Did the same thing. I’m testing now this result was due to small size of the faces

mano-wii · August 5, 2015, 8:34am

I decided to do a demonstration:

The script used was this:

print('----------------------------------------------')
import bpy
import bmesh
from mathutils import bvhtree


me = bpy.context.object.data
bm = bmesh.from_edit_mesh(me)
face = bm.faces.active


verts  = [v.co for v in face.verts]
polygons = [[face.verts[:].index(v) for v in face.verts]]


bvh1 = bvhtree.BVHTree.FromPolygons(verts, polygons, epsilon=0.0)


bvh2 = bvhtree.BVHTree.FromBMesh(bm)


overlap = bvh1.overlap(bvh2)
print(overlap)
for f1, f2 in overlap:
    bm.faces[f2].select_set(True)

Why in the first chosen face there was no overlap detection?

Bayesian · August 5, 2015, 11:58am

I don’t think that Python necessarily is limited in these kind of operations, but just that blender’s data structures are not laid out and exposed in a way which makes this feasible.