Python Snap Script.

I made a script that finds the vertex nearest to the mouse cursor. For those interested, here it is:


import time
import bpy
from bpy_extras import view3d_utils


def snap_verts(region, rv3d, obj_matrix, mouse, verts, dist):
    t = time.time()
    # build the view ray through the mouse position and move it into object-local space
    ray_origin = view3d_utils.region_2d_to_origin_3d(region, rv3d, mouse)
    view_vector = view3d_utils.region_2d_to_vector_3d(region, rv3d, mouse)
    matrix_inv = obj_matrix.inverted()
    og = matrix_inv * ray_origin
    #bpy.context.scene.cursor_location = og
    vv = view_vector * obj_matrix
    lvl = [] # local verts list (candidates accepted so far)
    n1, n2 = 1,2
    n3, n4 = 2,0
    n5, n6 = 0,1
    for vert in verts:
        vert.select = False
        v = vert.co
        # cheap box rejection against the last accepted vertex before the full test
        if  lvl == [] or \
            abs(lco.x - v.x) < dist2 and \
            abs(lco.y - v.y) < dist2 and \
            abs(lco.z - v.z) < dist2:
                # component-wise |vv x (og - v)|, evaluated one axis at a time
                #d = vv.cross(og - v)
                dx = abs(vv[n1]*(og[n2]-v[n2])-vv[n2]*(og[n1]-v[n1]))
                if dx < dist:
                    dy = abs(vv[n3]*(og[n4]-v[n4])-vv[n4]*(og[n3]-v[n3]))
                    if dy < dist:
                        dz = abs(vv[n5]*(og[n6]-v[n6])-vv[n6]*(og[n5]-v[n5]))
                        if dz < dist:
                            dist = max([dx,dy,dz])
                            dist2 = 2*dist
                            lco = v
                            lvl.append(vert)
                        else:
                            # reorder the axes so the test that rejected runs first next time
                            n1,n2,n5,n6 = n5,n6,n1,n2
                else:
                    n1,n2,n3,n4 = n3,n4,n1,n2


    print(time.time() - t)
    return lvl[-1] if lvl else None

One thing to note is that Blender’s built-in snapping/selection is about twice as fast. For example:


bpy.ops.view3d.select(extend=False, deselect=False, toggle=False, center=False, enumerate=False, object=False, location=(0, 0))

How is this possible?
The ray_cast (another example) is almost instantaneous. What is the secret of ray_cast?

One reason this is so slow is that you are calling the C API in your inner loops. Those functions do a lot of copying. Every time you “.” a Blender object, something “heavy” happens. In this case it’s not as bad as with Image.pixels (which blows up quadratically with the size of the image), but just the work of retrieving the vertex objects takes copious time. It’s really funny: most of the time Python programmers call C extensions to make their programs faster, but in Blender, interfacing with C code usually slows things down. Many properties have their own “magical” getter and setter functions, and some of them do a lot more work than you’d think.

Try assigning the x, y, z coordinates to local variables. I would also scrap the “append” logic and just bind the current closest vertex to a local name. Appending to lists isn’t as bad as it used to be, but in this case it isn’t necessary at all.

I think the ray_cast method is also defined in C. It’s faster because it can traverse the data structures without making Python objects out of everything. There are also fancy algorithms to search in 3D space, but I haven’t looked at the source to see what ray_cast is doing.

As a side note: I think it would be nice to have something like a numpy-mesh API. It would be a class containing several numpy arrays for the vertices, edges and faces, supporting array-oriented operations on them. You would be able to convert from the Blender APIs to the numpy mesh and back, and that would still be faster than having to dissect the C data structures all the time.
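
Even without such an API, you can already bulk-copy coordinates without touching each vertex object individually. A minimal sketch, assuming a regular Mesh (foreach_get/foreach_set are not available on BMesh data):


import bpy
import numpy as np

mesh = bpy.context.object.data                  # assumes the active object is a mesh
count = len(mesh.vertices)
coords = np.empty(count * 3, dtype=np.float64)  # flat buffer: x0, y0, z0, x1, ...
mesh.vertices.foreach_get("co", coords)         # one C-level copy, no per-vertex Python objects
coords.shape = (count, 3)                       # one row per vertex

# ... array-oriented processing on coords ...

mesh.vertices.foreach_set("co", coords.ravel())
mesh.update()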

Thanks @Bayesian, I followed your recommendations.
Sometimes, though, you cannot assign the x, y, z coordinates to local variables. I tried; it looked like this:


def snap_verts(region, rv3d, obj_matrix, mouse, verts, dist):
    t = time.time()
    ray_origin = view3d_utils.region_2d_to_origin_3d(region, rv3d, mouse)
    view_vector = view3d_utils.region_2d_to_vector_3d(region, rv3d, mouse)
    matrix_inv = obj_matrix.inverted()
    ogx, ogy, ogz = matrix_inv * ray_origin
    vvx, vvy, vvz = view_vector * obj_matrix
    lvl = None
    for vert in verts:
        vert.select = False
        vx,vy,vz = vert.co
        if  lvl == None or \
            abs(lcox - vx) < dist2 and \
            abs(lcoy - vy) < dist2 and \
            abs(lcoz - vz) < dist2:
                #d = vv.cross(og - v)
                dx = abs(vvy*(ogz-vz)-vvz*(ogy-vy))
                if dx < dist:
                    dy = abs(vvz*(ogx-vx)-vvx*(ogz-vz))
                    if dy < dist:
                        dz = abs(vvx*(ogy-vy)-vvy*(ogx-vx))
                        if dz < dist:
                            dist = max([dx,dy,dz])
                            dist2 = 2*dist
                            lcox,lcoy,lcoz = vx,vy,vz
                            lvl = vert
    print(time.time() - t)
    return lvl

The problem is that you cannot choose which coordinate to test first, so performance decays.

So I settled on a middle ground. I haven’t noticed a difference yet.


def snap_verts(region, rv3d, obj_matrix, mouse, verts, dist):
    t = time.time()
    ray_origin = view3d_utils.region_2d_to_origin_3d(region, rv3d, mouse)
    view_vector = view3d_utils.region_2d_to_vector_3d(region, rv3d, mouse)
    matrix_inv = obj_matrix.inverted()
    og = matrix_inv * ray_origin
    vv = view_vector * obj_matrix
    lvl = None
    n1, n2 = 1,2
    n3, n4 = 2,0
    n5, n6 = 0,1
    for vert in verts:
        vert.select = False
        v = vert.co
        if  lvl == None or \
            abs(lcox - v.x) < dist2 and \
            abs(lcoy - v.y) < dist2 and \
            abs(lcoz - v.z) < dist2:
                #d = vv.cross(og - v)
                dx = abs(vv[n1]*(og[n2]-v[n2])-vv[n2]*(og[n1]-v[n1]))
                if dx < dist:
                    dy = abs(vv[n3]*(og[n4]-v[n4])-vv[n4]*(og[n3]-v[n3]))
                    if dy < dist:
                        dz = abs(vv[n5]*(og[n6]-v[n6])-vv[n6]*(og[n5]-v[n5]))
                        if dz < dist:
                            dist = max([dx,dy,dz])
                            dist2 = 2*dist
                            lcox, lcoy, lcoz = v
                            lvl = vert
                        else:
                            n1,n2,n5,n6 = n5,n6,n1,n2
                    else:
                        n1,n2,n3,n4 = n3,n4,n1,n2


    print(time.time() - t)
    return lvl

I do not know C, so the ray_cast code remains a mystery to me.

The problem is in the arithmetic operations: multiplication, addition and subtraction.
Apparently these operations are handled in C code, which is compiled to machine code and executed faster.
“Python uses the O(N^2) grade-school multiplication algorithm for small numbers… for big numbers it uses the Karatsuba algorithm.”

“Python is a very high-level programming language, which runs on a virtual machine which itself runs on your computer. As such, it is, inherently, a few order of magnitudes slower than native code. Translating the algorithm to assembly (or even to C) would result in a massive speedup – although it’d still be slower than the CPU’s multiplication operation.”

I found these while trying to learn a little about multiplication and Python’s dependence on C code.

So I have to “translate the algorithm to assembly”. What does this mean? How do I do this?

I have the feeling you are overcomplicating things for your level of programming experience. You obviously know your way around linear algebra. But if what I said doesn’t improve the performance, there is probably no quick fix.

The scientific Python community, which does exactly this kind of math programming, solves the problem by vectorizing the algorithms. For example, in numpy, if you want to multiply two vectors a and b with a thousand elements each, element-wise, you just write “a*b”, and numpy takes care of doing that in a very fast C loop.
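
A tiny illustration of what I mean:


import numpy as np

a = np.random.rand(1000)
b = np.random.rand(1000)

# pure-Python loop: one bytecode round-trip per element
c_loop = [a[i] * b[i] for i in range(len(a))]

# vectorized: a single expression, the loop runs inside numpy's C code
c_vec = a * b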

See for example this here: http://technicaldiscovery.blogspot.de/2011/06/speeding-up-python-numpy-cython-and.html

The problem with numpy for this particular use case is that you would still have to iterate over all the vertex objects to fill the arrays.

Another idea you may try is to find a way to let Blender do the “projecting” for you. If all you’ve got is 2D coordinates for all your vertices, then a simple distance search should be fast enough. At the very least, you should look into mathutils in the Blender API docs. Its cross-product implementation will probably be faster than yours. Again, I’m not deep enough into Blender’s 3D code to know the details, but from what I understand mathutils is a sort of 3D-specific, less elegant analogue to numpy.
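
For instance, the per-vertex test in your script could use mathutils’ own cross product; a rough sketch, reusing the og and vv names from your snippets (it is the same math, just not hand-expanded):


from mathutils import Vector

def vert_ray_distance(v_co, og, vv):
    # |vv x (og - v)| is proportional to the distance from the vertex to the
    # view ray (it equals the distance when vv is normalized)
    return (vv.cross(og - v_co)).length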

Another point about the algorithms you mentioned: that’s not what is happening here. Those algorithms are for Python’s arbitrary-precision integers, and the vertex coordinates are floats…

The long-term fix is to invest some more time into learning advanced programming. Udacity, for example, has great Python courses, such as “Design of Computer Programs”, in which you will learn a ton about idiomatic Python.

Recently I started studying numpy. I converted all the vertex coordinates into a numpy array and multiplied (dot) it by perspective_matrix * object_matrix in numpy. Everything really does become faster (converting all the coordinates to 2D was much faster than the pure-Python solution). But I do not know if I can reproduce in numpy the same practicality I had in my script.
I do not want to install Cython or Weave, since this algorithm will go into an addon.
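
Roughly, what I am doing looks like this (the exact mapping from clip space to region pixels is my own guess; coords is an (N, 3) array of local vertex coordinates, and region, rv3d, obj_matrix and mouse are the same as in the snippets above):


import numpy as np

def nearest_vert_2d(coords, region, rv3d, obj_matrix, mouse):
    # combined object -> clip space matrix, as a 4x4 numpy array
    mat = np.array(rv3d.perspective_matrix * obj_matrix)

    # homogeneous coordinates: append a column of ones
    n = coords.shape[0]
    co4 = np.column_stack([coords, np.ones(n)])

    # a single dot product projects every vertex at once
    clip = co4.dot(mat.T)

    # perspective divide and map to region pixels
    w = clip[:, 3]
    px = (clip[:, 0] / w * 0.5 + 0.5) * region.width
    py = (clip[:, 1] / w * 0.5 + 0.5) * region.height

    # squared 2D distance to the mouse; the nearest index wins
    d2 = (px - mouse[0]) ** 2 + (py - mouse[1]) ** 2
    return int(np.argmin(d2))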

The secret of ray_cast is that it uses data structures that accelerate specific operations such as ray casting, nearest-point queries, etc. (Bounding Volume Hierarchies and KD-Trees). Nothing is free, there is no magic!

These data structures trade an up-front cost (building the acceleration structure) and extra memory usage (they store more data) for faster operations where speed matters, during user interaction.

The developers have worked hard to expose a few of these structures to the Python API. I’m not sure how many people are using them, but I am eagerly awaiting the BVH exposure, which will make ray casting via the Python API even faster than it already is.

Read this for finding the nearest vertex to a given point in space in Blender Python:
http://www.blender.org/api/blender_python_api_2_74_release/mathutils.kdtree.html?highlight=kdtree
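
The basic pattern from those docs looks like this: pay the build cost once, then every nearest-point query is essentially instantaneous (the find() call here assumes a point in the object’s local space):


from mathutils import kdtree

def build_kd(bm):
    kd = kdtree.KDTree(len(bm.verts))
    for i, v in enumerate(bm.verts):
        kd.insert(v.co, i)
    kd.balance()    # must be called before any find()
    return kd

# later, per query:
# co, index, dist = kd.find(local_point)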

Please log in and subscribe to this issue. This is the branch where the developer Lukas T is working on exposing the BVH to Python, making accelerated ray casting of BMesh data possible. If more people subscribe to show this is useful to their addon development, it may encourage progress.
https://developer.blender.org/D966

From what I’ve studied so far, another reason ray_cast is so fast is that it is written in C.

Essentially, computers recognize only one specific language: machine code, which is all but incomprehensible to us mortals.

The C programming language is translated to machine language by compiling: from the program in a higher-level language, an equivalent lower-level program is generated, which is then assembled into object code (a program in machine code).

Python is translated by interpretation: internally, Python source code is always translated into a bytecode representation, and this bytecode is then executed by the Python virtual machine.

The most direct translation is assembly, where each instruction of the higher-level language is translated directly into a machine-language instruction. The program that does this is called an assembler.

Certainly, the language has something to do with it. The BVH acceleration structure is also written in C! So even if you wrote your cross products etc. in C, the BVH data structure would still be faster, because it uses a faster language AND faster methods. By exposing the BVH to the Python API, Python is just telling the C code what to do, so the speed gains will be enormous, but the ease of writing your code will be… the same! The cost is paid up front, when you calculate the BVH the first time.

-Patrick

The issue of C extensions for speed can get a bit complicated. In Blender, calling C functions inadvertently slows things down about as often as not. C code can handle Blender’s data structures faster because it doesn’t have to create all these intermediate Python objects which are immediately discarded again. In a numpy-mesh that wouldn’t happen, and you could probably implement a BVH entirely in Python and still beat the above solution performance-wise.

I don’t want to disrespect the awesome work that has gone into Blender’s API. Sometimes it’s just very painful to work with. Another interesting case study is the video sequencer operators, especially “effect_strip_add”. It looks as if someone wanted to avoid writing several operators in C, so they dumped everything into one big operator, put the actual function name into a parameter (type) and made the rest of the parameters contingent on that type parameter. And the most important parameters to that operator aren’t even passed as parameters, they are taken from the context. rant-finished :wink:

Anyway, I think the Blender API is awesome most of the time.

It is very common to find web pages with the theme “Why is Python so slow?”.
Actually Python is fast (depending on the interpreter); the problem is the comparison with C (which uses a different translation model).
In my tests the performance difference seems to be very big.
With a kdtree, once the entire tree is built, finding the closest point to the cursor is instantaneous, as is ray_cast.
However, “bpy.ops.view3d.select” is slow, and I do not know why.

I followed various tips on the internet for writing the fastest possible Python code, and I managed to match “bpy.ops.view3d.select”. Here is the result:
LMB - My code
RMB - bpy.ops.view3d.select
ESC - Cancel


import bpy, bmesh, time
from mathutils import Vector
from bpy_extras import view3d_utils


def snap_vert(region, rv3d, obj_matrix, mouse, verts, dist):
    ray_origin = view3d_utils.region_2d_to_origin_3d(region, rv3d, mouse)
    view_vector = view3d_utils.region_2d_to_vector_3d(region, rv3d, mouse)
    matrix_inv = obj_matrix.inverted()
    og = matrix_inv * ray_origin
    vv = view_vector * obj_matrix
    lvl = None
    n0,n1,n2 = 0,1,2
    for vert in verts:
        vert.select = False
        v = vert.co
        if  lvl == None or \
            abs(lcox - v.x) < dist2 and \
            abs(lcoy - v.y) < dist2 and \
            abs(lcoz - v.z) < dist2:
                vd1 = og[n1]-v[n1]
                vd2 = og[n2]-v[n2]
                dx = abs(vv[n1]*vd2-vv[n2]*vd1)
                if dx < dist:
                    vd0 = og[n0]-v[n0]
                    dy = abs(vv[n0]*vd2-vv[n2]*vd0)
                    if dy < dist:
                        dz = abs(vv[n0]*vd1-vv[n1]*vd0)
                        if dz < dist:
                            dist = max([dx,dy,dz])
                            dist2 = 2*dist
                            lcox, lcoy, lcoz = v
                            lvl = vert
                        else:
                            tmp = n1
                            n1 = n2
                            n2 = tmp
                    else:
                        tmp = n1
                        n1 = n0
                        n0 = tmp
    return lvl


class ModalSelectOperator(bpy.types.Operator):
    """Select a vertice with the mouse"""
    bl_idname = "object.snap_select"
    bl_label = "Snap Select"


    def modal(self, context, event):
        if event.type == 'MOUSEMOVE':
            self.mouse_co = event.mouse_region_x, event.mouse_region_y


        if event.value == 'PRESS':
            if event.type == 'LEFTMOUSE':
                print('-----------------------')
                print('Test object.snap_select')
                t = time.time()
                vert = snap_vert(self.region, self.rv3d, self.obj.matrix_world, 
                                 self.mouse_co, self.bm.verts, 100)
                if vert is not None:
                    vert.select = True
                # reassign the active object to force the viewport to refresh the selection
                context.scene.objects.active = context.scene.objects.active
                print(time.time() - t)


            if event.type == 'RIGHTMOUSE':
                print('----------------------')
                print('Test ops.view3d.select')
                t = time.time()
                bpy.ops.view3d.select(location=self.mouse_co)
                print(time.time() - t)


        elif event.type in {'ESC'}:
            context.area.header_text_set()
            return {'CANCELLED'}


        return {'RUNNING_MODAL'}


    def invoke(self, context, event):
        if context.mode == 'EDIT_MESH':
            self.obj = bpy.context.edit_object
            self.bm = bmesh.from_edit_mesh(self.obj.data)
            self.region = context.region
            self.rv3d = context.region_data
            # start from the invoke-time mouse position so the first click works
            self.mouse_co = event.mouse_region_x, event.mouse_region_y
        else:
            self.report({'WARNING'}, "not edit mode")
            return {'CANCELLED'}


        context.window_manager.modal_handler_add(self)
        return {'RUNNING_MODAL'}


def register():
    bpy.utils.register_class(ModalSelectOperator)


def unregister():
    bpy.utils.unregister_class(ModalSelectOperator)


if __name__ == "__main__":
    register()

It is really fast, but it is far from instantaneous.

What kind of models/scene are you testing this on? I think the viewport select is going to test every object in the scene, right? Perhaps that is one issue.

-Patrick

I’m testing with a high-poly model (half a million vertices).

I use an Intel i5 notebook with a dedicated AMD card. “bpy.ops.view3d.select” is faster than the normal way of selecting. No matter the mesh, if it has some 500,000 vertices, selecting the vertex nearest to the mouse cursor takes about 0.2 seconds.