Accelerate your python with C

In BGMC26, I wanted to be able to read the pixels from a rendered image. The image was 128x128 pixels, and so had some 16 thousand color vectors. In a somewhat naive attempt, I converted all of those into mathutils.Vectors. The result was 12 frames per second with nothing else in the scene. After coding a small DLL to allow the image to be dealt with in C, the time went from tens of milliseconds to single milliseconds - about a 10x improvement.

Where is this applicable?
Writing your code in C is not what you want to do. You want as much of your code to be in python as possible. Why? Because in python you don’t have segfaults, you can access the BGE API, and you get to use high level language features. In all the time I’ve been using BGE, this is the first time I have had a single piece of logic “heavy” enough to consider porting it to C.

So unless you’re doing image processing, neural networks or some other system handling a lot of data, you do not need to write code in C.

Limitations

  • Overhead in passing values into C - so writing a C function to add two numbers is pointless
  • Have to consider cross platform support
  • Can only work with C “primitives” (you can use custom C structs in python, but you cannot use python classes in C)
  • C is harder to debug than python
  • C can segfault
  • and so on.

One caveat in BGE is that once the python ctypes module has loaded the compiled library, it stays loaded for as long as blender is running, so you have to quit and re-open blender each time you recompile.

Hopefully by now I have dissuaded some 90% of the people hoping to use this to build MMORPG’s or minecraft clones. If you didn’t get the message: Writing your code in C instead of python is probably a waste of your time. But for the few other people out there who want to learn about how to use C in BGE, or just generally want to run C from python, I shall continue.

Getting Set Up
You will need:

  • A C compiler. I used gcc on linux, and mingw-64 on windows. I also used Make to automate the build process on linux, but that isn’t necessary
  • Something to write in C that is more worthwhile than writing it in python
  • Both C and Python skills. I ain’t going to teach you how to use either of these languages

***A Trivial Example - And how to handle cross platform
The most basic thing you can do is run a print statement in C, calling it from python.
So here is our c file. It should be named “simple.c”:


#include <stdio.h>
#include "simple.h"


DLLSPEC void test(void)
{
    printf("Tada\n");
}

You may notice that there’s a weird DLLSPEC in front of the function definition. This is because windows has a flag to declare that a function should be accessible externally from a DLL. And so in the header file (simple.h) we need:


#ifndef __SIMPLE_H__
#define __SIMPLE_H__


#ifdef _WIN32
    #define DLLSPEC __declspec(dllexport)
#else
    #define DLLSPEC
#endif


DLLSPEC void test(void);


#endif


This defines the macro DLLSPEC depending on the operating system. On windows it becomes __declspec(dllexport), but on linux (or any other platform) it is left empty.

Compiling on Linux

gcc -c -Wall -Werror -fPIC simple.c
gcc -shared -o simple.so simple.o

The first line has the -fPIC flag, which makes the compiler generate position-independent code. Shared libraries need this because they can be loaded at any address in the process that uses them. The -Wall and -Werror flags are enabled because they always should be. This first line produces a fairly normal .o file.
The second line produces the .so file that we will work with from python.

Compiling on Windows
It’s a bit different on Windows, and I’m not sure if this is the best way (Windows is not my platform of choice), but it worked. I ran the following:


gcc -c -Wall -Werror -fPIC -DBUILDING_EXAMPLE_DLL simple.c
gcc -shared -o simple.dll simple.o -Wl,--out-implib,libsimple_dll.a

I am not sure why, but without the extra options on the second line it was producing invalid DLLs. The -Wl, prefix hands --out-implib to the linker, which then also writes an import library (libsimple_dll.a). That .a file is only needed for linking other C programs against the DLL, so it isn’t needed here (indeed, you can delete it and it still runs).

Running it in python

import ctypes
import platform
import os

# Discover the OS
folder = os.path.split(os.path.realpath(__file__))[0]
if platform.system() == 'Linux':
    _accel_path = os.path.join(folder, "simple.so")
    print("Detected Linux: ", end='')
elif platform.system() == 'Darwin':
    _accel_path = os.path.join(folder, "simple.so")
    print("Detected Mac OSX: ", end='')
else:
    _accel_path = os.path.join(folder, "simple.dll")
    print("Detected Windows: ", end='')
    
#Load the library
print("Loading C functions from {}".format(_accel_path))
ACCELERATOR = ctypes.cdll.LoadLibrary(_accel_path)

# Run our function
ACCELERATOR.test()

Whowhee. So first we had to figure out which OS we’re running on, and then load the correct library. Finally, we can call the function.

***Function arguments
All useful functions I can think of make use of function arguments. And so we need to know how to pass things into or out of the functions. Fortunately the ctypes module allows us to do this, as well as working with structs and arrays.

If we had the function definition:


DLLSPEC void test(int a)

Then in python we can just use:

ACCELERATOR.test(5)

And something somewhere takes care of the magic for us. But for more complex things we have to declare their types from python. So if I were doing:
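As a rough sketch of what that "magic" involves: by default ctypes converts a python int into a C int, but you can spell the signature out through argtypes and restype so that a wrong argument fails in python rather than somewhere inside C. The example below uses abs() from the C standard library as a stand-in for simple.so. The load_libc helper is hypothetical, and it assumes either a POSIX system or a Windows install where the C runtime is reachable as msvcrt.

```python
import ctypes
import os


def load_libc():
    """Hypothetical helper: get a handle on the C standard library."""
    if os.name == "posix":
        # Passing None exposes the symbols already loaded into the process,
        # which includes libc
        return ctypes.CDLL(None)
    # Assumption: on Windows the C runtime is available as msvcrt
    return ctypes.cdll.msvcrt


libc = load_libc()

# Declare the C signature: int abs(int)
libc.abs.argtypes = [ctypes.c_int]
libc.abs.restype = ctypes.c_int

result = libc.abs(-5)

# With argtypes set, a bad argument raises in python instead of reaching C
try:
    libc.abs("not a number")
    rejected = False
except ctypes.ArgumentError:
    rejected = True
```

The same two attributes can be set on ACCELERATOR.test before calling it.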


DLLSPEC void test(int* someArray, int arrayLength)

Then in python you’d have to use:


array_type = ctypes.c_int * 3
array = array_type(0, 1, 2)
ACCELERATOR.test(array, len(array))
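The array above has a hard-coded length of 3. A minimal sketch of the same idea for a python list of any length follows; the names values, IntArray and c_values are only for illustration.

```python
import ctypes

values = [3, 1, 4, 1, 5, 9]

# Build an array type sized to the data, then copy the values into it
IntArray = ctypes.c_int * len(values)
c_values = IntArray(*values)

# len() and indexing work as they do on a python sequence
length = len(c_values)
first = c_values[0]
round_trip = list(c_values)

# The call itself would then look like this, assuming the
# test(int* someArray, int arrayLength) definition from above:
# ACCELERATOR.test(c_values, len(c_values))
```

Note that IntArray(*values) copies every element, which is the overhead to keep an eye on for large inputs.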

And here you start seeing the overhead come to bite. Will transferring all the data into a c-type array be faster than just doing the operation in python? In my case in BGMC26, I was working with the output from a camera, and I had it in a python bytearray. As such I could run:


# from_buffer is called on the array type; the result shares memory with some_buffer
array = array_type.from_buffer(some_buffer)

And the data never went through python at all - both python’s bytearray and C’s arrays are just pointers at the start of the data and some information about their type. As such, there was minimal overhead as the values were never looked at.
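To make the "no copy" point concrete, here is a small sketch that needs no compiled library at all. The bytearray is a stand-in for the camera output, and its size (4 RGBA pixels) is an assumption made for the example.

```python
import ctypes

# Stand-in for the camera output: 4 pixels of RGBA, one byte per channel
some_buffer = bytearray(4 * 4)

# Wrap the buffer without copying it; the array type must not be
# larger than the buffer it is created from
ByteArray = ctypes.c_ubyte * len(some_buffer)
c_view = ByteArray.from_buffer(some_buffer)

# Writing through the ctypes view changes the bytearray,
# which shows that both share the same memory
c_view[0] = 255
shared = some_buffer[0] == 255

# And the other way round
some_buffer[1] = 7
seen_from_c = c_view[1]
```

One side effect worth knowing: while c_view exists, python refuses to resize some_buffer.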

In typical C fashion, you can use arrays passed into the function arguments as outputs - which is exactly what I did in the BGMC game. I could have returned a struct, but that would have required working with structs. Working with structs is possible, and is documented in the ctypes documentation - but I did not need to do it so I didn’t.
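For anyone who does want structs, a minimal sketch of what the python side could look like is below. The Pixel struct and the process function are hypothetical and are not part of the BGMC code; the sketch only shows the ctypes.Structure mechanics.

```python
import ctypes


class Pixel(ctypes.Structure):
    """Hypothetical struct, mirroring
    struct Pixel { unsigned char r, g, b, a; }; on the C side."""
    _fields_ = [
        ("r", ctypes.c_ubyte),
        ("g", ctypes.c_ubyte),
        ("b", ctypes.c_ubyte),
        ("a", ctypes.c_ubyte),
    ]


# Two RGBA pixels as raw bytes
raw = bytearray([255, 0, 0, 255, 0, 128, 0, 255])

# View the same bytes as an array of Pixel structs, again without copying
PixelArray = Pixel * (len(raw) // ctypes.sizeof(Pixel))
pixels = PixelArray.from_buffer(raw)

second_green = pixels[1].g

# A C function taking (struct Pixel*, int) could be declared like this
# (process is a made-up name):
# ACCELERATOR.process.argtypes = [ctypes.POINTER(Pixel), ctypes.c_int]
```

The field order and types in _fields_ have to match the C struct exactly, otherwise both sides read different bytes.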

Thanks for the resource !


# ./loader.py
import ctypes
import platform
import os

__all__ = ['loadDLL']

def loadDLL(relpath):

    # Discover the OS
    folder = os.path.split(os.path.realpath(__file__))[0]
    if platform.system() == 'Linux':
        _accel_path = os.path.join(folder, relpath + ".so")
        print("Detected Linux: ", end='')
    elif platform.system() == 'Darwin':
        _accel_path = os.path.join(folder, relpath + ".so")
        print("Detected Mac OSX: ", end='')
    else:
        _accel_path = os.path.join(folder, relpath + ".dll")
        print("Detected Windows: ", end='')
        
    # Load the library
    print("Loading C functions from {}".format(_accel_path))
    return ctypes.cdll.LoadLibrary(_accel_path)

Then just:


import loader

myDLL = loader.loadDLL("simple")
myDLL.test()

I didn’t look much further, but there might be a way to refresh Python’s cache, or to do more with this. Still, it should be enough for most simple cases.
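One possible workaround for the caching, offered as an untested-in-BGE sketch: copy the freshly compiled library to a unique temporary path and load the copy, so the loader does not recognise it as the library it already has. Both function names are hypothetical, and the old copies stay loaded until blender exits.

```python
import ctypes
import os
import shutil
import tempfile


def copy_to_unique_path(lib_path):
    """Copy a shared library to a fresh temporary file and return the new path."""
    suffix = os.path.splitext(lib_path)[1]  # keep the .so / .dll extension
    fd, tmp_path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)
    shutil.copy2(lib_path, tmp_path)
    return tmp_path


def load_fresh(lib_path):
    """Load a private copy so that an earlier load of lib_path is not reused."""
    return ctypes.CDLL(copy_to_unique_path(lib_path))
```

Usage would be along the lines of myDLL = load_fresh(os.path.join(folder, "simple.so")) after each recompile. The temporary copies are not cleaned up here.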


EDIT:

Also found this: https://docs.python.org/3.5/library/ctypes.html#ctypes.CDLL
Which seems to be usable like so: https://stackoverflow.com/a/42523860/7983255


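from ctypes import CDLL, POINTER, c_int

# CA and ca_array are not defined in this snippet; they are assumed to come from the linked answer
dll = CDLL('test')
dll.myfunc.argtypes = POINTER(CA),c_int
dll.myfunc.restype = None

dll.myfunc(ca_array,len(ca_array))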

I wonder if this ctypes.CDLL adds the correct extension based on OS, gotta try things this weekend.

Now I need a tutorial on converting python to C

Step 1: Learn C
Step 2: Fight the compiler (lost semicolon)
Step 3: Fight segfaults (null pointers or dodgy indexing)
Step 4: Learn C some more
Step 5: Hack around with makefiles and build systems
Step 6: Wonder why you weren’t writing in python
Step 7: Fight the compiler on a different OS
Step 8: Add another couple C files and bring in some libraries
Step 9: Fight the linker
Step 10: Wonder why it doesn’t run on anyone else’s computer
Step 11: Try to figure out why your program causes windows to BSOD

Seriously, dynamic languages make your life a couple thousand times easier.
As per normal, the first step in porting is to learn the language you’re porting into. There’s no easy “convert” button.

Actually, if I were using standard py there is Nuitka, which is supposed to be 250% faster than CPython.
With something like this (designed to compile py libs and link against blender guts) we could accelerate anything quite easily. - http://nuitka.net/pages/overview.html

side issue - https://developer.blender.org/D2835

this should help also ?

I doubt it will work on embedded python, it probably only works on purely Python modules. Not sure though.