Replacing BGE memory allocation functions

Edit: I meant to post this in the BGE forum :expressionless:

Hi, while trying to debug memory problems I noticed that you can replace C++'s default allocators (operator new and delete) with different ones.

For people who are not so much into development: when the BGE creates new game objects, scenes, meshes, etc., it currently asks the operating system for memory and frees it when it's done.
This can be slow, especially if you do many small allocs and frees.

This is an area where Python has its own allocator that's optimized for continuous small allocations and frees, so I thought I'd try replacing the BGE's allocator with Python's.
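
To illustrate why a small-object allocator can help, here is a minimal sketch of a fixed-size block pool that recycles freed blocks instead of handing them back to the system each time. It is a toy, not how Python's allocator is actually implemented, and the names `SmallBlockPool` and `churn` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical illustration only: a pool of equally sized blocks backed by a
// free list. Python's real allocator is considerably more elaborate.
class SmallBlockPool {
    struct Node { Node *next; };

public:
    explicit SmallBlockPool(std::size_t block_size)
        : m_block_size(block_size < sizeof(Node) ? sizeof(Node) : block_size),
          m_free(nullptr), m_os_calls(0) {}

    SmallBlockPool(const SmallBlockPool &) = delete;
    SmallBlockPool &operator=(const SmallBlockPool &) = delete;

    ~SmallBlockPool() {
        // Hand every cached block back to the system allocator.
        while (m_free) {
            Node *next = m_free->next;
            std::free(m_free);
            m_free = next;
        }
    }

    void *allocate() {
        if (m_free) {                 // fast path: reuse a recycled block
            Node *n = m_free;
            m_free = n->next;
            return n;
        }
        ++m_os_calls;                 // slow path: go to the system allocator
        void *p = std::malloc(m_block_size);
        if (!p) throw std::bad_alloc();
        return p;
    }

    void release(void *p) {
        if (!p) return;
        Node *n = static_cast<Node *>(p);  // the freed block stores the list link
        n->next = m_free;
        m_free = n;
    }

    // Number of times the pool had to call malloc.
    std::size_t os_calls() const { return m_os_calls; }

private:
    std::size_t m_block_size;
    Node *m_free;
    std::size_t m_os_calls;
};

// Simulates repeatedly "adding" and "ending" one small object.
std::size_t churn(SmallBlockPool &pool, int rounds) {
    assert(rounds >= 0);
    for (int i = 0; i < rounds; ++i) {
        void *p = pool.allocate();
        pool.release(p);
    }
    return pool.os_calls();
}
```

With a pool of 64-byte blocks, calling `churn(pool, 1000)` reaches the system allocator only once; every later round reuses the same block. Whether that translates into a visible framerate gain depends on how good the platform's own malloc already is.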

I only got a small speedup, ~94fps -> ~98fps on YoFrankie level1home, but I think it would depend on whether your scenes use addObject much.

Patch against trunk, including the scons build files. Most BGE types subclass from the class patched below, so many mallocs will be redirected to use these functions.

Index: source/gameengine/SceneGraph/SConscript
===================================================================
--- source/gameengine/SceneGraph/SConscript    (revision 22289)
+++ source/gameengine/SceneGraph/SConscript    (working copy)
@@ -5,6 +5,7 @@
 sources = env.Glob('*.cpp') #'SG_BBox.cpp SG_Controller.cpp SG_IObject.cpp SG_Node.cpp SG_Spatial.cpp SG_Tree.cpp'
 
 incs = '. #intern/moto/include'
+incs += ' ' + env['BF_PYTHON_INC']
 
 cxxflags = []
 if env['OURPLATFORM']=='win32-vc':
Index: source/gameengine/SceneGraph/SG_DList.h
===================================================================
--- source/gameengine/SceneGraph/SG_DList.h    (revision 22289)
+++ source/gameengine/SceneGraph/SG_DList.h    (working copy)
@@ -29,6 +29,7 @@
 #ifndef __SG_DLIST
 #define __SG_DLIST
 
+#include "../Expressions/KX_Python.h" // for PyMem_MALLOC and PyMem_FREE
 #include <stdlib.h>
 
 /**
@@ -41,6 +42,12 @@
     SG_DList* m_blink;
 
 public:
+    
+    /* replace allocators */
+    void *operator new( size_t num_bytes) { return PyMem_MALLOC(num_bytes); }
+    void operator delete( void *mem ) { PyMem_FREE(mem); }    
+
+
     template<typename T> class iterator
     {
     private:
Index: source/gameengine/Rasterizer/RAS_OpenGLRasterizer/SConscript
===================================================================
--- source/gameengine/Rasterizer/RAS_OpenGLRasterizer/SConscript    (revision 22289)
+++ source/gameengine/Rasterizer/RAS_OpenGLRasterizer/SConscript    (working copy)
@@ -7,7 +7,9 @@
 incs += ' #source/blender/gpu #extern/glew/include ' + env['BF_OPENGL_INC']
 incs += ' #source/blender/gameengine/Ketsji #source/gameengine/SceneGraph #source/blender/makesdna #source/blender/blenkernel'
 incs += ' #intern/guardedalloc #source/blender/blenlib'
+incs += ' ' + env['BF_PYTHON_INC']
 
+
 cxxflags = []
 if env['OURPLATFORM']=='win32-vc':
     cxxflags.append ('/GR')
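
The reason patching one base class redirects so many allocations is that class-level operator new and operator delete are inherited by subclasses. Here is a standalone sketch of that effect. To build without the Python headers it uses stand-in functions (`demo_malloc`, `demo_free`) where the patch uses PyMem_MALLOC and PyMem_FREE, and the class names are hypothetical. It also takes `std::size_t` and throws `std::bad_alloc` on failure, which is an addition of this sketch rather than something the patch does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Stand-ins for PyMem_MALLOC / PyMem_FREE (assumption made for this sketch).
// They count live blocks so the redirection can be observed.
static int g_live_blocks = 0;

static void *demo_malloc(std::size_t n) {
    void *p = std::malloc(n ? n : 1);
    if (p) ++g_live_blocks;
    return p;
}

static void demo_free(void *p) {
    if (p) {
        --g_live_blocks;
        std::free(p);
    }
}

// Hypothetical base class playing the role of the patched class.
class AllocBase {
public:
    static void *operator new(std::size_t num_bytes) {
        void *mem = demo_malloc(num_bytes);
        if (!mem) throw std::bad_alloc();  // operator new must not return NULL
        return mem;
    }
    static void operator delete(void *mem) { demo_free(mem); }

    virtual ~AllocBase() {}
};

// A subclass that declares no allocators of its own still uses the base ones.
class GameObjectDemo : public AllocBase {
public:
    float position[3];
};

int live_after_create_and_destroy() {
    AllocBase *obj = new GameObjectDemo();  // goes through AllocBase::operator new
    assert(g_live_blocks == 1);
    delete obj;                             // goes through AllocBase::operator delete
    return g_live_blocks;
}
```

Calling `live_after_create_and_destroy()` returns 0: the subclass object was both allocated and freed through the base class functions. Note that only objects created with `new` are affected; arrays (`new[]`) and members allocated by other means are not redirected by these two overloads.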

Very interesting.

Would this mean that scripted particle effects that add and remove many billboards would suffer much less framerate overhead? I'm assuming that adding/removing billboards is just the kind of small addObject/endObject churn that fits your description.

Also, a game involving, say, shooting down a large number of smaller objects (eg. 3d space invaders, R-Type clone, etc.) would also benefit from this, right?

Yep, that's the idea. If anyone wants to make a test scene that adds/removes 1000s of objects a second, it would be a good test case to see how much better Python's allocator is at handling memory thrashing. :slight_smile:

I should have added: this compares Python's allocator to the OS's, so results will vary between Linux, Windows and OS X.

*Updated the patch to use a class that overrides more allocs.*