How to correctly analyse and profile your game

Hello everyone.
Recently I’ve noticed a lot of similar threads asking about the cause of slowdown at runtime.
Seasoned users can normally deduce the cause fairly quickly, but newer users are evidently unaware of the process they ought to follow to reach the same conclusion.

Below is a rough step-by-step process for finding and correcting slowdown in games.

Causes of slowdown:

  • Physics
    • Complex meshes used in the Physics pipeline
    • Complex meshes used in the Render pipeline
    • Using replaceMesh() on complex geometry (in Python or the Edit Object actuator)
    • Large numbers of Collision, Touch, Near, Mouse or Radar sensors, particularly for complex geometry
  • Logic
    • Nested loops (e.g. for loops) in the Python controllers
  • Animations
    • Complex meshes deformed by bones
    • Large numbers of bones in an armature
    • Large numbers of armatures (beyond twenty can become slow on many systems)
  • SceneGraph
    • Heavy interaction with the scenegraph (such as adding or removing large numbers of objects during a single frame)
  • Rasterizer
    • Large numbers of objects in the camera view which aren’t hidden by occluders
    • Large textures, particularly those whose dimensions are not powers of two (2 ** n)

Finding issues:

  • With the render mode set to Blender Game, select the Game menu dropdown.
  • Enable “Show framerate and Profile”.
  • Look at the progress bar and the time (in ms) for every entry.
  • The largest progress bar marks the slowest entry, but the bars are only relative to the other entries.
  • The time is in milliseconds. When the cumulative (total) time of all the systems per frame exceeds about 16.7 ms (1000 ms / 60), the framerate drops below 60 Frames Per Second.
  • Using the above points, isolate the slowest entry and look at the Causes of slowdown list above to find the cause.
  • Use the list below to solve the issue.
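
To make the budget arithmetic concrete, here is a minimal sketch in plain Python. The category names mirror the profiler entries, but the millisecond values are invented for illustration, and `frame_budget_ms` and `analyse` are hypothetical helper names, not part of the BGE. The estimated framerate also ignores vsync and any frame cap, so treat it as a rough guide only.

```python
def frame_budget_ms(target_fps=60):
    """Time available per frame, in milliseconds, for a target framerate."""
    return 1000.0 / target_fps


def analyse(profile_ms, target_fps=60):
    """Summarise one frame of profiler readings (category -> milliseconds)."""
    total = sum(profile_ms.values())
    budget = frame_budget_ms(target_fps)
    return {
        "total_ms": total,
        "budget_ms": budget,
        "over_budget": total > budget,
        # The entry with the largest time is the first one to investigate.
        "slowest": max(profile_ms, key=profile_ms.get),
        "estimated_fps": 1000.0 / total if total > 0 else float("inf"),
    }


# Made-up readings copied by hand from the on-screen profile.
sample = {
    "Physics": 9.0,
    "Logic": 4.0,
    "Animations": 2.0,
    "Scenegraph": 1.0,
    "Rasterizer": 4.0,
}
report = analyse(sample)
print(report["slowest"], report["total_ms"], report["over_budget"])
```

With these sample numbers the frame takes 20 ms in total, which is over the 60 FPS budget, and Physics is the entry to look at first.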

Solving issues:

  • Physics
    • Reduce the number of polygons in the mesh. Normal maps help retain some illusion of detail at a higher framerate.
    • Use simple “collision meshes” for highly detailed objects. The physics mesh should be an approximation of the “render mesh”.
      • Set the render mesh to “No Collision”.
      • Choose an appropriate physics type and bounds for the collision mesh.
      • Parent the render mesh to the physics mesh.
      • Apply all movement and collision logic to the physics mesh.
    • Share Physics sensors (listed in the causes list) where possible. Use Python to read the objects interacting with the sensors. For mouse over sensors, read the hitObject from one shared sensor rather than using a separate sensor per object.
  • Logic
    • Remove as much logic as possible from the inner loops.
    • Use generators / iterators instead of creating lists where possible.
    • Try to use caching / lazy loading wherever possible.
    • Prefer dictionaries and sets over lists for lookups and removals: membership tests and removing an item by key are faster than with a list, because hashing the key is fast whereas a list has to be searched. Plain iteration and appending are not faster than with lists.
  • Animations
    • Simplify animation meshes where possible.
    • Use hitboxes parented to the bones rather than the animation mesh for collision. (You can use reinstancePhysicsMesh() to update the collision information each frame, but it’s very slow.)
    • Stopping animations once they are out of the frustum appears to benefit the framerate when applied correctly.
    • Reduce the number of bones where possible.
    • Use the BGE IK solver.
  • SceneGraph
    • Avoid adding/removing large numbers of objects each frame. Spread the work over n frames where needs demand.
  • Rasterizer
    • Join all separate meshes where possible (be careful with objects that need to move / collide separately). This is because batching isn’t intelligent within the BGE at present (Moerdn’s patches should help with this in future).
    • Make all texture dimensions powers of two (I believe this has been fixed, so perhaps check). However, loading is still slower for other sizes because they need to be converted.
    • Use the Occluder Physics type to omit geometry from the render pipeline. Be aware that many occluders can have a detrimental effect upon the render pipeline. Change the precision in the Scene panel to fit your needs.
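
As an illustration of the SceneGraph advice about spreading additions over n frames, here is a minimal sketch in plain Python. `SpawnQueue` and its parameters are hypothetical names invented for this example; the `spawn` callback is an assumption, and in a real game it would typically wrap whatever call actually adds the object to the scene.

```python
from collections import deque


class SpawnQueue:
    """Queue spawn requests and release at most `per_frame` of them per frame."""

    def __init__(self, spawn, per_frame=5):
        self._spawn = spawn          # callback that really creates the object
        self._per_frame = per_frame  # upper limit on spawns in a single frame
        self._pending = deque()

    def request(self, *args):
        """Ask for an object to be spawned on some later frame."""
        self._pending.append(args)

    def update(self):
        """Call once per frame. Returns how many objects were spawned."""
        count = min(self._per_frame, len(self._pending))
        for _ in range(count):
            self._spawn(*self._pending.popleft())
        return count

    def __len__(self):
        return len(self._pending)


# Stand-in for the real spawn call: just record the requested names.
spawned = []
queue = SpawnQueue(spawned.append, per_frame=5)

for i in range(12):
    queue.request("Bullet%d" % i)

# Simulate four frames of the game loop.
frames = [queue.update() for _ in range(4)]
print(frames)
```

Twelve requests made in one frame are released over three frames instead of all at once. The right value for `per_frame` depends on the objects involved, so it is best found with the profiler.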

Hopefully that’s just about everything covered. If anyone has any additions to make that are general enough to warrant adding to this list, please leave a comment.

Regards, Angus.

Excellent thread idea! I’m sure this will accelerate the workflow of many newcomers, whilst reducing the bloat on the game engine forums. Good job!

PS: I vote for sticky.

Such threads do not become stickies, they are placed in resources … done

Agoose: Good thread, well done!
Reminds me of the best-practices threads ;).

Also remember to disable vsync for the most accurate results from the profiler.

Is there a way to see how much VRAM I have committed to the BGE?
RAM?
CPU threads?

For VRAM, try out gDEBugger; it can tell you a lot about your OpenGL usage. As for RAM and CPU usage, your OS should have a tool for that (Task Manager (Windows), top (Linux), System Activity (KDE), etc.).

Parenting to bones can be much faster if you remove IK from the skeleton.
So first make your animations using IK on the feet and hands, then go back and add keyframes for the visible transformations of the thighs, forearms, etc. Finally, remove the IK modifier. Now if you parent an object to any of the bones, it will have less impact on the scenegraph.

“Use the BGE IK solver”

I’m not sure what you mean by this.

One other thing which can cause logic slowdown is using MP3 sounds. This causes a big drain over WAV files when you have a sound which is played several times a second such as a machine gun sound which is added every time you fire the gun. The MP3 in this case is perhaps 5 times slower than using a WAV file.

To explain, when using a gun in game I usually have it add an object with a sound attached. Each time you fire, the object is added and the sound plays. This works well when you are using a gun sound which has a play length longer than the interval between shots.

However, if this sound is an MP3 the logic counter says 0.37ms, while with a WAV it’s only 0.08ms. That may not seem like a lot, but it also includes all the logic for shooting, including rays, object placement, material detection and stuff like reloading, ammo count, etc.
(Without sounds it uses 0.02ms of processing time.)
If I have 10 enemies we should multiply that 0.37ms by ten, and if I’m using my laptop we have to start looking at a really noticeable slowdown.
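
Written out as a quick calculation, the numbers above look like this. It is only a rough sketch: `logic_cost` is a hypothetical helper, and it assumes the worst case where all ten enemies fire in the same frame and the target is 60 FPS.

```python
FRAME_BUDGET_MS = 1000.0 / 60  # about 16.7 ms per frame at 60 FPS


def logic_cost(per_shooter_ms, shooters):
    """Return (total ms, percentage of the frame budget) for one frame."""
    total = per_shooter_ms * shooters
    return total, 100.0 * total / FRAME_BUDGET_MS


# Logic times quoted in the post, scaled to ten enemies.
mp3_total, mp3_share = logic_cost(0.37, 10)
wav_total, wav_share = logic_cost(0.08, 10)

print("MP3: %.1f ms, %.0f%% of the frame" % (mp3_total, mp3_share))
print("WAV: %.1f ms, %.0f%% of the frame" % (wav_total, wav_share))
```

Under those assumptions the MP3 version uses roughly a fifth of the frame on shooting logic alone, while the WAV version uses about a twentieth.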

When you start profiling, you may be tempted to let a 0.29ms difference go by, thinking it doesn’t make a difference, but there are hundreds of parts to a game and if you can shave a little time off of each part then you’re going to get a much more playable game.

Ah, I found the BGE IK solver and it does run a little bit faster.