How many times when following a game project have you heard something like: “It runs poorly, but it’s early in the project, and we’ll optimise later”? I’ve heard it lots of times, including from quite experienced teams. But you know what? I’ve never seen that optimisation arrive. Hopefully this post will give you some ideas about what to do if your game isn’t performing well. While this was written with BGE in mind, the advice applies to most engines.
Any computer science graduate who knows a bit of management can tell you that the idea that “optimizing later” can be done is … a lie.
Management student: The later in a project you make a big change, the more expensive it is to make the change.
Computer science student: Optimization comes predominantly from changing the algorithm, not from lots of tiny improvements.
Add those together and you realise that optimization typically requires large architectural changes, which will be harder to make later in the project. Examples:
Our AI used to run everything in one frame, but it’s too slow
How the heck do I split it over multiple frames?
The VRAM usage is way too high, but we need high res textures for the close up detail
How the heck do I stream textures in and out?
Implementing texture streaming may be easy at the beginning of a project, but to shoehorn it in at the end is downright impossible. And if you write an AI that can run over multiple frames, it’s trivial to make it run in one. But to split it up is hard.
If you solve the problem now, then you’ll only have one AI to port to the new system and only a few models to change. If you do it at the end of the game you’ll have to rewrite all your AI’s and redo all your textures.
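To make the AI point concrete, here is a minimal sketch of an AI update loop that is written to be spread over frames from the start. It is plain Python rather than BGE API: `Agent`, `think` and `TimeSlicedScheduler` are hypothetical names, and the per-frame budget is counted in agents instead of milliseconds to keep things simple.

```python
from collections import deque


class Agent:
    """Hypothetical AI agent; think() stands in for expensive decision logic."""

    def __init__(self, name):
        self.name = name
        self.think_count = 0

    def think(self):
        self.think_count += 1


class TimeSlicedScheduler:
    """Updates at most `budget` agents per frame, in round-robin order."""

    def __init__(self, agents, budget):
        self.queue = deque(agents)
        self.budget = budget

    def update(self):
        """Call once per frame. Returns the agents updated this frame."""
        updated = []
        for _ in range(min(self.budget, len(self.queue))):
            agent = self.queue.popleft()
            agent.think()
            self.queue.append(agent)  # back of the line for a later frame
            updated.append(agent)
        return updated


agents = [Agent("agent%d" % i) for i in range(10)]

# Spread over frames: 3 agents per frame, so every agent gets a turn
# within 4 frames.
scheduler = TimeSlicedScheduler(agents, budget=3)
for frame in range(4):
    scheduler.update()

# Running everything in one frame is just a budget equal to the agent count.
all_at_once = TimeSlicedScheduler(agents, budget=len(agents))
all_at_once.update()
```

Going from the sliced version to the single-frame version is a one-line change of budget. Going the other way, from a loop that assumes every agent runs every frame, usually means touching every agent.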
So what does “we’ll optimize later” mean?
Whenever I hear that expression, I read it as one of:
- We’d like it to run better, but can’t be bothered
- I can’t find any profiling tools and have no idea why my game is so slow
- I thought if I just threw some models and logic together, it would just work. What is an NP-complete problem? Who’s the travelling salesman?
If you’ve optimised all you can, and it’s still slow, then maybe you’re barking up an NP-complete problem and there is no solution. But otherwise I equate the choice to optimize later with laziness and unfamiliarity with programming. If your game runs slow on your target hardware, you’re stopping people from being able to play your game. I’d call that the most important thing in your game. Regardless of how good it looks or how fun it might be, if they can’t play it at a sensible frame rate, they can’t experience any of that.
***So what do you do when your game is running slow?***
Expect it to happen
Plan the optimization systems before you even start. If you know you’re going to need a lot of geometry, start out with an LOD system. If you know you’re going to have 1000’s of AI’s, figure out some way to deal with that before you even begin. Plan these systems rather than hack them in later.
In particular, plan a way to identify what is running slow. Build some way to profile your scripts. Make it easy to drop models in/out so you can identify if they are the issue.
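As one possible way to build that profiling in from day one, here is a small sketch of a timing helper you could wrap around sections of your scripts. It uses only the Python standard library; `timed`, `report` and `update_ai` are hypothetical names made up for this example, not part of BGE.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated seconds and call counts per label.
_totals = defaultdict(float)
_calls = defaultdict(int)


@contextmanager
def timed(label):
    """Accumulate wall-clock time spent inside the with-block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        _totals[label] += time.perf_counter() - start
        _calls[label] += 1


def report():
    """Return (label, total_seconds, calls) tuples, slowest first."""
    rows = [(label, _totals[label], _calls[label]) for label in _totals]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def update_ai():
    # Stand-in for real game logic.
    return sum(i * i for i in range(10000))


for frame in range(5):
    with timed("ai"):
        update_ai()
    with timed("hud"):
        pass  # stand-in for cheap work

for label, seconds, calls in report():
    print("%-10s %8.3f ms over %d calls" % (label, seconds * 1000.0, calls))
```

Because the labels are free-form strings, the same helper can wrap a whole script or a single suspicious loop, which makes it cheap to leave in place for the whole project.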
Solve it as soon as it’s an issue
As soon as your game isn’t performing as well as you hope, figure out why and fix it. The earlier you fix it, the easier it will be and the less rework of models/scripts will be required.
***Finding what the issue is***
The first step to solving an optimization issue is finding out what is causing your game to run slow, and the place to start is BGE’s profiler. Generally you’ll see one of:
- The physics is really high
- The logic is really high
- The rasterizer is really high
- The GPU latency is really high
- The animations are really high
I’ll briefly pass through these in turn. This isn’t a be-all end-all list of issues and solutions, but hopefully it will give you things to google and pointers about what may be going wrong.
The Animations are really high
Animations are dependent on:
- The polycount of your model
- Any constraints attached.
So to get it running faster, reduce the polycount, the number of actors using animations (ie don’t play animations for actors far away) or reduce the complexity of the constraints.
The Rasterizer is really high
You’ve probably got too many objects in your scene and need some sort of LOD system that removes them when they’re too far away.
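The core of such an LOD system is just a distance check. The sketch below is plain Python under stated assumptions: the `LOD_LEVELS` table, the mesh names and `pick_lod` are all hypothetical, and how you actually swap a mesh or hide an object depends on your engine.

```python
import math

# Hypothetical LOD table: (maximum distance, mesh name), nearest first.
LOD_LEVELS = [
    (20.0, "tree_high"),
    (60.0, "tree_mid"),
    (150.0, "tree_low"),
]


def distance(a, b):
    """Euclidean distance between two points given as coordinate sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pick_lod(camera_pos, object_pos, levels=LOD_LEVELS):
    """Return the mesh name to use at this distance, or None to hide the object."""
    d = distance(camera_pos, object_pos)
    for max_distance, mesh_name in levels:
        if d <= max_distance:
            return mesh_name
    return None  # beyond the last level: don't draw it at all


camera = (0.0, 0.0, 0.0)
for position in [(10.0, 0.0, 0.0), (0.0, 50.0, 0.0), (500.0, 0.0, 0.0)]:
    print(position, "->", pick_lod(camera, position))
```

The same check can gate other per-object costs, such as skipping animation updates for actors that are past the last level.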
The GPU Latency is really high
GPU’s are super fast. Even an integrated card can add numbers together faster than the CPU. However there are a few caveats:
- VRAM access is slow (i.e. texture samples), so reduce the number of textures per material. In particular, be aware of “heightmaps”: they take three texture samples, but a normal map takes only one. An Intel integrated card can only do 5-6 samples per pixel at 1080p.
- Boolean logic (if/else and case/switch statements) is really slow. This is due to the parallel architecture of GPUs. Multiplying by zero and adding anyway can be faster than an if/else.
If your game was running at 60 FPS, and then you added a new model and now it’s at 5 FPS with really high GPU latency, you’ve probably run out of VRAM, and the system is swapping textures in and out of system memory. Textures take a lot of space in a GPU’s VRAM. An uncompressed 1024x1024 texture with four 8-bit channels occupies about 4 MB of VRAM (roughly a third more with mipmaps), and a 4096x4096 one about 64 MB. The file size on disk doesn’t matter; this is what it decompresses to on the GPU. As a result, if you only have 386 MB of VRAM (i.e. an Intel integrated card), fewer than a hundred textures at 1024x1024, or about six at 4096x4096, will fill it, and your game will run super slow. The solution? Reduce your image pixel counts.
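A rough way to budget texture memory is to work it out from the pixel count rather than the file size. The sketch below assumes uncompressed textures with four 8-bit channels and a full mipmap chain; GPU-compressed formats are smaller and drivers may add padding, so treat the result as an estimate. `texture_bytes` is a hypothetical helper written for this example.

```python
def texture_bytes(width, height, channels=4, bytes_per_channel=1, mipmaps=True):
    """Estimate the VRAM used by one uncompressed texture, in bytes."""
    total = 0
    w, h = width, height
    while True:
        total += w * h * channels * bytes_per_channel
        if not mipmaps or (w == 1 and h == 1):
            break
        # Each mipmap level halves both dimensions, down to 1x1.
        w, h = max(1, w // 2), max(1, h // 2)
    return total


MIB = 1024 * 1024

for size in (512, 1024, 2048, 4096):
    print("%4dx%-4d %7.2f MiB" % (size, size, texture_bytes(size, size) / MIB))
```

Halving a texture’s width and height cuts its footprint to a quarter, which is why reducing resolution is such an effective fix.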
I believe vertices also reside in VRAM, but we tend to be better at counting them. A 2048x2048 pixel texture takes up about the same VRAM as a 1 million poly model… But a million poly model will run slowly due to all the matrix math, so go easy on the polycount if your GPU latency is high.
There are tonnes of tools for analysing GPU usage. Go pick one and find out what your bottleneck is. In my experience, unless you have more than a few million polygons, you can be pretty confident that it’s texture sampling or VRAM usage slowing you down. GPUs can add vectors and matrices together at an absolutely mind-boggling speed.
Logic is really high
- Don’t use the ‘near’ or ‘radar’ sensor, they are super slow - as they check … per vertex. Even ray casting is pretty slow. If you’re doing 100 ray casts in a grid, consider rendering out a depth texture and sampling it. Rasterizing is faster than raytracing.
- Run a profiler on your scripts.
- Be aware that dynamic textures (eg ImageMirror, ImageFFMPEG) all appear inside logic because they run inside the script. Also be aware that you calculate animations per render camera, not per frame. So if you have three cameras, each animation is getting applied three times and all appearing inside the logic graph.
- Look for disk IO, which is super duper slow on Windows, but tolerable on Linux.
- Be aware of NP-complete problems, and don’t be afraid to discard “optimal” for “good enough”
- Be aware of cache-thrashing if you have large arrays (particularly 2D ones). Sampling a list in order is much faster than sampling it randomly.
- cProfile in python is really really useful if you have a single entry point (ie only one or two Always -> Python logic bricks)
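For the cProfile suggestion, a minimal sketch of profiling a single entry point might look like this. `main`, `pathfind` and `update_hud` are hypothetical stand-ins for your own script; in a real game you would enable the profiler at startup and dump the stats on exit instead of looping over fake frames.

```python
import cProfile
import io
import pstats


def pathfind():
    # Stand-in for an expensive routine.
    return sum(i % 7 for i in range(200000))


def update_hud():
    # Stand-in for a cheap routine.
    return "score: %d" % 42


def main():
    """Hypothetical single entry point, called once per frame."""
    pathfind()
    update_hud()


profiler = cProfile.Profile()
profiler.enable()
for frame in range(10):
    main()
profiler.disable()

# Collect the ten most expensive functions, sorted by cumulative time.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(10)
report_text = stream.getvalue()
print(report_text)
```

Sorting by cumulative time shows which top-level call is eating the frame; sorting by `"tottime"` instead points at the individual function doing the work.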
Physics is really high
- In order, use: Sphere, Capsule, Cylinder, Cone, convex hull, triangle mesh. I’ve left out box because in my experience, it is super unstable.
- I’ve never had issues with constraints appearing here (and I’ve run up to a hundred or so at a time), but I suppose it may be an issue
- Reduce the resolution of a convex hull/triangle mesh primitive
- Set all objects that you can to “No Collision”
- Don’t use Ghost, use collision layers. Intersecting Ghost objects eat huge amounts of processing power.
- Use big margins
- Watch this video on convex decomposition and get rid of those triangle meshes.
- Don’t compile shaders each frame. It will start out fine but degrade your performance over time.
- If it’s fine on Linux and slow on windows, look for disk IO. Linux caches things much better.
***If it takes a long time to start your game***
The thing that takes the longest when loading your game is probably one of:
1. Disk IO - pack as much as you can into each blend to avoid reading lots of smaller files. Reduce model complexity. Try compressing the blend when saving it (but beware #3).
2. Compiling the shaders (if you have an integrated card) - use simpler shaders
3. Decompressing your blend files (if you have enabled compression when saving and have a slow CPU) - don’t compress your blends (but beware #1).
Well, hopefully this inspires you to get your game running fast.