Optimize now - don't optimize later

(sdfgeoff) #1

How many times, when following a game project, have you heard something like: “It runs poorly, but it’s early in the project, and we’ll optimise later”? I’ve heard it lots of times, including from quite experienced teams. But you know what? I’ve never seen that optimisation arrive. Hopefully this post will give you some ideas about what to do if your game isn’t performing well. While this was written with BGE in mind, the advice applies to most engines.

Any computer science graduate who knows a bit of management can tell you that the idea that “optimizing later” can be done is … a lie.
Management student: The later in a project you make a big change, the more expensive that change is to make.
Computer science student: Optimization comes predominantly from changing the algorithm, not from lots of tiny improvements.
Add those together and you realise that optimization typically requires large architectural changes, which are harder to make later in the project. Examples:

  1. Our AI used to run everything in one frame, but it’s too slow. How the heck do I split it over multiple frames?

  2. The VRAM usage is way too high, but we need high-res textures for the close-up detail. How the heck do I stream textures in and out?

Implementing texture streaming may be easy at the beginning of a project, but to shoehorn it in at the end is downright impossible. And if you write an AI that can run over multiple frames, it’s trivial to make it run in one. But to split it up is hard.
If you solve the problem now, then you’ll only have one AI to port to the new system and only a few models to change. If you do it at the end of the game, you’ll have to rewrite all your AIs and redo all your textures.
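As a rough illustration of an AI that can run over multiple frames, here is a minimal sketch using Python generators. The names `think` and `AIScheduler` are made up for this example and are not part of BGE; each AI yields after a small unit of work, and the scheduler advances a limited number of steps per frame.

```python
from collections import deque


def think(agent_name, steps):
    """Hypothetical AI routine: yields after each small unit of work."""
    for step in range(steps):
        # One small piece of pathfinding or decision making would go here.
        yield (agent_name, step)


class AIScheduler:
    """Runs AI generators round-robin, a limited number of steps per frame."""

    def __init__(self):
        self.tasks = deque()

    def add(self, task):
        self.tasks.append(task)

    def run_frame(self, budget):
        """Advance tasks by at most `budget` steps; return the steps done."""
        done = []
        while self.tasks and len(done) < budget:
            task = self.tasks.popleft()
            try:
                done.append(next(task))
            except StopIteration:
                continue  # task finished, so it is not re-queued
            self.tasks.append(task)
        return done
```

Calling `run_frame(budget=2)` once per frame spreads the work out, while `run_frame(float("inf"))` runs everything in a single frame, which is why going from multi-frame to single-frame is the easy direction.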

So what does “we’ll optimize later” mean?
Whenever I hear that expression, I read it as one of:

  • We’d like it to run better, but can’t be bothered
  • I can’t find any profiling tools and have no idea why my game is so slow
  • I thought if I just threw some models and logic together, it would just work. What is an NP-complete problem? Who’s the travelling salesman?

If you’ve optimised all you can, and it’s still slow, then maybe you’re barking up an NP-complete problem and there is no solution. But otherwise I equate the choice to optimize later with laziness and unfamiliarity with programming. If your game runs slow on your target hardware, you’re stopping people from being able to play your game. I’d call that the most important thing in your game. Regardless of how good it looks or how fun it might be, if they can’t play it at a sensible frame rate, they can’t experience any of that.

***So what do you do when your game is running slow?***
Expect it to happen
Plan the optimization systems before you even start. If you know you’re going to need a lot of geometry, start out with an LOD system. If you know you’re going to have thousands of AIs, figure out some way to deal with that before you even begin. Plan these systems rather than hacking them in later.

In particular, plan a way to identify what is running slow. Build some way to profile your scripts. Make it easy to drop models in/out so you can identify if they are the issue.
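One lightweight way to profile scripts is to wrap sections of code in a timer and accumulate the results. This is only a sketch using the Python standard library; the names `timed` and `report` are invented for the example, and the section labels are whatever you choose.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated seconds and call counts per named section.
totals = defaultdict(float)
counts = defaultdict(int)


@contextmanager
def timed(section):
    """Accumulate the wall-clock time spent inside the `with` block."""
    start = time.perf_counter()
    try:
        yield
    finally:
        totals[section] += time.perf_counter() - start
        counts[section] += 1


def report():
    """Return (section, total_seconds, calls) rows, slowest first."""
    rows = [(name, totals[name], counts[name]) for name in totals]
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

In use, you would write something like `with timed("pathfinding"): ...` around the suspect code and print `report()` every few seconds.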

Solve it as soon as it’s an issue
As soon as your game isn’t performing as well as you hope, figure out why and fix it. The earlier you fix it, the easier it will be and the less rework of models/scripts will be required.

***Finding what the issue is***
The first step to solving an optimization issue is finding out what is causing your game to run slow. The place to start is BGE’s profiler. Generally you’ll see:

  • The physics is really high
  • The logic is really high
  • The rasterizer is really high
  • The GPU latency is really high
  • The animations are really high

I’ll briefly pass through these in turn. This isn’t a be-all end-all list of issues and solutions, but hopefully it will give you things to google and pointers about what may be going wrong.

The Animations are really high
Animations are dependent on:

  • The polycount of your model
  • Any constraints attached.

So to get it running faster, reduce the polycount, the number of actors using animations (ie don’t play animations for actors far away) or reduce the complexity of the constraints.

The Rasterizer is really high
You’ve probably got too many objects in your scene and need some sort of LOD system that removes them when they’re too far away.
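A distance-based LOD check can be very small. The sketch below is plain Python with a made-up LOD table and mesh names; it does not use any BGE API, and the distances are arbitrary placeholders.

```python
import math

# Hypothetical LOD table: (maximum distance, mesh name), nearest first.
LOD_LEVELS = [
    (20.0, "tree_high"),
    (60.0, "tree_medium"),
    (150.0, "tree_low"),
]


def pick_lod(camera_pos, object_pos, levels=LOD_LEVELS):
    """Return the mesh name to show, or None if beyond the last level."""
    distance = math.sqrt(
        sum((a - b) ** 2 for a, b in zip(camera_pos, object_pos))
    )
    for max_distance, mesh_name in levels:
        if distance <= max_distance:
            return mesh_name
    return None  # too far away: hide the object entirely
```

The same distance test could also decide whether a far-away actor should play its animation at all.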

The GPU Latency is really high
GPUs are super fast. Even an integrated card can add numbers together faster than the CPU. However there are a few caveats:

  • VRAM access is slow (ie texture samples). So reduce the number of textures per material. In particular, be aware of “heightmaps”: they take three texture samples, but a normal map takes only one. An Intel integrated card can only do 5-6 samples per pixel at 1080p.
  • Boolean logic (if/else and case/switch statements) is really slow. This is due to the parallel architecture of GPUs. Multiplying by zero and adding anyway can be faster than an if/else.
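The “multiply by zero and add anyway” trick is easiest to see as arithmetic. The sketch below uses plain Python as a stand-in for shader code (it is not GLSL and says nothing about actual GPU timings); the function names are invented for the example.

```python
def select_branching(condition, a, b):
    """The if/else form."""
    if condition:
        return a
    return b


def select_branchless(mask, a, b):
    """Same result using arithmetic only; `mask` must be 0.0 or 1.0."""
    return a * mask + b * (1.0 - mask)
```

Both branches are always evaluated in the second form, which is the price paid for avoiding the conditional.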

If your game was running at 60FPS, and then you added a new model and now it’s at 5FPS with really high GPU latency, you’ve probably run out of VRAM, and the system is trying to swap textures in and out of system memory. Textures take a lot of space inside a GPU’s VRAM. A 1024x1024 pixel RGBA texture occupies about 4MB of VRAM at 8 bits per channel, and a 2048x2048 one about 16MB, before mipmaps. It doesn’t matter what the file size on disk is: this is what it will uncompress to on the GPU. As a result, if you only have 386MB of VRAM (ie an Intel integrated card), it does not take many large textures to fill it, and then your game will run super slow. The solution? Reduce your image pixel counts.
I believe vertices also reside in VRAM, but we tend to be better at counting them. A 2048x2048 pixel texture takes up about the same VRAM as a 1 million poly model… But a million poly model will run slowly due to all the matrix math, so go easy on the polycount if your GPU latency is high.
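A rough way to sanity-check a texture budget is width × height × bytes per pixel. The sketch below assumes uncompressed RGBA at 8 bits per channel (4 bytes per pixel) and a full mipmap chain adding about one third on top; the function names are made up, and real drivers may pad, compress, or store textures differently.

```python
def texture_bytes(width, height, bytes_per_pixel=4, mipmaps=True):
    """Estimate VRAM for one uncompressed texture.

    Assumes 4 bytes per pixel (RGBA, 8 bits per channel) by default.
    A full mipmap chain adds roughly one third on top of the base level.
    """
    base = width * height * bytes_per_pixel
    return base * 4 // 3 if mipmaps else base


def textures_that_fit(vram_bytes, width, height, **kwargs):
    """How many such textures fit in a VRAM budget, ignoring everything else."""
    return vram_bytes // texture_bytes(width, height, **kwargs)
```

Because the cost scales with pixel count, halving both dimensions of a texture cuts its estimated footprint to a quarter.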

There are tonnes of tools for analysing GPU usage. Go pick one and find out what your bottleneck is. In my experience, you can be pretty confident that it’s texture sampling or VRAM usage slowing you down (unless you have more than a few million polygons). GPUs can add vectors and matrices together at an absolutely mind-boggling speed.

Logic is really high

  • Don’t use the ‘near’ or ‘radar’ sensor, they are super slow - as they check … per vertex. Even ray casting is pretty slow. If you’re doing 100 ray casts in a grid, consider rendering out a depth texture and sampling it. Rasterizing is faster than raytracing.
  • Run a profiler on your scripts.
  • Be aware that dynamic textures (eg ImageMirror, ImageFFMPEG) all appear inside logic because they run inside the script. Also be aware that you calculate animations per render camera, not per frame. So if you have three cameras, each animation is getting applied three times and all appearing inside the logic graph.
  • Look for disk IO, which is super duper slow on Windows, but tolerable on Linux.
  • Be aware of NP-complete problems, and don’t be afraid to discard “optimal” for “good enough”
  • Be aware of cache-thrashing if you have large arrays (particularly 2D ones). Sampling a list in order is much faster than sampling it randomly.
  • cProfile in Python is really really useful if you have a single entry point (ie only one or two Always -> Python logic bricks)
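For the cProfile suggestion above, a minimal sketch looks like this. It assumes your game logic has a single entry point; `main` here is just a stand-in function and `profile_call` is a helper name invented for the example. Only standard-library calls are used.

```python
import cProfile
import io
import pstats


def main():
    """Stand-in for the single entry point run by an Always -> Python brick."""
    return sum(i * i for i in range(10000))


def profile_call(func, top=10):
    """Run `func` under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()
```

Printing the report text for a single frame now and then is usually enough to see which function dominates.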

Physics is really high

  • In order, use: Sphere, Capsule, Cylinder, Cone, convex hull, triangle mesh. I’ve left out box because in my experience, it is super unstable.
  • I’ve never had issues with constraints appearing here (and I’ve run up to a hundred or so at a time), but I suppose it may be an issue
  • Reduce the resolution of a convex hull/triangle mesh primitive
  • Set all objects that you can to “No Collision”
  • Don’t use Ghost, use collision layers. Intersecting Ghost objects eat huge amounts of processing power.
  • Use big margins
  • Watch this video on convex decomposition and get rid of those triangle meshes.


  • Don’t compile shaders each frame. It will start out fine but degrade your performance over time.
  • If it’s fine on Linux and slow on Windows, look for disk IO. Linux caches things much better.

***If it takes a long time to start your game***
The thing that takes the longest when loading your game is probably one of:

  1. Disk IO - pack as much as you can into each blend to avoid reading lots of smaller files. Reduce model complexity. Try compressing the blend when saving it (but beware #3).
  2. Compiling the shaders (if you have an integrated card) - use simpler shaders
  3. Decompressing your blend files (if you have enabled compression when saving and have a slow CPU) - don’t compress your blends (but beware #1).

Well, hopefully this inspires you to get your game running fast.

(BluePrintRandom) #2

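one solution for ai pathfinding is to have a manager that assigns paths to the ai, and the ai runs the path until it’s updated or it has reached its goal,

we can then use something like this to handle only a few ai each frame

for _ in range(value):
    # wrap around before indexing so the index is always valid
    if own['index'] >= len(aiList):
        own['index'] = 0
    ai = aiList[own['index']]
    # (update this ai's path here)
    own['index'] += 1  # move on to the next ai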

one issue I am having, is I need a compiled node based A* because I am adding / removing nodes each frame.

(pgi) #3

I do not agree with (perhaps my way to see) the premise of this post.
The reason behind the “optimize later” strategy is not magic - although it is often described as a mystical mantra in our age of pop-culture programming - the reason is logic.

First though, we must agree on what “optimization” is.
I define optimization as the process through which we reduce the amount of resources needed by a program to carry out its task.

If we take that definition - which is somewhat arbitrary but I think reasonable - then we have a problem.

To optimize a program we must first measure the resources it uses and to make that measurement we need a program that works.

Let’s think about Game Engine Slow and Game Engine Fast.
Game Engine Slow can take one textured triangle and draw it at one frame per second.
Game Engine Fast can take one billion textured triangles and draw them at a thousand frames per second.
There’s a catch though: Game Engine Fast randomly crashes due to some undiscovered bug.

Can I define Game Engine Fast as the optimized version of Game Engine Slow? The answer is no, because I can only measure the required resources in the program that works.
If the program crashes, I can’t guarantee that the measurement I can make isn’t altered by the presence of the bug that makes the program crash - and that bug will have to be removed, invalidating the measurement.

That’s why we “optimize later”. As a matter of fact, it is not the case that we “optimize later”: we design the program and implement it in the way we’re more confident will give us the correct output.
More often than not, we use sub-optimal models and algorithms because those are the ones that, to the extent of our knowledge, produce the correct output.

Why not learn the fast stuff beforehand? Well, probably for the same reason why promised optimization never comes, which, in my opinion, is our good old friend Time.
By the time we end up having a program that (sort of, especially in the game industry) works, it’s time to pay the bills, at which point you have to decide if you want to be fast or homeless :D.

It would also be interesting to discuss the extent of the changes that might make an optimization too costly - a subject that, in my opinion, should be handled as a design issue - but I won’t bore you to death with it :D.

(Fred/K.S) #4

That’s why, ever since I started production on my game, optimization has been my main focus, even now!
My game runs well on low-end PCs; it’s the gameplay I’m busy polishing. Otherwise, optimization has paid off well. I like this thread because it reminds me of all this whenever I start working on my game!


(sdfgeoff) #5

I agree that running stably is more important than running fast. As such:
Stability > Speed > Features

The problem is that many indie devs do:
Features > Speed > Stability

Where they always want something new to show to their followers, rather than having a product that actually runs properly. They say: “We’ll get our game/program feature complete, and then make it run well,” and in my mind that is the wrong way to approach the problem. Make it run well with a stable base, and then add the features. As soon as there is a known issue (performance, stability etc.) either fix it immediately or finish what you’re doing and then fix it. Hold off adding features unless what exists is working well.

(pqftgs) #6

Usually I’ll slap something together, document how it could be optimized, then come back to it when I’m no longer pushing a deadline. But that all happens in a fairly short cycle, not months after release/never.

Those “features first” games are SO FRUSTRATING. I’ve just stopped buying them. But then it goes on sale and a friend picks up 4 copies… damn you steam.

(JustinBarrett) #7

I do somewhat the same thing…cram a bunch of stuff together to get it working…then immediately follow up and clean up a bit + optimize…if I find an area of code with a lot of if else statements…I look to see if it is repeating similarly throughout the code and find a way to make one function do that…repeating code…this is normal programming though.
I also had major issues with my pathfinding and ended up re-writing it about 4 times…in the end I do not really use pathfinding but some clever checks and the only path I use is if the player ‘was’ seen building a list of his vector every quarter second or so…that is the extent of my pathfinding…One thing that helps me also is making notes daily of what I changed along with the times and current FPS…this allows me to go back and see where my changes affected the overall performance…it also forces me to look at the implementations in more abstract ways that are usually faster.
I also find that simpler is better…I am not a great coder or artist but my current game is running at 60fps with a 30K draw distance…to be fair, I do realize after all things are fully implemented it will likely chug down to 30-40 fps…and that is when I will start to do actual world building around the story and optimisation…so…I’ll get to it later :wink:
I find this is a great thread though. I feel a more serious failure in bge/upbge user is failure to follow through…to keep on working on their game…I honestly have not seen a complete grade A game come out of bge…I am not trying to offend anyone but it seems that most projects just end abruptly or people hit a wall in development and never look back…
again, great thread.

(alf0) #8

This is so helpful, but there is one thing I didn’t understand: "So to get it running faster, reduce the polycount, the number of actors using animations (ie don’t play animations for actors far away) or reduce the complexity of the constraints." I am working on a small test project, a third-person shooter thing. I am not a programmer, as I only use logic and simple game code. I made a scene where there are only 60 objects, most of them static with box or triangle mesh boundaries. I used LOD with different materials, yet it’s still running at 5 to 10 fps.

(sdfgeoff) #9

What does the profiler say about what is using the performance? It’s hard to identify the issue without a blend file.

"One thing that helps me also is making notes daily of what I changed along with the times and current FPS"

Hmm, that’s an interesting one. I hadn’t thought about storing the FPS over time. However this doesn’t really hold well for when you have lots of different scenes/levels. I wonder if I can build an automated test system that does this sort of thing.
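One possible starting point for that kind of automated record is to append an average FPS per scene to a CSV file after each test run. This is only a sketch with invented function names; how you collect the per-frame times (and whether a CSV file suits your workflow) is left open.

```python
import csv
import datetime


def average_fps(frame_times):
    """Average FPS from a list of per-frame durations in seconds."""
    total = sum(frame_times)
    return len(frame_times) / total if total > 0 else 0.0


def record_fps(log_path, scene, frame_times, note=""):
    """Append one row (timestamp, scene, average FPS, note) to a CSV log."""
    fps = average_fps(frame_times)
    timestamp = datetime.datetime.now().isoformat(timespec="seconds")
    with open(log_path, "a", newline="") as handle:
        csv.writer(handle).writerow([timestamp, scene, f"{fps:.1f}", note])
    return fps
```

Keeping the scene name in each row means the log still makes sense when there are many scenes or levels.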

"I feel a more serious failure in bge/upbge user is failure to follow through…to keep on working on their game…I honestly have not seen a complete grade A game come out of bge"

Here in the blenderartists forums we maintain a 6:1 WIP to finished game thread ratio. The actual ratio is far worse; I don’t publish half my game prototypes because they suffer from underlying gameplay flaws.
I suspect most other engines are similar. How many unfinished games do you think litter the shelves of Unity? I know that many many people make many many failed prototype games in other engines.
It’s not only games either. I’ve started writing some 40 odd pieces of music and finished all of 4 of them. I’ve designed hundreds of robots, built 10 of them, but only developed the software for two or three. Were the ones not finished a waste? Nup, I learned tonnes through them. If you’re a hobbyist, there’s no pressure to finish, and to my knowledge there are very few professional game developers using BGE at the moment.

However, I do work with BGE full time, just not on games. We do have two performance issues at the moment. One is a temperature issue (runs fine for 10 minutes then the target hardware overheats), and the other is the result of using the GE’s systems in ways they probably weren’t designed to be used. (Ever wanted to deform somewhere near half a million polygons in real time? Yeah…)

(JustinBarrett) #10

That is a valid point…unfinished games are part of any game engine community.