Optimize Python Scripts?

What are some good ways to optimize Python scripts performance-wise?

Use smaller loops, do fewer calculations, and use pulse mode less.

only run what you need, when you need it.

store variables so you don’t need to look them up again, e.g. game objects, scenes, textures, etc. classes are a good way to store lots of references in an orderly fashion.

i don’t recommend setting pulse mode skips to anything but 0. running a heavy script every 5 frames will only make the game stutter obnoxiously.

§1: No code is faster than any code

This applies to Python code as well as native code.

The first place to check is the sensor bricks. Try to avoid triggering Python controllers when there is nothing to do -> check the sensor settings. [True level triggering] is a typical issue.

This is not always possible and depends on the situation. These checks might eat more processing time than they save.

Exit Python code as soon as you know there will be no effect. E.g. when you want AND behavior you can exit right after the first sensor that is not positive. Functions help quite a lot here, as you can exit them with a return statement. To avoid hard-to-read code I suggest using module mode rather than script mode when using functions.
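A minimal sketch of the early-exit idea. `FakeSensor`, `run` and `do_expensive_work` are hypothetical stand-ins so the snippet runs outside the BGE; in the engine the sensors would come from the controller and expose the same `positive` attribute.

```python
class FakeSensor:
    """Stand-in for a BGE sensor; only the `positive` attribute is modelled."""
    def __init__(self, positive):
        self.positive = positive


calls = []  # records how often the expensive part actually ran


def do_expensive_work():
    calls.append("work")


def run(sensors):
    """AND behaviour: leave at the first sensor that is not positive."""
    for sensor in sensors:
        if not sensor.positive:
            return False  # nothing to do, skip the rest of the function
    do_expensive_work()
    return True


run([FakeSensor(True), FakeSensor(False)])  # exits early, no work done
run([FakeSensor(True), FakeSensor(True)])   # all positive, work runs once
```

Because the logic lives in a function, the `return` keeps the exit readable; in script mode the same effect tends to need nested `if` blocks.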

Create variables when you need them. There is no need to grab the scene when you do not use it in your code, or use it only under specific conditions (hint: move it to a place right before it is used). Remark: Sometimes other aspects (readability, avoidance of code replication) can override this recommendation.

Code inside a loop should be as fast as possible -> no code is always faster ;). The more iterations the loop has, the longer the processing will take. This quickly adds up!

Dictionary access is faster than list search (a loop thing again).
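To illustrate the difference, here is a small sketch with made-up data (the object names and the two helper functions are assumptions for the example only). The list version has to walk the entries one by one, the dictionary version does a single hash lookup.

```python
# Hypothetical data: 10,000 named objects.
names = ["object_%d" % i for i in range(10000)]
objects_list = [(name, i) for i, name in enumerate(names)]
objects_dict = dict(objects_list)


def find_in_list(name):
    """Linear search: walks the list until the name matches."""
    for key, value in objects_list:
        if key == name:
            return value
    return None


def find_in_dict(name):
    """Hash lookup: no loop in Python code."""
    return objects_dict.get(name)
```

Both functions return the same value; the standard `timeit` module can be used to compare them on your own data.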

Caching data is faster than retrieving it again every time. But the cached data might be dirty. Ensure it is up-to-date and that keeping it up-to-date does not eat the saved processing time.
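One possible way to keep a cache from going stale is a "dirty" flag. The sketch below is an assumption about how this could look, not a BGE feature; `DirtyCache` and the sample data are hypothetical.

```python
class DirtyCache:
    """Caches an expensive result and recomputes only when marked dirty."""

    def __init__(self, compute):
        self._compute = compute   # the expensive function (hypothetical)
        self._value = None
        self._dirty = True        # nothing cached yet
        self.computations = 0     # for demonstration only

    def invalidate(self):
        """Call this whenever the underlying data changes."""
        self._dirty = True

    def get(self):
        if self._dirty:
            self._value = self._compute()
            self._dirty = False
            self.computations += 1
        return self._value


distances = [5.0, 2.0, 9.0]
cache = DirtyCache(lambda: min(distances))

first = cache.get()    # computed
second = cache.get()   # served from the cache
distances.append(1.0)  # data changed, so the cached value is stale
cache.invalidate()
third = cache.get()    # recomputed
```

The cost of this design is that every place that changes the data has to remember to call `invalidate()`.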

Always consider timing. You describe processing in a single frame. A frame has limited time to complete (not just for your code). A single long-lasting frame might be acceptable; many long-lasting frames are not.

store variables so you don’t need to look them up again, e.g. game objects, scenes, textures, etc. classes are a good way to store lots of references in an orderly fashion.

So you mean like, have a script (or class, like you said) with all the scenes, objects, textures, and stuff defined, then reference that information from other scripts?

i don’t recommend setting pulse mode skips to anything but 0. running a heavy script every 5 frames will only make the game stutter obnoxiously.

Shouldn’t setting it to 0 make it run on every frame, and setting it to 5 make it run every 5 frames? Wouldn’t setting it to 5 be easier on the game because it would only have to run the script 12 times a second (assuming the game was running at 60fps) as opposed to 60 times a second?

Caching data is faster than retrieving it again every time. But the cached data might be dirty. Ensure it is up-to-date and that keeping it up-to-date does not eat the saved processing time.

How do I cache data? Or is that the same thing Daedalus_MDW was talking about?

It’s effectively the same thing.

The basic principle is this; the less code you run, the faster your game will run. This might be meaningless in terms of performance - most small scripts won’t be significant.

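There is also a general concept of performance - you’ll see it written in big-O notation such as O(n), which describes how the work grows with the input size n.

Nested loops are slow because generally you’re doing an O(n^2) operation (not always; more precisely O(m*n) for loops over m and n items). Since run time is roughly proportional to the number of operations you perform, the cost grows quadratically.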
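As a rough sketch of what that means in practice (the function names and data are made up, and player names are assumed to be unique), here is the same matching job done with a nested loop and with a set built once:

```python
def matches_nested(pickups, players):
    """O(m*n): every pickup is compared with every player."""
    result = []
    for owner, item in pickups:
        for player in players:
            if player == owner:
                result.append((player, item))
    return result


def matches_indexed(pickups, players):
    """O(m+n): build a set once, then do one lookup per pickup."""
    player_set = set(players)
    return [(owner, item) for owner, item in pickups if owner in player_set]


players = ["a", "b", "c"]
pickups = [("b", "key"), ("z", "coin"), ("a", "gem")]
```

With three players the difference is invisible; with hundreds of objects per frame the nested version does far more comparisons.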

I have a script, (state machine) that runs


if "ActStrip" in own:
    Function = Dictionary[own['ActStrip'][0][0]]
    Function(own['ActStrip'][0][1])
    

This is a non-branching state machine, where each entry in the action strip is

["FunctionKey", [dataForFunction]]

But does building the dictionary each frame cost much?

Would it be more effective to only build the dictionary 1 time and save it as a prop?

And second question, How does one profile a function?

“just because you can, doesnt mean you should!”

this seems to be a major issue nowadays. we have big computers with lots of processing power, and optimizing simply isn’t as necessary. so programmers don’t care to make things run faster. if you’ve ever been to an app store, you will see lots of comments regarding apps wasting battery.

i like to treat every ms of time as if it’s my last.

BPR, store your dictionary in the logic module so you don’t have to rebuild it. yes, the performance gain may be insignificant, but it’s the thought that counts. learn to make things as fast as possible from the start.

## Core.py, all scripts should import this to build variables ##

from bge import logic

# create a new dictionary to hold data, like the globalDict, but it doesn't get saved
logic.gameDict = {}

# expensiveFunction is a placeholder for whatever is slow to compute
logic.gameDict["Variable"] = expensiveFunction()

@BPR, I don’t know what you are talking about. I haven’t done anything with dictionaries yet, I’m still kinda new to python, maybe I should learn a bit more then ask this question again later.

@Daedalus_MDW, were you talking to me?

“just because you can, doesnt mean you should!”

I asked you:
Shouldn’t setting it to 0 make it run on every frame, and setting it to 5 make it run every 5 frames? Wouldn’t setting it to 5 be easier on the game because it would only have to run the script 12 times a second (assuming the game was running at 60fps) as opposed to 60 times a second?

Wouldn’t making the script run 12 times a second be faster than 60 times a second? Or am I wrong on how this works?

Running a complex script less often = better frame rate.
I usually use a delay of about 1 (30 times per second) or 2 (20 times per second).

Module mode also runs faster than standard script mode.
Lastly, if you don’t want your script to activate until one event happens, it is better to use a (property) sensor to activate the script instead of constantly checking a property.

Dictionaries have O(1) retrieval time, so a lookup is much faster than looping through a list to find an item (if you don’t know the index).

Imagine you have an elevator which travels up and down a building. You stop at every floor to see if anyone wants to get on and off.

This is how most beginners’ Python scripts work.

One way to make it faster is to only stop at odd numbered floors, or skip 5 or so floors at a time. This speeds it up but risks missing some passengers.

So the passengers will group on the target floors and wait. Now you have bottlenecks at the target floors.

The best idea is to install call buttons at every floor. When someone wants to ride they call the elevator.

In Python terms this is a message system. Your main code should be waiting for messages and only running code as it is needed. This is the most efficient structure you’re likely to find right now.
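The call-button idea can be sketched in plain Python like this. `MessageBus` and its methods are hypothetical names for the example; in the BGE itself the Message sensor and actuator can play a similar role.

```python
class MessageBus:
    """Tiny publish/subscribe hub; the name and API are hypothetical."""

    def __init__(self):
        self._handlers = {}

    def subscribe(self, subject, handler):
        self._handlers.setdefault(subject, []).append(handler)

    def publish(self, subject, *args):
        # Only handlers registered for this subject run; nothing is polled.
        for handler in self._handlers.get(subject, []):
            handler(*args)


bus = MessageBus()
stops = []
bus.subscribe("call_elevator", stops.append)

bus.publish("call_elevator", 7)   # someone pressed the button on floor 7
bus.publish("call_elevator", 42)
bus.publish("door_opened", 3)     # no subscriber, so nothing runs
```

The elevator only ever visits the floors in `stops`; floors where nobody called cost nothing.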

Another improvement is to use states. An agent needs to move, navigate, shoot, jump, pick things up, open doors, die, talk and any number of other things. But it doesn’t need to do all these things at the same time. Using states allows you to focus on just one task, only checking if you need to change state and do something else.

Do checks in the correct order. My agent needs to navigate to a distant location, but only if he’s still alive. Check that first before checking for navigation.

I had a really slow game once and it turned out that even dead agents were still trying to move and shoot and doing all the calculations required of a living agent.

An obvious point if you think about it but easy to miss.
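Both points (states, and checking in the right order) fit into a few lines. This is only a sketch; the agent is a plain dict here and the state names are invented.

```python
def navigate(agent):
    agent["log"].append("navigate")


def shoot(agent):
    agent["log"].append("shoot")


STATES = {"navigate": navigate, "shoot": shoot}  # built once at import time


def update(agent):
    # Cheapest, most decisive check first: dead agents do nothing at all.
    if not agent["alive"]:
        return
    STATES[agent["state"]](agent)  # run only the current state's code


living = {"alive": True, "state": "navigate", "log": []}
dead = {"alive": False, "state": "shoot", "log": []}
update(living)
update(dead)
```

The dead agent never reaches the state lookup, so none of its state code runs.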

Assuming all other processing takes the same time, the game will spend less time on processing. This does not necessarily mean it runs faster. It needs less processing time over a longer period (at least 5 frames). Explanation: It is much easier to design a game that runs at a constant frame rate. Therefore you get “idle time” in each frame. The idle time can be consumed by any processing as long as it is available.

You need to consider: Does it make sense to run the code only every 5 frames?

Example: the code finds the closest object. The result can change multiple times within these 5 frames. What if you need the result at the second frame?

It can be useful to reduce the number of triggers (e.g. how often an NPC checks if it sees an enemy) but often it is not good (imagine a door that checks every 10 seconds whether you tried to open it).

The Logic Bricks provide a way to start processing on event. This moves the “constant” processing to native code focusing on event checking only. This is pretty fast and you usually do not need to worry much.

No, a script either runs once or never per trigger per frame. The duration of the script depends entirely on your code.

The time over a period (> one frame) will change.

Ok, for each agent, each frame I was building 3 dictionaries,
now I am down to one each frame (movement speed, animations, etc change based on carrying vs sprinting vs walking)

I already shaved a whole ms off of 8 agents, so about .125 ms each unit :)

I am thinking maybe just build a carrying dictionary, a sprinting one, and a walking one, this way I only build the dictionary 1 time.
I think this will shave off about .05 ms each agent per frame.
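That plan could look roughly like this. All keys and numbers are made up for the example; the point is only that the dictionaries are built once at import time and each frame does a single lookup.

```python
# Built once when the module is first imported, not every frame.
# All numbers and keys are made up for the example.
MOVE_SETTINGS = {
    "walking":   {"speed": 2.0, "animation": "walk"},
    "sprinting": {"speed": 5.0, "animation": "sprint"},
    "carrying":  {"speed": 1.5, "animation": "carry_walk"},
}


def settings_for(agent_mode):
    """Per-frame work shrinks to a single dictionary lookup."""
    return MOVE_SETTINGS[agent_mode]
```

Note that every agent shares the same inner dictionaries, so they should be treated as read-only.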


First you request the property twice (this sample does it three times in a row).

First to check if it is present. Second to get the value.

This is fine when the normal case is that the property is not present. Otherwise it would be more efficient to request the property once and deal with the unusual “not present” situation by handling a KeyError.


try:
    value = dictionary[key]
except KeyError:
    value = newValue
    dictionary[key] = value

This is information only. Typically the time difference is really small.

Second, your code requires really deep knowledge of the data structure.

“Function = Dictionary[own['ActStrip'][0][0]]”

You mix upper and lower case variable naming. This is very confusing.

What can I read from here:
what can I read from here:

  • You assign a variable “Function” a value. I can guess it is a function (definition). I can’t guess what this kind of function is supposed to do or why it is in the “Dictionary”.

  • There is a “Dictionary”. I guess it is a dict (maybe). I can’t guess what it is supposed to contain, nor what it is good for, nor how it differs from other dictionaries.

  • There is “own”, which we know is a typical variable to refer to a game object (the name is cryptic but we can accept it as a known abbreviation). A game object also acts as a dictionary, providing property values by property keys. So we get the value of property “ActStrip” from the game object owning the currently running Python controller.

  • property “ActStrip” I have no idea what it is supposed to contain (sounds a bit like Reeperbahn).

  • The property value of “ActStrip” seems to be a list. As there is no further information (not in this context) I can’t tell what it means. I can tell you expect at least a single item in that list.

  • Each item of the property value is another list. No further information either. You expect at least two items in that list.

Let’s look at the efficiency (assuming there is no KeyError).


if "ActStrip" in own:
    Function = Dictionary[own['ActStrip'][0][0]]
    Function(own['ActStrip'][0][1])

expands into:


if "ActStrip" in own:
    propertyValue = own['ActStrip']
    innerList = propertyValue[0]
    dictKey = innerList[0]
    Function = Dictionary[dictKey]

This results in 5 list/dictionary accesses (be aware each of them can fail with a KeyError or IndexError) just for this code. It is not that much.

Dictionary and list access is usually very fast (in CPython, indexing a list and looking up a dict key are both O(1) on average; only searching a list for a value is O(n)). I’m not sure how it is implemented for game properties. As there shouldn’t be many properties, it does not matter that much.

The above code does not show how you build the dictionary. So I can’t guess how much time it eats. I do not know how many times you build the dictionary. I do not know how many times you need to update the dictionary either, because I do not know why you have it present.

More efficient compared to what?

Remark:
Python modules and Python classes are dictionaries already. They can contain functions as well as other data. You might consider that before inventing your own “function container” ;).
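As a sketch of that remark, a class can serve as the function container and `getattr` as the lookup. `AgentActions`, `run_action` and the action names are hypothetical.

```python
class AgentActions:
    """Groups the state functions; the class namespace is the lookup table."""

    @staticmethod
    def navigate(data):
        return "navigating to %s" % data[0]

    @staticmethod
    def press(data):
        return "pressing %s" % data[0]


def run_action(name, data):
    # getattr does the same job as Dictionary[name] in the original snippet.
    action = getattr(AgentActions, name)
    return action(data)
```

An unknown name raises `AttributeError` here instead of `KeyError`, so error handling has to change accordingly.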

You measure the time when entering and the time when leaving the function. I’m sure there are libraries that support you on that.
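A minimal sketch of that approach using only the standard library. The decorator name `timed` and the sample function are assumptions for the example.

```python
import functools
import time


def timed(func):
    """Decorator that records how long each call to `func` takes."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()          # time when entering
        try:
            return func(*args, **kwargs)
        finally:
            # time when leaving, even if the function raised
            wrapper.durations.append(time.perf_counter() - start)
    wrapper.durations = []
    return wrapper


@timed
def build_dictionary():
    return {i: i * i for i in range(1000)}


build_dictionary()
build_dictionary()
average_ms = 1000.0 * sum(build_dictionary.durations) / len(build_dictionary.durations)
```

For a broader picture, the standard `cProfile` and `timeit` modules do this kind of measurement for you.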

imagine the elevator example. say you want to go to floor 100. if the elevator stops on each floor, it may take longer in total. but this is not what we care about when coding games. we care about how long the elevator stops at each floor. even if the elevator stopped only 5 times, it would still be annoying.

a 30ms script will take 30ms no matter what. if you run it every frame, you will get a constant 30ish fps, if you run it once every 10 frames, it will micro stutter, which, i think, is far more annoying.

it’s best to run it once, then save the result in a variable.

each agent uses functions as states,
an actstrip can be a list of these states

ActStrip = [['navigate', [target]], ['Press', [target]], ['Dialog', [dialog]]]

walk up to button, press button, talk about what happened when you pressed button.

so an actstrip is a list of states.
now navigate,

can append ['navigate', [ladder]], ['climb ladder', [target]] before seek target,

basically we end up with a ‘chronological state machine’
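For readers following along, here is one way such a strip could be processed. This is a guess at the idea, not the actual game code: the agent is a plain dict, the state functions take the agent as an extra argument, and returning True is assumed to mean "this entry is finished".

```python
def navigate(agent, data):
    agent["log"].append("navigate to %s" % data[0])
    return True   # assumption: True means "this entry is finished"


def press(agent, data):
    agent["log"].append("press %s" % data[0])
    return True


STATE_FUNCTIONS = {"navigate": navigate, "Press": press}  # built once


def run_act_strip(agent):
    """Run the first entry of the strip; drop it once it reports done."""
    strip = agent.get("ActStrip")
    if not strip:
        return            # nothing queued, nothing to do
    key, data = strip[0]  # one access instead of repeated indexing
    if STATE_FUNCTIONS[key](agent, data):
        strip.pop(0)


agent = {"ActStrip": [["navigate", ["button"]], ["Press", ["button"]]], "log": []}
run_act_strip(agent)  # frame 1
run_act_strip(agent)  # frame 2
run_act_strip(agent)  # frame 3: strip is empty
```

Inserting a new entry at index 0 (e.g. a ladder climb) makes it run before the rest of the strip continues.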

most enemies never have more than 1 or 2 actstrip,

however to tell stories /act I will be using 16x long actstrips at a time etc.

I was able to shave off a bunch of logic time, however I think that perhaps module mode would be faster yet.

it turned out, the big spikes are -> punch sound * 8 and hit sound *8

Theorem #1: Tell, don’t ask
Callbacks vs polling. Polling is simpler, callbacks can be neater and faster. This is particularly relevant to action-driven models, such as GUIs. You can poll the mouse to see if it’s over the right object (every GUI element uses a mouseOver sensor, or does a raycast), or you can use a callback that runs whenever the mouse is over the object. Then it requires only a single ray cast, which is far, far faster.

Other examples:

  • Polling to see if the player’s health is at zero, vs checking only each time it is changed and running callbacks if it is
  • Iterating through all elements in a list looking for the one that matches a criterion vs looking it up in a dictionary or database structure.
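The health example could be sketched like this. `Health` and `on_death` are hypothetical names; the check happens only in the setter, so nothing has to poll the value every frame.

```python
class Health:
    """Runs callbacks only when the value changes, instead of polling it."""

    def __init__(self, value):
        self._value = value
        self.on_death = []   # callbacks to run when health reaches zero

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new_value):
        self._value = max(0, new_value)
        if self._value == 0:
            for callback in self.on_death:
                callback()


events = []
health = Health(30)
health.on_death.append(lambda: events.append("player died"))

health.value -= 10   # 20 left, no callback
health.value -= 25   # clamped to 0, callback fires once
```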

Theorem #2: Cache like crazy
While other people say "store it in a variable," I'd suggest using an actual cache. These are designed for this, and are often faster than anything you can implement on your own. A simple example:

import functools
import time

@functools.lru_cache(maxsize=128)
def some_slow_function(argument):
    time.sleep(1.0)
    print("Running slowly")
    return argument + 1

This, of course, only works on functions where things do not change between executions. So for pathfinding on static meshes, or for really crazy cellular automata/infinite worlds, it works beautifully. For other things, such as changing the texture of an object, well, it won’t actually do anything…
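To see the cache at work, `cache_info()` reports hits and misses. The function below is a made-up placeholder for something expensive; note that the arguments must be hashable (tuples rather than lists).

```python
import functools


@functools.lru_cache(maxsize=128)
def path_length(start, goal):
    """Placeholder for an expensive, pure computation (hypothetical)."""
    return abs(goal[0] - start[0]) + abs(goal[1] - start[1])


path_length((0, 0), (3, 4))   # computed
path_length((0, 0), (3, 4))   # answered from the cache
path_length((1, 1), (3, 4))   # different arguments, computed again

info = path_length.cache_info()  # named tuple: hits, misses, maxsize, currsize
```

If the underlying data does change, `path_length.cache_clear()` empties the cache.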

Some things you wouldn’t consider to be caches are actually caches. A map is a sort of cache, so is a K-D tree. Many data structures exist to try and get lookups happening faster - pick one that makes sense for your situation, but be aware that building said data structure is often slow.

Theorem #3: Load Screen
99…999% of BGE games do not have a load screen, and people have no concept of how or why they would want one. But, this is used to prevent stuttering in games. Use this time to preload all sounds from disk, load up all the textures (if you’re going to be doing dynamic texture switching), and a dozen other things that take time to do.
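One way to structure the preloading so that a load screen can still update is a generator that yields progress after each asset. This is a sketch under assumptions: `preload` and `fake_load` are hypothetical, and a real loader would read sounds or textures from disk.

```python
def preload(asset_names, load_asset):
    """Yield progress after each asset so a load screen can be redrawn."""
    cache = {}
    total = len(asset_names)
    for index, name in enumerate(asset_names, start=1):
        cache[name] = load_asset(name)   # the slow part, done up front
        yield index / total, cache       # progress in the range (0, 1]


def fake_load(name):
    """Hypothetical loader used only for this example."""
    return "data for " + name


progress_log = []
assets = {}
for progress, assets in preload(["punch.wav", "hit.wav"], fake_load):
    progress_log.append(progress)   # a load screen would redraw here
```

During gameplay the assets are then fetched from the dictionary instead of from disk.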

BUT, MOST OF ALL:
Premature Optimization is the root of much evil
If it isn’t a problem, don’t fix it.

Don’t change the implementation, change the algorithm
No matter how fast you make a bubble sort, it will still be slower than a quicksort. Dijkstra is nearly always slower than A*, polling is nearly always slower than an event-driven model. So before you start removing class attribute lookups because you read that the dot notation is slow, change the algorithm to a faster one. Unless you’re doing something silly such as inverting a matrix each frame (cache it), changing the algorithm is much better than anything else you can do.

I use a mix kinda,

players evaluate many conditions if run!=0 or movement sensor is active

run!=0----------python
movement------/

so I use events to wake the actor, and use polling when awake, and fall back into idle.
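A rough sketch of that wake/idle pattern, as I read it. The actor is a plain dict, `update` and `hit` are hypothetical names, and `run` is assumed to be a frame countdown.

```python
def update(actor, movement_active):
    """Return True when the heavy per-frame logic ran."""
    if actor["run"] == 0 and not movement_active:
        return False        # idle: nothing woke the actor, skip everything
    if actor["run"] > 0:
        actor["run"] -= 1   # count down towards idle again
    # ... evaluate the many conditions here ...
    return True


def hit(actor):
    actor["run"] = 15       # an event wakes the actor for 15 frames


actor = {"run": 0}
asleep = update(actor, movement_active=False)
hit(actor)
awake_frames = sum(update(actor, movement_active=False) for _ in range(20))
```

Out of the 20 simulated frames, the heavy logic runs only while the countdown is active.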

hitting an actor can insert an actstrip (ouch) and set run to 15 (enough time to play it)

Thanks guys!