How to write clear, expandable Python code

Many people on the forum use Python; it’s lightweight, versatile and reasonably easy to pick up as programming languages go. As users of the Game Engine learn more about its inner workings, they are often tempted (and rightly so) to pick up the power that the API can provide.
However, this has its pitfalls.

After programming for a couple of years, from the basics to complex objects, I have learnt many lessons. It became apparent to me a short while back that the biggest problems I faced within the engine mostly stemmed from poorly constructed code.
In order to highlight my own issues, please allow me to list and correct the most common mistakes.

  • Hard coded variables, object names and parameters
  • Linear processing (avoiding objects or methods)
  • Inconsistent naming conventions
  • Inefficient function calls

Hard Coding
In the beginning, most users learn Python because there is a certain task, such as a complex Inventory, that would be far easier to programme than to hard code in Logic Bricks.
As they are introduced to the Python workflow, they often skip the core documentation about the Game Engine workflow, which they are unlikely to have encountered while following the Logic Brick route.

The most typical example is when sample code is provided to access objects in the scene - they will see

scene = bge.logic.getCurrentScene()

without understanding how that actually comes to be. They derive objects from a list of objects in the scene - showing how objects belong to a scene, but they often do not learn this until later on. Therefore, we often see hard-coded object references in scripts of beginners, because that is presented as the only way of accessing objects. The user does not often understand the concept of objects represented as Python objects, and references stored on Children, Parents and so forth.

The same goes for parameters. Hard coded parameters often crop up because they are seen as the only solution, maybe because they are the first solution presented to a problem.

It is important that we correct this. In some cases, hard coding object names will be inevitable as a Python script will at some point need to know about its own scene. However, there are ways of doing this. Always think about your script in a global scope - what is vitally important to stay constant in the script, and what can change depending on usage scenarios?
For the former, hard coding object names may be necessary, but you can store them in object properties, read from the script controller’s owner, so the names live in one place. For the latter, the term “dynamic” applies: let the script traverse object relations (children, parents and so on) and perform calculations in functions that take arguments, so that nothing needs to be spelled out by hand.
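As a rough sketch of the “dynamic” approach, the snippet below finds a related object by walking the owner’s children instead of naming it in the script. StandInObject, find_child and the object names are hypothetical stand-ins so that the example runs outside the Game Engine; in a real script the owner would come from the controller and would already carry its children.

```python
class StandInObject:
    """Hypothetical stand-in for a game object (not part of bge)."""

    def __init__(self, name, children=()):
        self.name = name
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self


def find_child(owner, suffix):
    """Return the first child whose name ends with suffix, or None."""
    for child in owner.children:
        if child.name.endswith(suffix):
            return child
    return None


turret = StandInObject("Tank.Turret")
wheel = StandInObject("Tank.Wheel")
tank = StandInObject("Tank", children=[turret, wheel])

# The script only knows about its owner; everything else is discovered.
found = find_child(tank, "Turret")
print(found.name, "belongs to", found.parent.name)
```

Because nothing but the suffix is spelled out, the same function keeps working if the tank is duplicated or renamed.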

We can access variables from an object using dictionary access. However, a variable may not exist if the user is sharing code with another user who hasn’t set things up correctly. At this point, you need to consider its value. If the variable is vital to the running of the code, like the name of an object in the scene, then you want the script to fail and display an error. However, if the variable is just a parameter, like the speed of an object, you can often fall back on a default. Here are the two usage cases in code:


import bge

cont = bge.logic.getCurrentController()
own = cont.owner

# 1) Variable is crucial to script's operation:
try:
    variable = own['variable']
except KeyError:
    print('The variable did not exist')
    bge.logic.endGame()

# 2) Variable can be set to a default value
variable = own.get('variable', 100)

1) The first example expects the variable to exist. If it does not, the lookup raises a KeyError, the script prints the error message to the console, and the game ends “gracefully”, with one error frame.
2) The second example doesn’t mind if the variable doesn’t exist; it just carries on with a default value, silently to the user.

Linear Processing
Many beginners in Python are shown a limited scope. By this, I mean that it is unlikely that they will have been shown complex examples of any code, because they wouldn’t have been able to understand it, and so everyone’s time would have been wasted. This typically means that they do not learn about Functions or Objects.
This isn’t always an issue. In many cases, they may not need to perform a complex task. However, one does see the odd example when a user has copied the same code block three times, changing one line or parameter.
In such a case as this, one would want to use a Function, for clarity and efficiency.
Here is an example of a user mistake:


import bge

scene = bge.logic.getCurrentScene()

object_1 = scene.objects['Cube']
object_2 = scene.objects['Cube.001']
object_3 = scene.objects['Cube.002']

x, y, z = object_1.worldPosition
x += 1.0
object_1.worldPosition = [x, y, z]

x, y, z = object_2.worldPosition
x += 1.0
object_2.worldPosition = [x, y, z]

x, y, z = object_3.worldPosition
x += 1.0
object_3.worldPosition = [x, y, z]

Here the same code block has been repeated 3 times. The user could greatly simplify this with a function and a for loop:


import bge

scene = bge.logic.getCurrentScene()

object_1 = scene.objects['Cube']
object_2 = scene.objects['Cube.001']
object_3 = scene.objects['Cube.002']

def move_object(object):
    x, y, z = object.worldPosition
    x += 1.0
    object.worldPosition = [x, y, z]

for object in [object_1, object_2, object_3]:
    move_object(object)

Learning the more complex features of the language will greatly develop the user’s comprehension of it, and open far more doors.
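Taking the same idea one step further, here is a minimal sketch in which the distance and axis are passed as arguments, so nothing about the movement is hard coded. StandInObject and its list-based worldPosition are hypothetical stand-ins so that the snippet runs outside the Game Engine; in a real script the objects would come from scene.objects.

```python
class StandInObject:
    """Hypothetical stand-in for a game object (not part of bge)."""

    def __init__(self, name, position):
        self.name = name
        self.worldPosition = list(position)


def move_object(obj, distance=1.0, axis=0):
    """Shift obj along one world axis (0 = x, 1 = y, 2 = z)."""
    position = list(obj.worldPosition)
    position[axis] += distance
    obj.worldPosition = position


cubes = [StandInObject("Cube", (0.0, 0.0, 0.0)),
         StandInObject("Cube.001", (2.0, 0.0, 0.0)),
         StandInObject("Cube.002", (4.0, 0.0, 0.0))]

for cube in cubes:
    move_object(cube)               # default: +1.0 on x
move_object(cubes[0], 0.5, axis=2)  # a different distance and axis

print([cube.worldPosition for cube in cubes])
```

Adding a fourth object now means adding one entry to the list, and changing how objects move means editing one function.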

Inconsistent Naming Convention
This has to be one of the most irritating and destructive aspects of coding, in any language. Naming conventions enable different programmers to look at someone’s work, and work with it, understand it, and offer improvements.
Naming conventions also enable the programmer to look back at their code at a later date, and understand how it works, and feel pride in the work and time they’ve spent.
Many surveys have shown that the majority of a programmer’s time (for which I can vouch) is spent on debugging code, and not writing it. This means that one should take as much care as one can when programming.
Let me first show you an example of poorly written code:


import bge.logic as gl

ow = gl.getCurrentController().owner
ow['X'] = 1
sc = gl.getCurrentScene()
sc.objects['Cube'].worldPosition.z += 0.1
sc = sc.objects['Cube']
sc.worldPosition.z-=0.1

The above code demonstrates two issues. Firstly, the variable names are too short. They don’t describe themselves aptly enough, especially to non-English speakers. Secondly, the variables are sometimes reassigned to something completely different (sc becomes an object instead of a scene).
A third issue is that expressions are chained inline rather than split up - several lookups and attribute accesses on one line, instead of named intermediate variables.

Here is some refactored code to a naming convention:


import bge.logic as logic

cont = logic.getCurrentController()
scene = logic.getCurrentScene()

own = cont.owner
cube = scene.objects['Cube']

own['X'] = 1
cube.worldPosition.z += 0.1
cube.worldPosition.z -= 0.1


Now, some people may argue that even cont is too short to be an accurate description of its function, especially to non-native English speakers. However, I have found that anything longer than cont (such as “controller”) is just a pain to write, and therefore “cont” is a compromise.

The first thing you ought to observe is that the second snippet is “cleaner” or nicer looking. I would like to offer an abstract opinion and argue that one’s opinion of one’s code shapes the quality and direction the code takes. If you don’t like how your code looks, you use it less effectively and give in more easily.

Naming conventions serve to solve this problem. Most Python users now follow PEP 8, which is just one naming convention. You don’t have to use it; you just need to pick a convention and stick to it.
For example, PEP 8 uses CamelCase (which it calls CapWords) for classes, and lowercase_with_underscores for functions.
Ever since I chose a naming convention, I enjoyed my code a lot more.
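To make the convention concrete, here is a small sketch that follows PEP 8 naming. The names themselves (MAX_HEALTH, HealthBar, apply_damage) are made up purely for illustration.

```python
MAX_HEALTH = 100  # constants: UPPER_CASE_WITH_UNDERSCORES


class HealthBar:  # classes: CapWords
    def __init__(self, current_health=MAX_HEALTH):
        # attributes and variables: lowercase_with_underscores
        self.current_health = current_health

    def apply_damage(self, damage_amount):
        # functions and methods: lowercase_with_underscores
        self.current_health = max(0, self.current_health - damage_amount)
        return self.current_health


player_health = HealthBar()
print(player_health.apply_damage(30))  # 70
```

Whichever convention you choose, the point is that a reader can tell a class from a function from a constant at a glance.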

Inefficient Function Calls
Most of your time spent in Python will be writing the structure for code. In most cases, you shouldn’t need to worry about optimisation because the code is so light weight that it doesn’t cost that much more, and is not noticed if it is wasteful of resources. However, there will be times when you need to optimise code that you’ve written. Therefore, you need to know what is efficient.
In Python, the less the interpreter has to do, the more efficient.
Therefore, the below tips are of great relevance:

  • If you use an attribute (or a part of a module) regularly, set a variable to point towards it, rather than continually call the attribute.
  • When allowing for conditions, be sure to use the most efficient polling to save resources.
  • Avoid heavy calculations that are unnecessary.

1) Attribute reference
Attributes can also refer to the objects within a module. Whenever you see an “X.Y” expression, that accesses an attribute.
If you regularly call an attribute of the same module, e.g. bge.logic.getCurrentController(), it is best to make a pointer for it, so that it is only called once!

cont = bge.logic.getCurrentController()

(one call, not three)

2) Conditions
Conditions are useful entities, as you can perform differing tasks depending upon programme input. However, they can be expensive if you always check for something that is rare, and can be catastrophic if an event occurs which isn’t expected.
If you wish to check for a rare condition, such as a user accidentally misspelling a dictionary key, it is better to expect the key to exist, but catch the exception and do something different if the lookup fails.
Here is an example:


user_dictionary = {
    "James": 15,
    "Alice": 16,
}

key = "James"

try:
    print(user_dictionary[key])
except KeyError:
    print("Oh gosh,", key, "doesn't exist!")


This is used rather than dictionary.get(), which is more expensive than expecting the key to exist.
However, if the key is likely to be missing frequently, or there is more than one condition to check, then using get would be a good solution, as it is more expensive to use an “if in” test and then return the value.
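For comparison, here are the three lookup styles mentioned above side by side. This is only a sketch with made-up names (user_ages and the three helper functions); which one is cheapest depends on how often the key is actually missing, so it is worth measuring in your own case.

```python
user_ages = {"James": 15, "Alice": 16}


def age_with_try(name):
    """Assume the key exists; handle the rare miss as an exception."""
    try:
        return user_ages[name]
    except KeyError:
        return None


def age_with_get(name):
    """Ask for the key and accept a default when it is missing."""
    return user_ages.get(name)


def age_with_in(name):
    """Test first, then look up (two dictionary operations on a hit)."""
    if name in user_ages:
        return user_ages[name]
    return None


for name in ("James", "Bob"):
    print(name, age_with_try(name), age_with_get(name), age_with_in(name))
```

All three give the same answers; they differ only in where the cost falls when the key is present or absent.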

3) Heavy Calculations
Python can be quite slow at certain tasks.
It is very important at the more advanced level to be aware of what is “expensive” in terms of resources, and what isn’t. In many cases, the fixes can be simple. Consider the following code:


def my_function(number):
    return number * 2

for number in range(0, 1000):
    result = my_function(number)
    print(result)

This calls a function multiple times, which has a rather large overhead. Instead, it is better to run the for loop inside the function:


def my_function(numbers):
    for number in numbers:
        print(number * 2)

numbers = range(0, 1000)
my_function(numbers)

You can also use the aforementioned pointer tip within a loop:


my_list = []
append_to_list = my_list.append

for i in range(0, 1000):
    append_to_list(i)
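Whether this kind of tip pays off is best checked by measurement rather than assumed. The following sketch uses the standard timeit module to time both variants; the function names and the repeat count are arbitrary choices, and the numbers will differ between machines and Python versions.

```python
import timeit


def append_direct(count=1000):
    my_list = []
    for i in range(count):
        my_list.append(i)            # attribute looked up on every pass
    return my_list


def append_aliased(count=1000):
    my_list = []
    append_to_list = my_list.append  # attribute looked up once
    for i in range(count):
        append_to_list(i)
    return my_list


direct_time = timeit.timeit(append_direct, number=200)
aliased_time = timeit.timeit(append_aliased, number=200)
print("direct: %.4fs  aliased: %.4fs" % (direct_time, aliased_time))
```

If the difference is negligible for your workload, prefer whichever version reads more clearly.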

Value relations
In Python, comparisons evaluate to a Boolean (True or False). It is a common misconception that a Boolean is an entirely separate kind of value. This is not the case: bool is a subclass of int, so True behaves as the integer 1 and False as the integer 0.
An example:


value = True

print(value - 1)
# >> 0

However, this aside, an if statement runs its block when the condition is true, and comparisons return True or False.
So, many beginners believe that they must always use a comparator.
An example:


value = True

if value == True:
    pass  # Execute code block

However, you do not need to compare against a Boolean value, because the value itself is already what Python tests to decide which branch to take (in the example, value == True simply evaluates to True again).
You can, in this instance, omit the ‘== True’.
An example:


value = True

if value:
    pass  # Execute code block

This is both more logical, and easier.
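A short sketch of what a bare “if value:” actually tests may help here. The describe function is made up for illustration; the rest is standard Python behaviour.

```python
print(isinstance(True, int))   # True: bool is a subclass of int
print(True + True)             # 2

hits = [True, False, True, True]
print(sum(hits))               # 3: counts the True values


def describe(value):
    """Report whether a value passes a bare 'if value:' test."""
    return "truthy" if value else "falsy"


for candidate in (0, 1, "", "text", [], [0], None):
    print(repr(candidate), describe(candidate))

# The two spellings differ for non-Boolean values:
print(describe(2), 2 == True)  # truthy False
```

So “if value:” also works for numbers, strings and lists, whereas “== True” only matches values equal to 1.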

Nesting etiquette
When writing any programme in Python, it is more than likely that you will want to have more than one condition within a code block. This is called ‘nesting’. Here is an example of nesting:


value_one = True
value_two = False

if value_one:
    if value_two:
        pass  # Execute one code block

    else:
        pass  # Execute other code block

However, you don’t have to nest statements all the time, unless you wish to catch the ‘else’ statement.
If you do not wish to catch the ‘else’ statement, then just use multiple conditions within one statement:


value_one = True
value_two = False

if value_one and value_two:
    pass  # Execute code block
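As a quick check that the two forms behave the same when there is no ‘else’ to catch, this sketch runs both over every combination of inputs. The function names nested and combined are made up for the example.

```python
def nested(value_one, value_two):
    if value_one:
        if value_two:
            return "both"
    return "not both"


def combined(value_one, value_two):
    # 'and' short-circuits: value_two is only tested if value_one is true,
    # which is exactly what the nested version does.
    if value_one and value_two:
        return "both"
    return "not both"


for value_one in (True, False):
    for value_two in (True, False):
        assert nested(value_one, value_two) == combined(value_one, value_two)
        print(value_one, value_two, combined(value_one, value_two))
```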

Very good post, thank you! I was reading this as you were posting and thought ‘…Yep, these are the mistakes you make Paul!’. Shame it cannot be pinned as it would be a handy reference for those beginning to learn Python.

Also read PEP8 and import this

Very good agoose!

Naturally this article can only scratch the surface of good programming. It should provide a good start.

I want to mention these small suggestions:
Better to use the term “reference” rather than “pointer”. A pointer is a special form of reference, but one that easily produces a lot of problems because of missing validation. Therefore it is better to make clear that Python does not use (simple) pointers but references.

Heavy calculations are heavy regardless of the programming language ;). They will remain heavy even with another language even if they might be a bit faster. It is important to identify potential bottlenecks (cost estimation) and deal with them accordingly (e.g. by choosing a different design as your example shows).

…I wonder if it’s bad that my class/function naming convention is the inverse of PEP#8. .~.

Regardless- great resource! This says things I’ve always wanted to say about scripting in the BGE.

My two cents:
Actually, python variables are reference-counted C pointers underneath. Reference is just Python-speak for that :smiley: Which is kinda odd, because reference means something completely different in C speak. Honestly, all you have to do to get a crowd of experienced programmers of ANY language utterly confused is ask “Are arguments passed by value or reference?” Particularly Java.

My more useful dollar:
Python is an interpreted language so you basically want to do as little as possible with it. One thing I see beginner scripts doing a lot is manually adding vector components together. That’s what Mathutils is for. It will always be faster than doing it in Python because the underlying operations are in C.

The very last code example is a bit iffy IMO; I doubt it gives much of a speedup (if it gives one at all) and I find it more readable to have the array name in there before .append(). NOTE: That’s opinion only. It still does the job. :I

Thanks, agoose, for sharing this! It’ll help the community a lot.

No, it is not. It is your naming convention. As long as you keep it consistent it is fine :slight_smile:


import bge

scene = bge.logic.getCurrentScene()

object_1 = scene.objects['Cube']
object_2 = scene.objects['Cube.001']
object_3 = scene.objects['Cube.002']

def move_object(object):
    x, y, z = object.worldPosition
    x += 1.0
    object.worldPosition = [x, y, z]

for object in [object_1, object_2, object_3]:
    move_object(object)

is not the maximum of simplicity and efficiency;
without definition is much better


import bge

scene = bge.logic.getCurrentScene()

object_1 = scene.objects['Cube']
object_2 = scene.objects['Cube.001']
object_3 = scene.objects['Cube.002']

for object in [object_1, object_2, object_3]:
    object.worldPosition[0] += 1

and without the loop it is better still (only one line longer)


import bge

scene = bge.logic.getCurrentScene()

object_1 = scene.objects['Cube']
object_2 = scene.objects['Cube.001']
object_3 = scene.objects['Cube.002']

object_1.worldPosition[0] +=1
object_2.worldPosition[0] +=1
object_3.worldPosition[0] +=1

in this case.

Marco, please do not confuse “max speed” with clear code.

As long as you do not need to count every single millisecond (as in high performance applications) it is better to write clear and readable code. If you really need that, you should switch to a low level language or at least use a compiler.

The above examples do not even perform exactly the same processing. The final result might be the same but the processing is different. Which means you can’t compare them.
If you replace the function and the loop from the first code snippet you would get this:


import bge

scene = bge.logic.getCurrentScene()

object_1 = scene.objects['Cube']
object_2 = scene.objects['Cube.001']
object_3 = scene.objects['Cube.002']

x, y, z = object_1.worldPosition
x += 1.0
object_1.worldPosition = [x, y, z]

x, y, z = object_2.worldPosition
x += 1.0
object_2.worldPosition = [x, y, z]

x, y, z = object_3.worldPosition
x += 1.0
object_3.worldPosition = [x, y, z]

The resulting performance win is not really noticeable. The code gets hard to read and hard to maintain.

The other way around:
Rewriting code snippets 2 and 3 with a function gives this:


import bge

scene = bge.logic.getCurrentScene()

object_1 = scene.objects['Cube']
object_2 = scene.objects['Cube.001']
object_3 = scene.objects['Cube.002']

def move_object(object):
    object.worldPosition[0] +=1

for object in [object_1, object_2, object_3]:
    move_object(object)

A one line function makes sense if

  • it is used at different places or
  • its name is a better description of what it means (especially if the code line is not easy to understand).

While your code snippet #3 indeed looks cleaner it is still inflexible. Imagine you need a fourth object. You have to redesign the whole code.
It is a simple task by copy and paste.
But this is error prone as

  • you can easily forget to change a name after copy
  • you replicate the same code again and again. If you need to change something at the code (e.g. worldPosition to localPosition) you have to change all occurrences which are not just one (as with loop or function) but there are four of them.

In some situations it can be worth replicating code (e.g. the two axes of the Mouse Cursor Position) but in most cases it is not.

In the issue of


x, y, z = object.worldPosition
x += 1.0
object.worldPosition = [x, y, z]

vs


object.worldPosition[0] +=1

the second is logically better in the circumstances. Size (one line), readability (single operation vs list unpack/repack), and execution speed (in-place operation = fast). That much is easy to argue.

“The final result might be the same but the processing is different. Which means you can’t compare them.”

Why wouldn’t you compare them? If people didn’t compare the performance of two algorithms to see which was the faster, we wouldn’t get anywhere.

“Imagine you need a fourth object. You have to redesign the whole code.”

This is a little bit biased, isn’t it? In a larger system I would agree, and particularly if the operation were more complex, but in terms of this snippet, no. There is no essential redesigning, just another two lines of code for the extra object.

“without definition is much better”

By which I assume Marco means, not packaging the operation up into a function is better. I can see some reasons why.

"A one line function makes sense if

  • it is used at different places or
  • its name is a better description of what it means (especially if the code line is not easy to understand)."

It isn’t used in different places, and it can be reduced to a single easy-to-understand statement. Those are the reasons why I would agree with not using a function.

Whether an operation is worth putting in a function is a big question and it is always dependent upon context, and also the programmer’s preference. Sometimes there is no logical choice, just personal opinion. In the case of this small snippet, it is Marco’s preference not to use a function, and I would agree with him on logical grounds. If the operation became the slightest bit more complex (say, another line) I would probably use a function.

Next issue, unpacking the loop. As there are only three elements I personally wouldn’t use a loop. This is because I find a loop to be messy-looking for such a small hardcoded list. If there were more than three (or if the objects being manipulated had to be changed dynamically) I would use a loop. Or, if I had to perform multiple operations on the same three objects, that is, operations that couldn’t or shouldn’t be packaged up into a single function operating on the object, then I would use a loop. That is where your arguments for code maintenance and the problems with copypasting are valid, because maintaining the copypasted operations would become harder in the long run than configuring a loop.

However, again, the point at which unrolling becomes too tedious is defined by a mixture of opinion and logic. Something one programmer finds easy to maintain another might find irritating. Fortunately this doesn’t happen very often, and it can usually be resolved by a bit of communication.

If I were coding matrices in Python, I wouldn’t unroll anything. But in C++, and because of my mild OCD… (and just to space out the conversation)


mat4 mat4::operator*(mat4 o)
{
    mat4 r;

    r[0][0] = v[0][0] * o[0][0] + v[0][1] * o[1][0] + v[0][2] * o[2][0] + v[0][3] * o[3][0];
    r[0][1] = v[0][0] * o[0][1] + v[0][1] * o[1][1] + v[0][2] * o[2][1] + v[0][3] * o[3][1];
    r[0][2] = v[0][0] * o[0][2] + v[0][1] * o[1][2] + v[0][2] * o[2][2] + v[0][3] * o[3][2];
    r[0][3] = v[0][0] * o[0][3] + v[0][1] * o[1][3] + v[0][2] * o[2][3] + v[0][3] * o[3][3];

    r[1][0] = v[1][0] * o[0][0] + v[1][1] * o[1][0] + v[1][2] * o[2][0] + v[1][3] * o[3][0];
    r[1][1] = v[1][0] * o[0][1] + v[1][1] * o[1][1] + v[1][2] * o[2][1] + v[1][3] * o[3][1];
    r[1][2] = v[1][0] * o[0][2] + v[1][1] * o[1][2] + v[1][2] * o[2][2] + v[1][3] * o[3][2];
    r[1][3] = v[1][0] * o[0][3] + v[1][1] * o[1][3] + v[1][2] * o[2][3] + v[1][3] * o[3][3];

    r[2][0] = v[2][0] * o[0][0] + v[2][1] * o[1][0] + v[2][2] * o[2][0] + v[2][3] * o[3][0];
    r[2][1] = v[2][0] * o[0][1] + v[2][1] * o[1][1] + v[2][2] * o[2][1] + v[2][3] * o[3][1];
    r[2][2] = v[2][0] * o[0][2] + v[2][1] * o[1][2] + v[2][2] * o[2][2] + v[2][3] * o[3][2];
    r[2][3] = v[2][0] * o[0][3] + v[2][1] * o[1][3] + v[2][2] * o[2][3] + v[2][3] * o[3][3];

    r[3][0] = v[3][0] * o[0][0] + v[3][1] * o[1][0] + v[3][2] * o[2][0] + v[3][3] * o[3][0];
    r[3][1] = v[3][0] * o[0][1] + v[3][1] * o[1][1] + v[3][2] * o[2][1] + v[3][3] * o[3][1];
    r[3][2] = v[3][0] * o[0][2] + v[3][1] * o[1][2] + v[3][2] * o[2][2] + v[3][3] * o[3][2];
    r[3][3] = v[3][0] * o[0][3] + v[3][1] * o[1][3] + v[3][2] * o[2][3] + v[3][3] * o[3][3];

    return r;
}

…yeah. Mind you, I keep a looped version handy as well, because I’m so stupidly indecisive.

@Monster:
I mean in that precise piece of code, not in general.
Speed is absolutely secondary.

All code must be simple and clear. If it is a bit longer, fine, as long as it is clear;
if it is short and clear, better; if it is short, clear and fast, better again.

Only for this reason I do not agree with the first solution.
It is not simple, but complex.

I’m really fast with CTRL+C CTRL+V, in fact :smiley:
so copying and duplicating some lines is very easy,
especially if everything lines up in columns.

Code has to change continuously to reach a perfect form :smiley:
If I see that a definition can be clearer, I will surely use a definition,
but in this case it does not seem to be the best solution.
Indeed, I know that agoose writes very well when he wants to; only the example is not well chosen.

:wink:

Because it is like comparing apples with oranges. Both are fruits, but they are different fruits, not different varieties of the same fruit.


x, y, z = object.worldPosition
x += 1.0
object.worldPosition = [x, y, z]

This code creates 3 variables and a list. The list is converted to a vector which replaces the worldPosition.

object.worldPosition[0] +=1

This code is assigning a new value to part of the existing vector of worldPosition.

You should know that this code did not even work as expected before 2.5, because worldPosition returned a copy rather than a reference. The first option works with 2.49 too.

I think this was more a badly chosen example than a style issue ;).

As Marco wrote, C&P is very easy and fast, but most of the time you will change existing code rather than write new code. Whether you replicate code or use a function is a matter of the author’s style. Luckily there is nothing that forces us to use one solution only.

BTW. Raiderium, your example is a good example of hard to read code. I mean it looks like a picture not like code. I know this is what we learn in school how matrix multiplication works. But you really have to know that. If there is a misspelled index it is hard to see.

This reminds me that I once saw code (in C) made up of names using the same letters, laid out to form an image. The program showed exactly this image rotating on the screen. But the code itself was nearly impossible to read (code obfuscation). So it is more of a fun code ;).