Authoritative Server Architecture Prototype

Comments welcome.

Multiplayer is something that has always interested me.
For those of you who understand at least a little about multiplayer, you will understand the advantages and limitations of a client-side game state. This is what my networking components/addon utilises. It allows anyone to create a multiplayer game, but it is limited in the sense that it can require complicated logic to support a dynamic game environment, such as that of a First Person Shooter. For this reason, the multiplayer system I have created in Blender is nearing completion - in the sense that adding more features would be redundant, because at that point you ought to code your own game logic.

So here shall lie my development log for an optimised server-side multiplayer system, for use in Necrosis, various other projects I am helping with, and Matter.

Authoritative Server Game Loop
An authoritative server-client game loop has to consider two factors.

  • Firstly, the server must only receive events from each client. Events describe keyboard and mouse inputs - the user’s interaction with the game (an illustrative event is sketched after this list). By utilising events, and not game-dependent data, the server can ensure an unmodified game state (within limitations), because it modifies the game state itself.
  • The client depends upon the server for the game state. However, if the connection latency is 100 ms, it will take 200 ms plus calculation time for the client to receive the newest game state after sending user events. This means that the client will naturally appear to lag. To circumvent this, there are a variety of lag compensation techniques.
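For illustration, an input event might look something like this - a hypothetical layout, not the project’s actual wire format:

```python
# A hypothetical input event: the client reports only its raw inputs,
# never game-dependent data such as its own position.

input_event = {
    "tick": 142,                   # client tick on which the inputs were sampled
    "keys": ["forward", "jump"],   # keyboard state this frame
    "mouse": (0.4, -0.1),          # mouse movement delta
}
# The server applies these inputs to its own game state; the client
# never tells the server "I am now at position X".
```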

The server will process the game state every frame. It is likely in my case that parts of the game state, such as physics, can simply be simulated within Blender, as otherwise they incur processing time in Python, which is unlikely to be efficient. As the server updates the game state, it will first check for user input. If there is any, it will update the events stored on the user’s game state.
The second game-state update will affect the entire game. This update would run every frame, and update the game including the changes incurred through user input. This would process event-driven updates as well as server-side AI. A sketch of the loop follows.
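A minimal sketch of that loop, with hypothetical names (in my case this would run inside Blender’s logic tick):

```python
def server_tick(game_state, connection):
    # First update: check for user input and store the events
    # on the relevant user's game state.
    for packet in connection.read_packets():
        user = game_state.users[packet.sender]
        user.pending_events.extend(packet.events)

    # Second update: affect the entire game - apply the stored
    # events, then run server-side AI.
    for user in game_state.users.values():
        game_state.apply_events(user, user.pending_events)
        user.pending_events.clear()
    game_state.update_ai()
    # Physics is left to the engine (Blender) rather than Python.
```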



Authoritative Server Responsibility Model
Within an authoritative server-client model, there are certain elements of the game that belong purely to the server (they are only calculated on the server), whilst some are only calculated on the client.

  • Server Side Data
    Typically, server-side calculation involves critical data, which must not differ between clients and which the server must protect from cheating by spoofing.
  • Client Side Data
    Client-side data can involve trivial or insignificant details, such as particles and bullet trails, which are not required to be exact for the game to continue - they don’t modify the game dynamic.


Lag Compensation Techniques
Due to the inherent nature of the authoritative server architecture, the client must be able to predict or continue the previous game state for a limited period into the future. This is due to the time difference (or latency) of the connection between Client -> Server -> Client.

  • Client-side Extrapolation
    As aforementioned, client-side prediction is key to creating the illusion of a low-latency connection.
    It allows the client to predict a future game state based upon previous game states - using extrapolation.
    Client-side extrapolation is useful as it hides the update period between a client sending data and receiving the new game state. It can only be used in a limited scope, because client inputs are unpredictable, so predictions are rarely accurate.
    Due to this inaccuracy of prediction, the game state on the server and on the client will likely differ. Therefore, to avoid jerky transitions between the predicted and server game states, the client will determine how close the server game state is to the client prediction and use its prediction within an error margin - but it will interpolate the client’s prediction towards the server’s game state, to avoid the two states becoming completely desynchronised.
    This is used for entities that aren’t controlled by the client - such as NPCs and other players. The extrapolation is usually linear: take the change in a value, divide by the time taken for the change to occur (giving a rate, or delta), multiply that delta by the time elapsed since the last packet, and add the result to the data of the last packet:
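A minimal sketch of that calculation (hypothetical names; positions are assumed to be simple coordinate lists, timestamps in seconds):

```python
def extrapolate(previous_packet, last_packet, current_time):
    """Predict a position at current_time from the last two known states."""
    t_prev, pos_prev = previous_packet
    t_last, pos_last = last_packet

    # Change per second between the two packets.
    rate = [(b - a) / (t_last - t_prev) for a, b in zip(pos_prev, pos_last)]

    # Project that rate forward by the time elapsed since the last packet.
    elapsed = current_time - t_last
    return [p + r * elapsed for p, r in zip(pos_last, rate)]
```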


  • Input Prediction
    As mentioned in Valve’s article, input prediction allows the client to simulate (predict) the effects of its inputs before the server confirms their legitimacy. The design of this model means that if a client were to press a key, for example a movement key, they wouldn’t see a response on screen until the server had received the input, processed it and replied. This could take a few hundred milliseconds, which to the player is unnerving and makes playing difficult. So, by predicting the outcome of a key press before it is confirmed, we can reduce the apparent lack of response.

Because we want to ensure that little damage is done to the game state if a client is cheating, we limit input prediction to the local player’s attributes - movement and potentially shooting.

Predicting a game state is relatively easy, though it requires moving away from a totally dumb client, and therefore away from a fully centralised logic approach. After informing the server of its input events, the client will predict the outcome of the input based upon its current game state. Typically, this should be acceptable if the client’s game state is relatively new. For example, a doorway is unlikely to move during the course of a game, so a client will most likely be able to move through it; however, an enemy with a weapon could shoot the client before it is aware of it, in which case the input events are likely to be rendered ‘invalid’.

When the client receives a game state, it will already be old due to the latency between the server and client. The input events will also be old relative to the server, but the server is able to process the inputs against the game state at the time of transmission - looking back into previous game states at the same timestamp as the input packet (assuming the server and client clocks are synchronised).

In order to determine whether a prediction should be corrected, the client will compare the outcome of the prediction with the server game state update. For this, one can use timestamps or input IDs - stamp the events sent to the server with an ID, which the server appends to the new game state packet. The client checks whether the predicted outcome for input X is equal to the server state for input X. This technique also allows stacking of input predictions - whereby two different inputs are sent to the server, and the client predicts upon each valid one, e.g.:
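A minimal sketch of the idea (hypothetical names; `player.apply`, `player.get_state` and `player.set_state` are assumed helpers):

```python
pending = {}        # input_id -> (events, predicted state after applying them)
next_input_id = 0

def send_input(events, connection, player):
    global next_input_id
    next_input_id += 1
    connection.send(("input", next_input_id, events))   # inform the server
    player.apply(events)                                # then predict locally
    pending[next_input_id] = (events, player.get_state())

def on_server_state(acked_id, server_state, player):
    # The server stamps each state packet with the last input ID it processed.
    _, predicted = pending.pop(acked_id, (None, None))
    for old_id in [i for i in pending if i < acked_id]:
        del pending[old_id]                             # confirmed or stale
    if predicted is not None and predicted != server_state:
        # The prediction was wrong: adopt the server state, then re-apply the
        # inputs still awaiting acknowledgement (stacked predictions).
        player.set_state(server_state)
        for input_id in sorted(pending):
            later_events, _ = pending[input_id]
            player.apply(later_events)
            pending[input_id] = (later_events, player.get_state())
```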

This is used for the local player.

  • Rewind Time
    This method, I believe, is used by Valve’s Source engine. Essentially, it moves players and other objects subject to hit detection back to where they were, according to the server game state, at the time the player issued the command (such as shoot). Because the client and server game states should be very similar, the server will position the players where they were at the time of the client’s event, and they should closely match what the client saw.
    However, using this method for the same example, shooting, client and server hitboxes won’t exactly match, because of small precision errors in time measurement. Even a small difference of a few milliseconds can cause an error of several inches (in game) for fast-moving objects.
    But it reduces the effect of latency: while a client’s packet is on its way through the network, the server continues to simulate the world, and the target might have moved to a different position, so a hit which the client saw on its screen might otherwise not register.
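A rough sketch of the server side of this technique (hypothetical names; `hit_test` is assumed to ray-test against a dictionary of rewound positions):

```python
import collections

HISTORY_LENGTH = 60                                 # one second of ticks at 60 Hz
history = collections.deque(maxlen=HISTORY_LENGTH)  # (tick, positions) pairs

def store_snapshot(tick, game_state):
    # Record entity positions every tick so past states can be queried.
    history.append((tick, {e.id: e.position for e in game_state.entities}))

def rewind_hit_test(shoot_tick, ray_origin, ray_direction, hit_test):
    if not history:
        return None
    # Find the stored snapshot closest to the tick the client fired on,
    # then run the hit test against those rewound positions.
    tick, positions = min(history, key=lambda item: abs(item[0] - shoot_tick))
    return hit_test(ray_origin, ray_direction, positions)
```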

Socket polling

  • recvfrom polls the socket buffer and returns only one packet per call.
  • With a frame rate of 60 frames per second, the logic ticks 60 times per second.
  • This means the socket is polled at most 60 times per second.
  • With 5 clients (assuming each client sends its packets one after the other), at most 60 / 5 = 12 packets per client can be processed each second.
  • However, the clients may all send on the same frame, so their packets are added to a buffer.
  • This means that on the next frame, an old packet is read - this causes delay, as new packets are temporarily ignored.
  • This gets worse as the number of clients increases, since more data has to be buffered.

To overcome this there are two solutions:
1) Set the logic tick rate to a higher figure, executing more logic ticks per second.
2) Poll the socket until a socket.error exception occurs, indicating an empty socket buffer.
Solution 2, polling the socket until a socket.error exception, is arguably more efficient than Solution 1, because it handles all received data during the frame in which it was received, rather than over successive ticks.
It will increase the apparent logic profile (a case of distributed handling against accurate processing), but this method will provide more accurate data with large numbers of clients.
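A minimal sketch of Solution 2 (the port number and handler are illustrative):

```python
import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("", 9999))       # illustrative port
sock.setblocking(False)     # recvfrom raises socket.error on an empty buffer

def poll_socket(handle_packet):
    """Drain every buffered packet this frame, instead of one per tick."""
    while True:
        try:
            data, address = sock.recvfrom(4096)
        except socket.error:
            break           # buffer empty; resume next logic tick
        handle_packet(data, address)
```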

Updating Game States
Authoritative multiplayer focuses on centralising the game state on the server. This means that there is no room for discrepancy between each client’s calculated game state, and thus gameplay should be consistent.
Clients have two possible types of game state they can receive and interpret:

  • Full Snapshot
  • Delta Snapshot

Full Snapshot
A full snapshot is self-explanatory: it sends the client the entire game state according to the server. These are sent less frequently than delta snapshots (see below) because they are large in terms of packet size, and they aren’t often needed - the client doesn’t need to know the entire game state, and the less it knows, the more secure it is against cheating.

Delta Snapshot
A delta snapshot is sent most often. It describes the changes that have occurred on the server since the last game state acknowledged by the client. This typically involves moved players, killed players and/or NPC positions, orientations and actions. Delta snapshots are useful only if they explicitly state which two game states they describe. The server would transmit the delta between the last game state acknowledged by the client and the current game state. If the server knows enough about the map, it could also perform a line-of-sight test and determine which users each client needs to know about.
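A minimal sketch of building such a delta (assuming, hypothetically, that a game state is a dict mapping entity id to an attribute dict):

```python
def make_delta(old_tick, old_state, new_tick, new_state):
    """Describe only what changed between the acknowledged state and now."""
    changed = {}
    for entity_id, attributes in new_state.items():
        old = old_state.get(entity_id, {})
        diff = {k: v for k, v in attributes.items() if old.get(k) != v}
        if diff:
            changed[entity_id] = diff
    removed = [eid for eid in old_state if eid not in new_state]
    # Explicitly state which two game states this delta describes.
    return {"from": old_tick, "to": new_tick,
            "changed": changed, "removed": removed}
```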
Hiding the lag

Clock Synchronisation
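In brief, the client measures the round-trip time to the server and assumes half of it is upstream latency, offsetting its local tick to match the server’s. A minimal sketch (hypothetical names; the reliability handling mentioned later is omitted):

```python
import time

def request_sync(connection):
    # Client asks the server for its current tick, noting the send time.
    connection.send(("sync_request", time.monotonic()))

def on_sync_reply(sent_time, server_tick, clock, tick_rate=60):
    # Assume half the measured round trip was spent travelling upstream.
    round_trip = time.monotonic() - sent_time
    upstream_ticks = (round_trip / 2.0) * tick_rate
    clock.offset = (server_tick + upstream_ticks) - clock.local_tick
```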

Smooth server correction

  • Take the server packets and store them (assume they’re world snapshots for now).
  • Once two packets are stored, use the delta between them to extrapolate the new position; until then, just use client input prediction. This is the extrapolation of the server state.
  • Use EPIC to push the client towards the current state. (It doesn’t visually snap the player; instead it interpolates toward the destination.)
  • Store the above extrapolated states, accessed by tick time (the server and client ticks are synced - see above).
  • When you receive a server update, apply the delta (between the client extrapolation and the update) to all the states that were simulated after the update, and delete the previous packets.
  • This means that when the future server game state is received, it “should” be close to the client’s game state.
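A minimal sketch of the correction step (hypothetical names; the blend factor is illustrative):

```python
stored_states = {}      # tick -> extrapolated position (client's simulation)

def on_server_update(update_tick, server_position, current_tick, player):
    predicted = stored_states.get(update_tick)
    if predicted is None:
        return
    # Difference between what the server says and what the client simulated.
    error = [s - p for s, p in zip(server_position, predicted)]
    for tick in list(stored_states):
        if tick <= update_tick:
            del stored_states[tick]                  # delete previous packets
        else:                                        # shift later simulations
            stored_states[tick] = [p + e for p, e
                                   in zip(stored_states[tick], error)]
    # Interpolate the visible player toward the corrected state (no snapping).
    corrected = stored_states.get(current_tick, server_position)
    blend = 0.1                                      # correction strength per frame
    player.position = [p + (c - p) * blend
                       for p, c in zip(player.position, corrected)]
```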

Input Batching

As I progressed through the design, it became apparent that one definitely could not send an input packet per frame. It was expensive, both in bandwidth and in processing time.
Now, just as the logic has a tick rate (60 per second), the network has a tick rate (which can never be higher than the logic rate) of 10 times per second.
This requires one to batch packets. I am grouping mine into batches of six: storing the inputs for six frames, then sending the group of six. The server works out the net effect of the inputs on the user’s avatar, then applies it at once.
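A minimal sketch of the batching (hypothetical names):

```python
FRAMES_PER_BATCH = 6        # 60 Hz logic / 10 Hz network
input_buffer = []

def logic_tick(sample_inputs, connection):
    input_buffer.append(sample_inputs())    # one input sample per frame
    if len(input_buffer) == FRAMES_PER_BATCH:
        connection.send(("inputs", list(input_buffer)))
        input_buffer.clear()
```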

RLE - Run Length Encoding

It was obvious to me, during the work on input packets, that a packet might be dropped. If this were the case, the server would lose the client’s information for ~6 ticks. This is 1/10th of a second, and could be greater if the network rate were reduced. In order to prepare the server to deal with dropped input packets, I send two input batches: the previous batch and the current batch. This means that the server can look at the previous batch in the next packet in case of a lost packet. It relies on Run Length Encoding, which I would have implemented anyway, given its effectiveness. It can reduce the packet size from 300 bytes to 80 bytes, which works out to something like 10 MB per hour. It does require some processing to convert between an RLE list and a normal input list, but the cost shouldn’t be too high.
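A minimal sketch of the RLE conversion (assuming a batch is a list of per-frame input tuples, which repeat while keys are held):

```python
def rle_encode(samples):
    """[a, a, a, b] -> [[3, a], [1, b]]"""
    encoded = []
    for sample in samples:
        if encoded and encoded[-1][1] == sample:
            encoded[-1][0] += 1         # extend the current run
        else:
            encoded.append([1, sample])
    return encoded

def rle_decode(encoded):
    """[[3, a], [1, b]] -> [a, a, a, b]"""
    return [sample for count, sample in encoded for _ in range(count)]
```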

My GameDev.net support thread.

Current Progress:

Documentation:
http://code.google.com/p/py-auth-server/w/list

Current Code (always updated) 30/09/2012 12:06 GMT:
http://code.google.com/p/py-auth-server/source/browse/#svn%2Ftrunk

Task List:

  • Connect and disconnect of clients ✓
  • Event-driven model ✓
  • Full snapshots to all clients ✓
  • Delta snapshots to all clients ✓
  • Clock synchronisation between server and clients ✓
  • Input batching between network ticks ✓
  • Client-side input prediction ✓
  • Client-side extrapolation of game state (EPIC) ✓
  • Client-side agent ahead of server ✓
  • Separated networking tools from game tools ✓

Thought processes:
Most of the work in the core has been hesitantly deemed working.
There are essentially two different functions performed by the networking layer - state recreation and RPC calls.
At present, there is a hybrid that accounts for both (or rather, the former by means of the latter).
We have the existence of “network events”. These events have identifiers that denote their type. The direction they travel down the network determines how the event payload is used. Each event instance has a receive and a send method, invoked for the respective direction.
These events are not reliable by default. There are simple ways to implement a sort of reliability layer, as used in the clock sync, so it is best to leave this up to the user.
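As a rough illustration of the shape of this (hypothetical class names; the real classes in the repository will differ):

```python
class NetworkEvent:
    id = None               # identifier denoting the event type

    def send(self):
        """Build this event's payload when travelling outbound."""
        raise NotImplementedError

    def receive(self, payload):
        """Consume the payload when the event arrives."""
        raise NotImplementedError

class PingEvent(NetworkEvent):
    id = 1

    def send(self):
        return b""          # no payload needed

    def receive(self, payload):
        print("ping received")
```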

The only aspect of the system that is suboptimal is the following:

Forward game_state storing (the client stores the prediction forward in time, to account for upstream latency - because the server cannot simulate in the past).

In the latest update, I’ve removed the RLE compression from inputs, until the system is working as it should be.
What’s new?

  • Added time-frame-dependent raycasts - these perform a state lookup when run, and simulate bounding boxes to determine if a collision occurred (this is part of the avoidance of physics rewinds, or rather a limited form of it)
  • Added single-depth instance tree (no need to lookup relationships with different class instances, all stored on module level).
  • Added clock polling - will return the same tick throughout same game frame.
  • Added customisable event attributes (using type-matching instead of name-matching)
  • Added event-manager to deal with parsing and compiling events to bytes.
  • Modified to use character physics and wrapper for more precise physics
  • Added accounting for inconsistent updates from client (due to framerate mismatch)
  • Added local prediction for UI
  • Added forward prediction on client (to enable prediction checking offset for to_server latency)
  • Added get_state and set_state methods to client (and get_state to server)
  • Added Exceptions module; IncompleteStateError, TickError
  • Added mutable-to-immutable conversion base class for class inheritance
  • Leveraged state bitmask more fully
  • Fixed bug with transmission rate set to True every tick
  • Fixed extrapolation once more (and a further time!)
  • Added frame tolerance to client-side correction
  • Separated extracting and reading input events (to read_inputs, run_inputs)
  • Modified user movement logic for smoothing and precision correction
  • Mirroring on User class from server and client (all methods nearly identical)
  • Abstracted UserBase on both server and client, renamed EntityBase
  • Added add_object method to GameState class (with position parameter in same index as other)
  • Added AddressBook class to handle connections. Support for Ghost users and entity access from connection
  • Added Entity system; removing dependencies. Similar concept to teams, but accessed by constants. -> Can read entities by type or id
  • Added on_update method to gamestate, called instead of force_update
  • Moved entity_object to EntityBase // EntityDerived class
  • Renamed UserManager to EntityManager, added generic methods
  • Modified TeamManager to use team instances like EntityManager
  • Fixed bug with shooting and reloading on server
  • Added constants module for network constants
  • Moved shared classes to generics module
  • Removed old import statements for unused modules
  • Moved global_frame_time and local_frame_time from defined to automatic property using decorator (slightly more overhead)
  • Added projected_tick attribute to client clock class - client runs ahead of server
  • Fixed a bigger flaw in shooting - cannot shoot whilst reloading. Fixed discrepancy between server and client
  • Removed .active attribute, instead using states
  • Moved all entity and team structure code into entity_manager and team_manager (apart from the inclusive dependencies in the serialiser)
  • Removed data class from entity (useless, moved into EntityBase)
  • Fixed RayCasting on server and client to return correctly hit entity

I’m intending to remove the dependency on the same entity packet type in the next update.


How is this going? What’s the licence? Where can I clone it from? Have you checked the Qt4 framework for networking?
It has some of the things you describe already done; it’s used by Google Earth, for example, which seems to work.

It is going well. I had hoped you’d seen my other reply in the other thread. There is no official licence yet - I have little concern about people intending to take my work - I am releasing it with the intention for it to be used and improved. I haven’t tried that framework, but I am happy with what I have already accomplished! :)