My network bandwidth optimization logic, explained for anyone creating an open-world game

My current project is an open-world space MMO. As you can imagine, there are tens of thousands of networked moving objects in any scene at any time. To let thousands of players interact seamlessly in space with low bandwidth requirements and without visual latency or jittery movement, I had some interesting network optimization problems to solve. This post outlines how I solved them, in the hope that it will help someone who faces similar challenges in the future.

I will apologize in advance for the length of this post.

The Basic Networking Requirements:

  1. Physics and movement must be done on the server.
  2. Even though there are tens of thousands of constantly moving objects in any one space scene, bandwidth should be kept to a minimum and scale well.
  3. All AI must run on the server (player AI and NPC AI).
  4. Server must be scalable.
  5. Player AI must continue when players log off.

Description of solution:

The server back end consists of a single master / multiple slave environment. Each slave server connects to the master server. All clients also connect to the master server. The master server takes care of the following tasks:

  1. Keep track of the translation and rotation of every spatial in the scene.
  2. Forward world translation and rotation information for all modified spatials to all connected clients (this includes slave servers).
  3. Receive and process movement and game commands from all connected clients.
  4. Save the world state upon shutdown and reload it on startup of the server.
  5. Forward non-movement commands to the slave servers for processing.

The slave servers are responsible for physics and AI execution for the spatials assigned to them by the master server. In effect, the slave servers play the role that an authoritative client typically plays in a client/server environment where the clients are trusted to run their own physics. Slave server ticks are synchronized with the master server, and each slave server’s scene graph is synced with the master server’s scene graph every tick. Because the user’s player AI runs on one of the slave servers, it continues to run even after the user logs off.

Now for the bandwidth optimization. My game is a true scale space game, so to deal with z-fighting the client has 4 overlapping viewports, each set to a different near/far frustum range. When the client connects to the master server, it authenticates and then sends the near and far frustum values for each of its viewports. The server saves this information and uses it to determine the dynamic refresh interval it will use for each item in the scene graph. The client also reports its camera position to the server 30 times a second. The server is configured with a max and min refresh rate. That is to say, it will send location and rotation changes to the clients at a rate within that pre-configured range. The actual location and rotation update tick rate for each client is dynamically adjusted for each spatial in the scene. How does this “magic” work? I will explain the sequence below.

For the sake of simplicity, let’s assume that the server’s configured max update tick rate is 30 (30 times a second) and its minimum update tick rate is 1 (once a second). Let’s also assume that the connected client has 4 viewports defined (basically 4 different near/far frustum ranges).

The server creates a delta snapshot of the world scene graph 30 times a second (delta because it only includes spatials that were either moved or rotated between this tick and the previous one). This snapshot is sent to a queue.
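To make the delta idea concrete, here is a minimal sketch in Python. It is not my server code; `Transform` and `delta_snapshot` are made-up names for illustration, and it assumes each spatial has an id that maps to its current translation and rotation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transform:
    position: tuple  # (x, y, z) world translation
    rotation: tuple  # (x, y, z, w) quaternion


def delta_snapshot(previous, current):
    """Return only the spatials whose transform changed since the last tick.

    Both arguments map a spatial id to its Transform. A spatial that is
    new in `current` counts as changed.
    """
    return {
        spatial_id: transform
        for spatial_id, transform in current.items()
        if previous.get(spatial_id) != transform
    }
```

Spatials that were removed between ticks are not covered by this sketch; they would need their own message.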

A central thread reads off of the snapshot queue and forwards a snapshot to each client processing thread.

A client processing thread reads the next snapshot from its queue and iterates through each spatial.

For each spatial, the thread calculates the spatial’s distance from the client camera (remember, the client sends its camera position to the server 30 times a second). Based on that distance, the thread knows which client viewport the spatial will be displayed in. The thread has already created a “send” queue for each viewport. All it has to do is forward the spatial’s coordinates to the appropriate send queue, based on which viewport the spatial has been assigned to. The final network client thread’s job is very easy …
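A rough sketch of the viewport lookup, in Python. The function name `viewport_index` and the `far_planes` list are hypothetical, and because the viewports overlap I am assuming here that a spatial goes to the nearest viewport whose far distance still reaches it.

```python
import math
from bisect import bisect_left


def viewport_index(camera_pos, spatial_pos, far_planes):
    """Return the index of the nearest viewport whose far plane reaches the spatial.

    far_planes holds each viewport's far distance, sorted ascending
    (nearest viewport first). Returns None when the spatial lies beyond
    the farthest viewport.
    """
    distance = math.dist(camera_pos, spatial_pos)
    # First index whose far distance is >= the spatial's distance.
    index = bisect_left(far_planes, distance)
    return index if index < len(far_planes) else None
```

The returned index can then be used to pick the send queue, for example `send_queues[index].append(update)`, skipping spatials for which the result is `None`.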

Remember the 4 viewports that the client told the server it has? … and the server’s pre-configured min and max update tick rate? … well, since in this example the client has 4 viewports ranging from closest to farthest, and since we know that closer objects need faster refreshes than faraway objects, the network client thread can now process each viewport’s queue and determine whether it should send the update over the wire or not.

How does it do this magic? … well,

Let’s take a look at processing the nearest viewport queue. We know that the nearest viewport needs the highest speed. So for this example, every entry in the queue will be sent over the wire.

Now let’s take a look at the other extreme, the farthest viewport. In this case, we want to transmit at the slowest rate. Since we configured the server with a minimum update tick rate of 1 per second, and we know that the queue is being filled 30 times a second (the max update tick rate), we somehow need to make sure that we only send updates over the wire once a second for each spatial. We do this by keeping an in-memory hashtable of last-sent time values for each spatial.

The logic is as follows…

Pull the next item off of the viewport queue.
Pull the last update time value from the hashtable for this spatial id.
Compare the queued item’s time with the hashtable time.
Has a second passed?
If so, send the item over the wire and update the hashtable’s time value with the queued item’s time value.
If not, discard the queued item and process the next one.
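The steps above can be sketched in a few lines of Python. `ViewportThrottle` and `should_send` are illustrative names, not my actual classes, and sending a spatial immediately the first time it is seen is an assumption of this sketch.

```python
class ViewportThrottle:
    """Drops queued updates that arrive too soon after the last one sent."""

    def __init__(self, min_interval):
        self.min_interval = min_interval  # seconds between sends per spatial
        self.last_sent = {}               # spatial id -> time of last send

    def should_send(self, spatial_id, item_time):
        """Return True if the queued item should go over the wire."""
        last = self.last_sent.get(spatial_id)
        # Assumption: a spatial with no recorded send goes out right away.
        if last is None or item_time - last >= self.min_interval:
            self.last_sent[spatial_id] = item_time
            return True
        return False  # caller discards the item and moves to the next one
```

With the numbers from this post, the farthest viewport would use `ViewportThrottle(1.0)` and the nearest one `ViewportThrottle(0.0)`, which lets every entry through. The intervals for the two middle viewports fall somewhere in between; how to pick them is not shown here.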

As a final step we also implement interpolation on the client for smooth movement even at the farthest viewport.
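For anyone who has not done client interpolation before, here is a minimal Python sketch for positions. It assumes plain linear interpolation between the two most recent updates, which is the simplest option and not necessarily what my client does; `lerp_position` is a made-up name, and rotations would need a quaternion slerp instead.

```python
def lerp_position(prev_pos, prev_time, next_pos, next_time, render_time):
    """Linearly interpolate between two received positions at render_time."""
    span = next_time - prev_time
    if span <= 0:
        return next_pos  # no usable time gap, snap to the newest update
    # Fraction of the way from the older update to the newer one,
    # clamped so we never overshoot either end.
    t = (render_time - prev_time) / span
    t = max(0.0, min(1.0, t))
    return tuple(a + (b - a) * t for a, b in zip(prev_pos, next_pos))
```

Rendering slightly in the past (by roughly one update interval) means there are always two updates to interpolate between, even for spatials that refresh only once a second.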

I hope this post did not put everyone to sleep, and that it actually shed some light on how someone can create a scalable network layer that self-optimizes its bandwidth usage without impeding the client’s game experience.