Designing Multiplayer Flight for Librelancer

One of the things that has kept the community around Freelancer going for as long as it has is the ability for players to explore the open world of Sirius with their friends in multiplayer. Replicating this functionality is crucial to having a faithful recreation of the engine.

The original multiplayer code for Freelancer works in a way that can be called client-authoritative. This means that the client (the game running on your PC) tells the server what has happened, and the server accepts these changes as is. While this avoids any latency issues for the player making those changes, it leaves the door wide open for cheating, as well as bugs arising from the server taking information from several sources of truth. These can range from jitter visible to other players when one player’s internet is spotty and dropping updates, to more hilarious (or frustrating) moments of killing everyone in a tradelane by pressing Alt-Tab and making your ship glitch out.

So that one misbehaving client can’t wreck the fun for others, the next tagged release of Librelancer will use a server-authoritative model. Here the client sends just the inputs from the player (steering, throttle, cruise, thrust, etc.) and the server runs a simulation of its own to decide where the player actually is. It then sends this information back to the client, which updates its view to match. For a more in-depth explanation of some of these concepts, I recommend reading the article on State Synchronisation by Glenn Fiedler at Gaffer On Games.

When the client receives simulated state back from the server, it has to make sure that what the player sees matches the server’s view of the world. It does this by measuring the difference, resimulating if the states are too far apart, and then smoothing out the remaining difference across a few frames. As physics engines are not entirely deterministic, these resimulations are almost guaranteed to happen a few times in any session.

Determining Error

To decide when the client needs to synchronise with the server again, Librelancer takes two measures of error: the position error (in metres) and the orientation error. The position error is simply the distance between the point the client simulated and the point the server simulated. The orientation error is 1 minus the dot product of the two quaternions. When the quaternions are the same the error is 0; when they’re opposite rotations the error grows to 1.

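// Orientation error between two unit quaternions: 0 when they match, larger as they diverge.
public static float QuatError(Quaternion a, Quaternion b)
{
    // q and -q encode the same rotation, so move both onto the W >= 0 side before comparing.
    if (a.W < 0) a = -a;
    if (b.W < 0) b = -b;
    var errorQuat = 1 - Quaternion.Dot(a, b);
    // Rounding can leave the result at or slightly below zero; report that as exactly 0.
    return errorQuat < float.Epsilon ? 0 : errorQuat;
}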

Note that here we negate a or b if its W component is negative. Negating every component of a quaternion still represents the same rotation, but the signs have to agree for the dot product to give a value we can use.
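To get a feel for what an orientation error value means, it can help to convert it to an angle. For unit quaternions whose dot product is non-negative, the dot product equals cos(θ/2), where θ is the angle of the rotation separating the two orientations, so the error is 1 − cos(θ/2). The sketch below converts in both directions. The class and method names are made up for illustration and are not part of Librelancer.

```csharp
using System;

// Hypothetical helper, not Librelancer code.
// Assumes unit quaternions with a non-negative dot product,
// in which case error = 1 - cos(angle / 2).
public static class QuatErrorAngle
{
    // Error value produced by two orientations that are angleDegrees apart.
    public static double ErrorFromAngle(double angleDegrees)
    {
        double halfAngleRadians = angleDegrees * Math.PI / 180.0 / 2.0;
        return 1.0 - Math.Cos(halfAngleRadians);
    }

    // Angle in degrees between two orientations that produce the given error.
    public static double AngleFromError(double error)
    {
        // Keep the input inside the range this relation is valid for.
        double clamped = Math.Clamp(error, 0.0, 1.0);
        return 2.0 * Math.Acos(1.0 - clamped) * 180.0 / Math.PI;
    }
}
```

Under these assumptions, `QuatErrorAngle.AngleFromError(1.0)` gives 180 degrees, which matches the "opposite rotations" case above, and an error of 0.1 works out to a little under 52 degrees.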

Librelancer will then perform a resimulation if the position is out by more than 0.1 metres, or the measured quaternion error is bigger than 0.1. For other games where the player moves faster or slower these values will most likely need to be different to avoid obvious popping or glaring differences between what the player and the server see, but these values seem to keep things smooth on both sides.
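Putting the two measures together, the decision to resimulate might look something like the sketch below. It uses the System.Numerics types, repeats QuatError so that it compiles on its own, and treats both thresholds as strict "greater than" comparisons, which is an assumption. The class, constant and method names are invented for this example and are not taken from the Librelancer source.

```csharp
using System;
using System.Numerics;

// Hypothetical sketch, not Librelancer code.
public static class CorrectionCheck
{
    // Threshold values quoted in the text; the constant names are made up.
    public const float MaxPositionError = 0.1f;    // metres
    public const float MaxOrientationError = 0.1f; // 1 - dot product

    // Same measure as the QuatError function shown earlier.
    public static float QuatError(Quaternion a, Quaternion b)
    {
        if (a.W < 0) a = -a;
        if (b.W < 0) b = -b;
        var errorQuat = 1 - Quaternion.Dot(a, b);
        return errorQuat < float.Epsilon ? 0 : errorQuat;
    }

    // True when the client's predicted state is too far from the server's.
    public static bool NeedsResimulation(
        Vector3 clientPosition, Quaternion clientOrientation,
        Vector3 serverPosition, Quaternion serverOrientation)
    {
        float positionError = Vector3.Distance(clientPosition, serverPosition);
        float orientationError = QuatError(clientOrientation, serverOrientation);
        return positionError > MaxPositionError
            || orientationError > MaxOrientationError;
    }
}
```

With these values, a ship that is 5 cm out of place with a matching orientation is left alone, while one that is 20 cm out, or rotated a quarter turn away from the server's orientation, triggers a correction.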

Resimulating

When Librelancer sends its input to the server, it assigns that collection of inputs a number representing the time it was taken. It then stores these inputs in a buffer big enough to hold the past few seconds. By the time the server sends the results back, enough time will have passed for the position it reports to be in the past, held somewhere in this buffer. When the difference is determined to be too big, we set the player’s position to the one the server has sent. We then go through the buffer, replaying all the inputs captured since then, to get a new simulated position that is closer to what the server will see in a few ticks’ time. We also set the positions of NPCs and other players at the same time, so the local player will see them a little bit in the past. As we are running a full simulation on the client, however, it’s important to start from a state with everything set up so we can attempt to simulate collisions. When one of these resimulations happens, you can find a line like this in the Librelancer log:

[Info] Client: Applying correction at tick 887. Errors (0.1094342,0)

The two numbers in the brackets are the position error and the orientation error. Once these corrections are finished, the client smooths out the difference over a few frames, so in most cases the player is unlikely to notice the simulations ever diverged!
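The rewind-and-replay step described in this section can be sketched roughly as follows. This is a toy model, not the Librelancer implementation: the "physics" is just position plus velocity times the tick length, the input carries a single velocity vector in place of steering, throttle and so on, and every type and member name is invented for the example. It also skips the error check, the other objects in the world, tick counter wraparound and the smoothing pass.

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

// Hypothetical input record: the tick it was sampled on, plus a
// velocity standing in for the real steering/throttle inputs.
public readonly struct TickInput
{
    public readonly uint Tick;
    public readonly Vector3 Velocity;

    public TickInput(uint tick, Vector3 velocity)
    {
        Tick = tick;
        Velocity = velocity;
    }
}

// Hypothetical client-side prediction buffer.
public class PredictionBuffer
{
    private readonly Queue<TickInput> inputs = new Queue<TickInput>();
    private readonly int capacity;
    private readonly float tickSeconds;

    public Vector3 Position { get; private set; }

    public PredictionBuffer(int capacity, float tickSeconds)
    {
        this.capacity = capacity;
        this.tickSeconds = tickSeconds;
    }

    // Toy stand-in for a full physics step.
    private static Vector3 Step(Vector3 position, TickInput input, float dt)
    {
        return position + input.Velocity * dt;
    }

    // Called once per client tick: predict locally and remember the input.
    public void ApplyLocalInput(TickInput input)
    {
        Position = Step(Position, input, tickSeconds);
        inputs.Enqueue(input);
        // Only keep the most recent inputs.
        while (inputs.Count > capacity) inputs.Dequeue();
    }

    // Called when a correction is needed: serverPosition is the server's
    // result after it processed the input for serverTick.
    public void Correct(uint serverTick, Vector3 serverPosition)
    {
        // Inputs up to serverTick are already reflected in the server state.
        while (inputs.Count > 0 && inputs.Peek().Tick <= serverTick)
            inputs.Dequeue();

        // Rewind to the server state, then replay everything newer.
        Vector3 position = serverPosition;
        foreach (TickInput input in inputs)
            position = Step(position, input, tickSeconds);
        Position = position;
    }
}
```

For example, with one-second ticks and four inputs of one metre per second along X, the client predicts an X position of 4. If the server then reports an X position of 1.5 for the second tick, `Correct` replays the third and fourth inputs on top of that and ends up at 3.5.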

To make sure this was working properly, I then decided to bring latency into the equation. All this work was done on just my local computer, which is hardly representative of the conditions of anyone’s internet connection.

Testing for Latency

While the transport library Librelancer uses has options to simulate behaviour with latency and packet loss, I decided to use Linux’s traffic control facilities with the tc command in order to keep the latency out of the control of the running code. The particular command I used is as follows:

tc qdisc add dev lo root netem delay 100ms 20ms distribution normal loss 0.3% 25% duplicate 1% corrupt 0.1% reorder 25% 50%

(Note: root permissions are required for these commands.)

This adds 100ms +/- 20ms of latency to the loopback interface (affecting 127.0.0.1, or localhost). It also adds a fairly high amount of loss, corruption and reordering of packets. As LiteNetLib is configured in Librelancer to silently drop re-ordered packets, this resulted in a simulated packet loss of between 25% and 50%. Under these conditions the client stuttered noticeably while resyncing to server state, as the server could not recover all client inputs. However, as a worst-case scenario this was acceptable to me: it leaves the game playable enough to fly yourself back to a base, log out and diagnose your internet connection. To remove these conditions from my localhost I then ran:

tc qdisc del dev lo root

The main downside of testing with this kind of latency simulation is that a lot of programs now use the local loopback interface to communicate with themselves across process boundaries. Running these commands slowed Rider down to a crawl for me, and limited my use of the machine to just testing until I removed the simulated latency.

Doing this uncovered a lot of bugs in the first implementation, some subtle and some not so subtle. More of these will surely be uncovered by testers in the near future. Needless to say, this code is not optimal for performance yet either, as re-doing the player physics is a fairly expensive operation.

However, we now have:

The beginnings of proper multiplayer in Librelancer!

As always, keep an eye on the GitHub page for updates on how the project is going as we work towards beta.

~ Callum