In developing a networked game, sometimes you need to be able to test features running over specific network conditions. How does the game hold up under high latency and/or packet loss? What about in cases of varying network jitter, where the latency is ever-changing? Answering these questions is crucial to testing network code and making sure that it will perform well against whatever chaos the Internet might throw at you.
As a programmer relatively new to networking, my first thought was to build a network simulation that could be directly integrated into our budding engine. For example, a quick perusal of Valve’s developer wiki shows built-in net_fakelag, net_fakejitter, and net_fakeloss variables for controlling artificial lag, jitter, and packet loss, respectively. Being able to configure these parameters within the engine certainly has its advantages, but rolling your own simulation probably requires a good bit of time and effort, and nobody likes reinventing wheels.
For anyone looking to get something up and running quickly, a better option is to use an external tool to alter traffic to and from your game without modifying the code. I’ve found Dummynet to be pretty useful for my own purposes, and I’ll be discussing some examples of how it might be used. It comes preinstalled on FreeBSD and Mac OS X, and is available for download on other systems.
Dummynet is controlled mostly through ipfw, a command-line utility for firewall configuration. The ipfw tool itself is very complex, and I do not claim any real degree of mastery over it, but a few simple commands and a bit of scripting are all we really need for a decent simulation. Be warned, however, that messing with the firewall can lead to an accidental lack of Internet, so you might want to make sure you have the documentation comfortably at hand before doing anything hasty.
The ipfw tool processes packets according to a configurable list of numbered rules called a ruleset. We can start by viewing the current ruleset using the list command.
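On a default configuration, this might look something like the following (sudo is needed for firewall access); the single rule shown matches the description below:

```shell
$ sudo ipfw list
65535 allow ip from any to any
```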
Note that your output may differ based on your firewall configuration. We can see that by default the ruleset contains a mandatory rule numbered 65535, which in this case allows packets matching any IP protocol from any source address to any destination. Packets are processed by comparing them to each rule in ascending order by number and executing the action of the first rule for which there is a match.
We can control traffic by adding rules that match specific packet data, and then send the packets through a Dummynet pipe with added delay or packet loss. For example, we could add a 100 ms delay to all traffic heading to xkcd.com like so:
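Based on the description that follows, the commands would be along these lines (note that the hostname is resolved to an address when the rule is added):

```shell
# Configure Dummynet pipe 1 with a 100 ms delay, creating it if needed
sudo ipfw pipe 1 config delay 100ms

# Rule 100: route packets destined for xkcd.com through pipe 1
sudo ipfw add 100 pipe 1 ip from any to xkcd.com
```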
The first command here configures a Dummynet pipe with id 1, creating the pipe if it does not already exist. The second command creates a rule numbered 100, which matches packets with the desired destination address and routes them through the pipe. Note that any rule that attempts to pass packets through a non-existent pipe will block traffic and throw an error, so do not forget to configure pipes before using them. Using list again allows us to view the updated ruleset.
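The new rule should appear above the default one. Since ipfw stores resolved addresses rather than hostnames, the destination shows up as an IP (elided here):

```shell
$ sudo ipfw list
00100 pipe 1 ip from any to <resolved address of xkcd.com>
65535 allow ip from any to any
```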
Pinging xkcd.com should show a round-trip time of 100 ms plus whatever the actual current latency is across the connection.
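A quick sanity check; the exact times will of course depend on your connection:

```shell
# Each round trip should take roughly 100 ms longer than usual
ping -c 4 xkcd.com
```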
In most cases, we would want to simulate a full duplex connection by adding a corresponding rule for packets coming from the remote host, so that the delay is applied in both directions. When we are done playing with our new rule, we can delete it and the pipe as follows.
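A sketch of both steps; the reverse rule's number (110) is my own choice:

```shell
# Full duplex: also delay packets coming back from xkcd.com
sudo ipfw add 110 pipe 1 ip from xkcd.com to any

# Cleanup: delete the rules first, then the pipe itself
sudo ipfw delete 100
sudo ipfw delete 110
sudo ipfw pipe 1 delete
```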
Simulating on the Same Machine
One nice thing about using Dummynet is that we can actually simulate conditions on connections between processes running on the same machine. For example, we could create a rule matching packets traveling over the loopback interface like so:
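For example, reusing pipe 1 with a 100 ms delay (values assumed), we can match anything traveling via the loopback interface lo0:

```shell
sudo ipfw pipe 1 config delay 100ms
sudo ipfw add 100 pipe 1 ip from any to any via lo0
```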
As stated in the documentation, this has its share of caveats. Pinging our localhost will show the most obvious problem. Because the ruleset is applied to both incoming and outgoing packets, using the loopback causes the packets to be processed four times, resulting in a round-trip latency of approximately four times the delay value. A simple fix is to instead use two separate rules that will match packets for each direction. The following example adds latency to only the incoming packets.
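A sketch of the incoming-only rule; the `in` keyword restricts the match to packets arriving on lo0, so each leg of a round trip traverses the pipe once rather than twice:

```shell
# Delay only packets arriving on the loopback interface
sudo ipfw add 100 pipe 1 ip from any to any in via lo0
```

A corresponding rule with `out` (ideally through a second pipe) could be used if asymmetric delays are needed.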
The result is a round-trip latency of twice the delay, which is what we expect in a full duplex connection. Note that this solution might not overcome other issues caused by the repeat processing of packets.
Adding a constant, consistent latency to our simulated connection is useful in some cases, but most real-world connections are certainly not so well behaved. Unfortunately, Dummynet does not seem to provide a way to create a pipe with varying latency. Instead, we can do a little work to set up a program that generates delay by a method of our choosing and reconfigures the pipe over time. As a toy example, I threw together the following bash script, which changes the pipe delay on a set interval, each time choosing a random value between 25 and 50 milliseconds. The script also pings the localhost and logs the output to a file.
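A rough reconstruction of such a script; the variable names, interval, and log file name are my own choices, it must run as root, and it assumes pipe 1 and a loopback rule have already been configured as described above:

```shell
#!/bin/bash
# Reconfigure pipe 1 with a random delay between 25 and 50 ms on a
# fixed interval, while pinging localhost and logging to a file.

INTERVAL=0.1       # seconds between reconfigurations (assumed value)
LOGFILE=ping.log   # hypothetical output file

# Start pinging in the background; stop it when the script exits
ping 127.0.0.1 > "$LOGFILE" &
trap "kill $!" EXIT

while true; do
    delay=$((RANDOM % 26 + 25))    # uniform integer in [25, 50]
    ipfw pipe 1 config delay "${delay}ms"
    sleep "$INTERVAL"
done
```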
Not surprisingly, I was able to find an existing research publication by Grenville Armitage and Lawrence Stewart that investigates this exact method and makes some interesting observations. After running trials under similar conditions, I found that my own results closely matched theirs.
Data and Conclusions
The above graph shows the number of pings recorded for each round-trip latency value for two different trials. In the first trial, the delay was reconfigured every 100 ms, while in the second it was reconfigured every 500 ms. From the data, one can see that the distribution of delay is considerably less uniform for the 100 ms interval, which has a higher concentration around the mean ping time of 75 ms.
What causes the distribution to be skewed? Consider a packet being sent across a connection from A to B and back to A. At constant latency, each leg of the trip incurs a 100 ms delay, for a total round-trip time of 200 ms. Now suppose instead that after the packet has reached B, the delay increases to 200 ms, or 400 ms round-trip. Then by the time this packet returns to A, its total travel time is 100 ms + 200 ms = 300 ms, which is the average of the old and new round-trip times.
As Armitage and Stewart point out, changing the pipe delay causes any packets currently being processed by the pipe to undergo a similar effect. If the pipe is reconfigured at a higher frequency, then naturally a larger percentage of the total packets will be affected, skewing the data more severely.
Whether the non-uniform distribution is a good approximation of an actual network connection is another issue entirely. Logically, one would expect any real connection whose latency changes this frequently to exhibit the same effect, with mid-flight packets experiencing different inbound and outbound latencies. More importantly, would a truly uniform distribution serve as a more accurate approximation? What kinds of distributions are actually the most useful for testing? For me at least, this is a topic for further investigation, and perhaps, for another post.