Reposting the content here so that you don’t have to bother reading the thread elsewhere…
I was able to test full networked physics in a loop-back mode. All of the major components are now in place and it's a matter of cleaning up and fleshing out the additional features a bit. Overall, if the protocol doesn't have bugs it's a much tighter networking scheme than I'd even originally imagined.
The next step is to add some fake “connection quality” settings and a simple UI to control them. This will allow me to test the protocol under conditions like higher latency, 50% packet loss, etc… and vary these settings while the app runs. That will be the final test of the protocol. I know it works right when no messages are lost so now I need to test it with lost messages. It’s designed to handle this but the proof is in the pudding, as they say.
To give a sense of what all of this means, I’ll try to walk through what happens when the physics engine moves an object.
1) Physics engine loops through all objects and adjusts velocity, rotation, position, etc. based on the current state of the objects. New objects that are contacted are “woken up” and objects that have been still for some period of time are “put to sleep”. The physics engine only considers objects that are awake.
2) The physics engine then notifies the zone manager about all of the objects that have moved or been removed (put to sleep).
3) The zone manager figures out which zones the objects are overlapping and then delivers updates to those zones.
4) Zones collect these updates into a buffer and at the end of a “frame” move them into a buffered history that holds some number of iterations. The physics engine operates at a high rate, like 60 frames per second, but state is only sent 20 times a second or so. That’s why these two things (the state generation and the state collection) are decoupled.
5) A separate “collector” thread goes through all of the zones and empties their history… which generally includes three frames of “movement”. This history is then delivered to the zone listeners.
6) Zone listeners in this case represent the player. Each player may be watching a different set of zones, usually the zone they are in and the immediately surrounding zones (9 zones total). The collector thread delivers the 3-frame history of physics state to each player’s listener if they are watching that zone. (Note: remember that an object can be in multiple zones at once if it overlaps.)
7) These player-level zone listeners pack up the state for each object, disambiguating multiple zones… so if they get multiple updates for the same object from different zones then they pick one.
These listeners also keep track of which network messages they know the client has already received. They do this so that they can avoid sending redundant information. If the position, rotation, parent, etc. hasn’t changed for some object then a very minimal data block is sent. Still, the biggest general data block is about 14 bytes per object. Smallest is about 3 bytes.
8) All data blocks for a frame are packed into a message, and may be split across multiple messages if the frame exceeds the configured MTU. If multiple frames will fit in the same message then they are packed together.
[7 and 8 are the hardest part of the whole protocol and represent a solid two weeks of my life, I think.]
9) In the prototype, I package these data blocks up into an actual ready-to-go network message but instead of sending it over the network, I pass it off to code that is pretending to be a network connection.
10) A network message handler gets the message and unpacks it. This handler represents the real “client”.
11) The client sends an ACK message back to the “Server” saying it received the message.
12) The current object delta-baseline is updated to reflect the messages that the server knows the client has received (ACK-ACK). Every message from the server to the client includes a list of the ACKs that it’s gotten from the client. In other words, it’s sort of acknowledging the acknowledgements. It’s mind-warping but necessary. The client needs to know which baseline they agree on.
13) The deltas from the message are applied to the current objects on the client as part of a sliding window history buffer.
14) The code that handles the visuals has been looping at the render frame rate this whole time, interpolating object position between the last known value and the next known value… and now it has new values to interpolate between.
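That last interpolation step is simple in isolation. A minimal sketch of the idea (all names invented, not the prototype’s actual code):

```java
// Sketch of visual interpolation: the render loop blends between the last
// two known states instead of snapping to the newest one.
class Interp {
    // Linear interpolation between the last known and the next known value.
    static float lerp(float last, float next, float t) {
        return last + (next - last) * t;
    }

    // Given a render time between two state frames, compute the blend
    // factor, clamped to [0, 1] so late frames just hold the newest value.
    static float blendFactor(long renderTime, long lastFrameTime, long nextFrameTime) {
        if (nextFrameTime <= lastFrameTime) return 1f;
        float t = (renderTime - lastFrameTime) / (float) (nextFrameTime - lastFrameTime);
        return Math.max(0f, Math.min(1f, t));
    }
}
```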
Someday maybe I will make a diagram.
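In lieu of a diagram, here is a rough sketch of the ACK-ACK baseline bookkeeping described above, in a much-simplified form (all names invented; the real protocol tracks per-object deltas, not just message ids):

```java
import java.util.*;

// Toy model of the double-ACK idea: the server records which messages the
// client has ACKed, and piggybacks "ACK-ACKs" on its next outgoing message
// so the client also knows which baseline both sides agree on.
class AckBaseline {
    private final Set<Integer> ackedByClient = new HashSet<>(); // ACKs received from client
    private final Set<Integer> ackAckSent = new HashSet<>();    // ACK-ACKs already sent back
    private int baseline = -1; // highest message id both sides agree on

    // Called when a client's ACK arrives at the server.
    void onClientAck(int messageId) {
        ackedByClient.add(messageId);
    }

    // When the server builds its next message it piggybacks the new ACK-ACKs;
    // once sent, those messages become part of the agreed baseline.
    List<Integer> buildAckAcksForNextMessage() {
        List<Integer> out = new ArrayList<>();
        for (int id : ackedByClient) {
            if (ackAckSent.add(id)) {
                out.add(id);
                baseline = Math.max(baseline, id);
            }
        }
        Collections.sort(out);
        return out;
    }

    int getBaseline() { return baseline; }
}
```

The point of the second round trip is just that deltas are only safe against state both sides *know* they share.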
The prototype started out by just strapping the object position directly to the physics engine. The next tiny milestone was decoupling the frame timings as described in step 14. Then I implemented zone management and everything looped through that, essentially adding a zone listener that talked directly to the buffers mentioned in step 14. (Basically, step 5 talked directly to step 14.)
…and the most recent huge milestone, actually looping through the network layer, is a pretty big deal as it added all of the other steps.
Sending the messages over an actual network connection is only a formality at this point. The only thing it will add to the test is unreliability… and I will be testing that in the prototype first.
Running a single process to test is much easier than launching a server, waiting for it to load, and then launching a client… every time I want to test. Also, I can more readily test specific scenarios as I find issues.
As long as I stay under MTU, the protocol as designed should be able to cope with 50% packet loss… but we’ll see. 50% packet loss when you exceed MTU can mean that you miss almost all of your messages if you are unlucky. Under MTU means you only miss some of your state… but since state is mostly redundant then it recovers. Also, exceeding MTU will increase latency since the final UDP message will be as slow as its slowest part.
I can send about 100 object state updates per “frame” and still fit under the standard MTU of 1500. Normally, I’d pack three frames in a message if I can fit them but at 100 objects I’d have to do a message per frame. The protocol also supports splitting frames. The worst case for me is when there are more than 100 objects moving in a user’s zone view and a consistent 50% packet loss that causes the second part of a split to consistently never arrive. It seems unlikely that a UDP channel would predictably lose every other message, though. Also unlikely to have 100 zone-local objects fully moving every frame.
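A quick back-of-envelope check of those numbers (the 14-byte worst case and the 1500-byte MTU are from above; the 28 bytes of IP+UDP header overhead is my own assumption):

```java
// Packing budget sanity check: how many worst-case object blocks, and how
// many whole frames, fit in one datagram under the MTU.
class PackingBudget {
    static final int MTU = 1500;
    static final int PER_OBJECT_MAX = 14; // biggest general data block per object
    static final int HEADER = 28;         // assumed IP + UDP header overhead

    // Max objects whose worst-case blocks fit in a single message.
    static int maxObjectsPerMessage() {
        return (MTU - HEADER) / PER_OBJECT_MAX;
    }

    // How many whole frames of `objects` worst-case blocks fit per message.
    static int framesPerMessage(int objects) {
        int frameBytes = objects * PER_OBJECT_MAX;
        return (MTU - HEADER) / Math.max(1, frameBytes);
    }
}
```

So roughly 100 objects per message at worst case, and three frames only fit when well under that.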
5-10% packet loss overall often means there can be several seconds of 50% or more packet loss. Especially over wireless routers.
Anyway, the point would be to figure out where networking performance is so bad that playability degrades. I think latency will be more critical than dropped packets… which is good because in previous testing even low-latency connections can have periods of relatively high packet loss.
The interesting thing is that because of the double-ack parts I added to the protocol, not only can I accurately detect how many dropped packets there are but I could also detect when they are late, when it is consistently every second or third packet, etc… if I wanted to go that far. Hopefully I can provide feedback to users when their connection is too bad to play properly, based on this testing.
As an aside: if anyone is interested in my BitInputStream and BitOutputStream then I can post them to contrib or something. They are well tested at this point. It allows writing any size bit string (up to 64) to a stream and packs it all together as it goes. So you can write 1 bit, 1 bit, then 6 bits and get a byte, etc…
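Here is roughly the shape of the output side (a simplified sketch, not the tested code itself; it writes MSB-first and buffers in memory):

```java
import java.io.ByteArrayOutputStream;

// Sketch of a bit-packing output stream: write N bits at a time (1..64),
// MSB-first, emitting a byte to the underlying buffer whenever 8 bits fill up.
class BitOutput {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    private int current = 0; // partially filled byte
    private int filled = 0;  // number of bits used in `current`

    // Write the low `count` bits of `value`, most significant bit first.
    void writeBits(long value, int count) {
        for (int i = count - 1; i >= 0; i--) {
            int bit = (int) ((value >>> i) & 1);
            current = (current << 1) | bit;
            if (++filled == 8) {
                out.write(current);
                current = 0;
                filled = 0;
            }
        }
    }

    byte[] toByteArray() { return out.toByteArray(); }
}
```

So writing 1 bit, 1 bit, then 6 bits produces exactly one packed byte, as described.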
Yes, please share what you are able to share! This whole networking stuff sounds so interesting. I never thought there was so much complexity involved in “simply” sending (small, maybe delta) state updates to some clients…
@cmur2 that’s one of the reasons so many networking things go so badly wrong. People think it’s “simple” but it really really really isn’t.
Think about threading and how much complexity that can cause… now realise that networking is essentially threading over multiple machines with (effectively random) delays on every communication, no real synchronization functionality built in, and the choice of TCP (slower but at least you get everything eventually) or UDP (faster but some messages randomly just don’t get through)… And in the case of multi-player games you are often trying to do this across not just 1 or 2 but 5, 10, 30, 60 or in the case of an MMO thousands of players.
Just added some support for adjusting the up and down channel packet loss rates. I can move a slider from 0 to 100% loss and packets are randomly dropped at that rate. This is good because sometimes Math.random() is bursty and you’ll lose a whole bunch of packets in a row… just like on a real high-drop connection.
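The drop logic itself is trivial; something like this sketch (names invented), with each packet dropped independently at the slider’s rate:

```java
import java.util.Random;

// Toy version of the loss slider: each packet is independently dropped with
// probability `lossRate`, adjustable while the app runs.
class LossyChannel {
    private final Random rng;
    private double lossRate; // 0.0 .. 1.0

    LossyChannel(double lossRate, long seed) {
        this.lossRate = lossRate;
        this.rng = new Random(seed);
    }

    void setLossRate(double lossRate) { this.lossRate = lossRate; }

    // Returns true if the packet gets through, false if it is "dropped".
    boolean deliver() {
        return rng.nextDouble() >= lossRate;
    }
}
```

Independent per-packet drops still produce runs of consecutive losses, which is exactly the burstiness mentioned above.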
At 50% drop rate with a minimal amount of objects moving, the simulation is still playable. There is some jerkiness when there is a stream of drops but for the most part the protocol handles it. I will note that there is no latency built in yet… that will make things much worse because the latency will eat up most of the view-delay window that currently allows non-visible recovery for a few lost packets.
At 15% drop rate, it’s hardly noticeable at all.
At 75% drop rate, you can still play but it’s very frustrating as there is a lot of teleporting. I certainly wouldn’t want to do any combat that way. It will be worse with latency.
After I add simulated latency adjustment (and requisite view delay sensing) then I may post a video. The test app is a nice top down view of a bunch of rollable spheres and so really shows off issues when they occur.
@cmur2, normal TCP networking can be pretty hard on its own. As zarch mentions, it’s similar to multi-threading except now there is a random delay in your interprocess communication. UDP can be a whole order of magnitude more complicated depending on what you want to achieve. If all you have is an FPS with a few players then it’s possible you could fit all of the world state in a single message and still fall under typical MTU.
In my case, I have to simulate the physics of a basically endless world with any number of physical objects (that the player created) doing things. Zone management, delta-tracking, message compression, etc. are all necessary to allow as many players to play as possible. Each of these adds another order of magnitude of complexity… and I’m not even trying to make a twitch game like an FPS… that would be a whole other level of complexity.
Really interesting post! I’ve never had to implement a realtime networking protocol yet.
It sounds like you are using your own physics engine, right? Or do you somehow use (J)bullet? I’m still in the considering phase of whether my game really needs multiplayer or not. But I’m currently using the bullet physics engine because I didn’t see the need for building it myself - until now at least …