[SOLVED] SpiderMonkey crash on first message from server

Could you show us how and where you register your classes for serialization, please?

There is an unresolved bug in Spider Monkey where a message can be received as part of the same buffer as the serialization registration message. If that happens, it can’t be “unmarshalled” properly because the serialization message hasn’t been processed yet.

Anyway, it’s very weird to use server.broadcast() for these kinds of messages because you may have clients still logging in and stuff. I recommend managing the game sessions a little better and only broadcast to the clients that have actually joined the game. You will be happier in the long run. (I actually almost wish I’d never added server.broadcast() to the API some days…)

The problem is that I have one game instance that hosts but is immediately in-game. So the server needs to be running, even when other game instances are still joining.

So, to “avoid” that bug one has just to wait until the serialization registering process is finished, or not?

Try the following please:

  1. Define a CountDownLatch object in your client class.

  2. Initialize it before you connect to the server (:

    public boolean connect(String address, int port, Application app, GameCommander gameCommander) {
    startedSignal = new CountDownLatch(1);
     try {
     	this.client = Network.connectToServer(address, port);
     	this.client.addClientStateListener(this);
    
               // + some lines of my class (not relevant here)
               // register your message listener here...
    
     	this.client.start();
     	this.startedSignal.await();    		
            return true;
     } catch (IOException | InterruptedException e) {
     	e.printStackTrace();
     }
     return false;
    }
    
  3. Your clientConnected() method should look like this:

    @Override
    public void clientConnected(Client c) {
        startedSignal.countDown();
    }
    
  4. It then waits one second…

Please try this and tell us if it worked.

Yes, just wait for the message to be handled… but it can be hard to tell from the server’s perspective.

“but is immediately in-game” This is not a requirement for anything else you’ve said… it’s just the way you’ve done it. Which means you have no flexibility ever again to do anything differently. I’m saying that’s a bad design… regardless of this bug.

Someday you decide you want to have a lobby. Well, you’re screwed. Someday you decide you want the players to register their names first… well, again you’re screwed.

Don’t use server.broadcast(). Do use a specific game session that once you know your connections are ready to be “in game” you add them to the list of connections to send state to.

Regardless of this design it’s a 1000x better design… even if you don’t do anything but add them to that list right away when they send the first message.

Now I’m 100% sure I regret putting server.broadcast() into the API. I could have avoided this discussion completely.

Elaborating on this, it’s kind of a rare confluence of events that causes it anyway. There are so many things that could cause the messages to come separately and then things are fine.

It happens when the serialization message goes out and another message goes out… and all the way through the TCP connection they stay in the same packet and are received as part of the same buffer read on the client so that they are packed into the same byte array.

It’s a somewhat unfortunate part of the message processing design that the entire byte array is unpacked into messages at once… because it means if a message in the first part of the array defines new serialization registrations that its too late for the ones in the rest of the message. array in, messages out… then messages are handled.

Sending serialization messages was not a consideration when this stuff was designed and I was doing my best to keep the existing serialization classes (a big mistake, actually, in the end). Unfortunately, the design is not so easy to fix, either. A message can have the other problem and be split across several reads… untwisting the code that pushes buffers in and gets messages out is not as simple as I would hope.

There are ways around this issue with their own trade offs:

  1. just wait a bit before sending messages to clients after they first connect. Means you can’t use server.broadcast() (which as mentioned is kind of a mistake to have included in the API in the first place). It might also be possible to just wait until the client sends you a message before chattering back. Only real down side here is that it feels a bit hacky. (If the network is horribly backed up it’s still technically possible that messages might buffer together.)

  2. don’t use automatic serializer registration and go back to the “old way” of having to manually register everything. There is a pretty huge downside here as it means it’s very difficult to use the service model since the thing registering the services (the game) would need to know all of the classes to register and do it exactly the same on both client and server. I added the serializer registration message stuff specifically to handle this problem of being able to easily register modular add-on services.

  3. send a message so big that it’s guaranteed to exceed the packet size and by split. Down side here is that you have to send one big empty message and it’s ugly. It would work consistently, though.

@pspeed Could you give me a pointer on how to disable the new serializer service and do it “the old way” (it shouldn’t be too hard with a shared static method that registers the classes, right?)?
And I’ll consider doing a lobby system where the server is started and the clients connect first before entering the game. Will take a bit of time to design cleanly, but I think you’re right and it’s a good idea to do this.

Grab the serializer registration service on the server and remove or disable it.

Well, yes… if you never want to use any of the add-on services like RPC, RMI, etc… basically, the service model is nearly impossible to support with the “always register everything manually and in exactly the same order” “old way”.

1 Like

Note… you could just take advantage of the service model to do this. Have the “game session service” that adds clients to a list when it receives a message from them. Then use that list to send out state messages.

You can see examples of this in the sim-ethereal examples:

I guess it might be hard to filter out the RPC/RMI stuff.

The basic approach, extend AbstractHostedConnectionService.
Set autohost to false (presuming you even want to use the start/stop hosting methods)
Override onInitialize() to add a message listener.
The message listener calls startHostingOnConnection() when the proper message is received.
Override startHostingOnConnection() to add the connection to a list.

In your other code that sends state, it would look up this service and and grab its list (or just call a broadcast state method or whatever).

You don’t even need a client service or anything if you just want to send raw messages. Though having one to wrap all of that is usually nice… but that’s a slippery slope into hooking up RPC/RMI also (which is convenient but a little confusing if you don’t already know what’s going on).

Anyway, that sets you up to elaborate into something more complicated later.

2 Likes

Removing the service and doing it “the old way” did the trick.
Although I think with as little work as I have done up to now, I’ll definetely check out your idea. Thanks for the long reply!

Also, there is this:

Which is basically a boot strap for a network game that includes no game at all… just the basic UIs and connection setup stuff.

Edit: though I guess it also doesn’t setup any game session stuff… but it is the foundation for everything else.

I’ve been redesigning my network layer with jME’s RMI service and ran into this issue again. I’ve found a solution that seems to be working well, so I thought I’d post it here for anyone who runs into this again.

As soon as the client connects to the server, the server exposes two interface implementations to the client - a heartbeat and a login handler. It does this from the ConnectionListener.connectionAdded method, but 9 times out of 10 this resulted in the RMI object share message ending up in the same outgoing buffer as the serializer registrations, triggering the error. Client-side I have a similar setup - the client’s connectionAdded method gets the server’s remote login object and initiates the login once everything else is in place.

The solution is fairly simple - serverside, instead of sharing the objects directly from the connectionAdded() method I fire a Runnable that does the sharing to a threadpool executor. This has consistently resulted in the first buffer being flushed before the object share messages get sent out.

Clientside, this introduces a delay from when the connection listener fires to when the login object reaches the client. You can’t use a spin-wait polling method from the ConnectionListener because that blocks the network thread and the incoming share message is never processed, so instead of initiating the login directly the ConnectionListener implementation fires a Runnable to the client’s thread pool that spin-waits for the server’s login object to become available.

This technique has been working well so far. I’ve had no errors since I implemented it, and unlike some of the other wait-to-send methods it doesn’t introduce arbitrary latencies.

Edit: Looks like I spoke a little too soon - once the server’s warmed up the error crops back up despite the workaround.

Edit2: Scheduling the server’s sharing task with a 10 ms delay seems to be enough.

1 Like

@pspeed resolved this issue in his sim-eth examples. I think he did this by adding a dummy ‘service’ that just waits for a small delay when a connection is added. I’m on my phone now, i’ll link the example later.

Edit: Examples/GameServer.java at master · Simsilica/Examples · GitHub

He adds a delayservice as a first service to the services list.

I do plan to (eventually) fix this in spider monkey itself… which is why the service is a nice solution. Easy to remove later.

The only reason I haven’t is because it’s in a critically core section of SpiderMonkey and very tricky to test properly. I know what needs to happen but I have to do it very carefully.

3 Likes

@pspeed would you be interested in working on this for jme4? No time rush as jme4 will be a long time out until I can get an abstraction layer in place to start working on vulkan.

I will fix it for jme3. I haven’t even seen an architecture proposal or prototype or anything for jme4 so it’s probably farther off than I planned to fix this issue.

2 Likes

In 3.3 this should be fixed now.

In the end, the fix was pretty simple. Some industrious person could probably even make a similar fix directly to MessageProtocol in JME 3.2 if they were so inclined. (Basically, just make the internal queue based on ByteBuffer instead of Message.)

4 Likes

It might be a noob question. I tried a simple test by modifying network example.
I run client and server like this

        System.out.println("Starting chat server...");    
        ChatServer chatServer = new ChatServer();
 
        System.out.println("Waiting for connections on port:" + ChatServer.PORT);
 
        // Now launch a client

        ChatClient test = new ChatClient("localhost");

I got something like

Starting chat server...
Waiting for connections on port:5110
Dec 06, 2021 1:01:36 PM com.jme3.network.service.serializer.ClientSerializerRegistrationsService onInitialize
INFO: Skipping registration of SerializerRegistrationsMessage.
Sending:mygame.HelloMessage@52e677af
Started client:com.jme3.network.base.DefaultClient@341b80b2
Broadcasting:mygame.HelloMessage@b037416  reliable:true
connectionAdded(Connection[ id=0, reliable=NioEndpoint[1, java.nio.channels.SocketChannel[connected local=/127.0.0.1:5110 remote=/127.0.0.1:57586]], fast=UdpEndpoint[1, /127.0.0.1:63263] ])
Dec 06, 2021 1:01:36 PM com.jme3.network.message.SerializerRegistrationsMessage registerAll
INFO: Skipping registration as registry is locked, presumably by a local server process.
Client #0 received: 'Hello World!'
clientConnected(com.jme3.network.base.DefaultClient@341b80b2)

then I run another client like

        // Now launch a client

        ChatClient test = new ChatClient("localhost");

The message not sent.

Sending:mygame.HelloMessage@7a1ebcd8
Exception in thread "main" java.lang.IllegalArgumentException: Class has not been registered:class mygame.HelloMessage

I did the client something like

        client = Network.connectToServer(NAME, VERSION,
                host, PORT, UDP_PORT);

		client.addMessageListener(new ChatHandler(), HelloMessage.class);

	
        client.addClientStateListener(new ChatClientStateListener());
        client.addErrorListener(new ChatErrorListener());
	
		
        client.start();
 
 		//hello test
		HelloMessage hello = new HelloMessage("Hello World!");
        hello.setReliable(true);
        System.out.println("Sending:" + hello);
        client.send(hello);	
		
        System.out.println("Started client:" + client); 

What did I do wrong so HelloMessage not registered and sent for the first process and not for the second client?

That you did wrong. You need to register all classes you are sending. Some of the basic Java classes are registered for you. Like so:
Serializer.registerClass(HelloMessage.class, new FieldSerializer());