Server is not closing gracefully

Hi,

I’m currently facing an issue when closing a server and reopening it several times.
I have a client that can choose to host or join a game, then it can leave the game it joined. No problem at this point, but when I stress it a bit (hosting, leaving, hosting, leaving, etc), I finally get the error ‘Adress already in use : bind’.

I start my server this way :
[java]
server = Network.createServer(Globals.DEFAULT_PORT_TCP, Globals.DEFAULT_PORT_UDP);
server.addConnectionListener(this);
server.addMessageListener(this);

        server.start();

[/java]

And I shut down my server this way :
[java]
for(HostedConnection conn : server.getConnections()){
conn.close(“Server closing”);
}
server.removeConnectionListener(this);
server.removeMessageListener(this);
server.close();
[/java]

What is strange is that the ‘Already in use’ appears after a <span style=“text-decoration:underline;”>random </span>(usually 3-6, that is to say ‘often’) number of server start / shut down.
When it closes/starts with success, the behavior is exactly what I expected, I can’t notice any side effect.
As far as I tested it, there is no difference betwin a succesful closure and a failure : I close the server each time from the same entry point, from the same thread, and without having apparently changed anything else.

I also noticed that tcp and udp selector threads often stay alive after having closed the server. I tried manually forcing them to interrupt, without any significant result.

Could you please tell me what I am doing the wrong way, or what I forgot to do ?

I started to write a longer response…

but can you tell me specifically which threads are still running?

The kernel threads should be shutting down immediately and the terminate() even waits for the thread to finish before returning. So unless some error happens then they should be shutting down.

Alternately, do you have logging turned off? Just wondering if maybe you are missing some server error messages.

Can you provide the specific error message? ie: is it the TCP port that is not binding or…?

You are right, I turned logging off as nifty is very (very (much more)) verbose, and forgot to turn it on.

The trace I saw when trying to join the server is the following :

com.jme3.network.kernel.KernelException: Error hosting:0.0.0.0/0.0.0.0:6143 at com.jme3.network.kernel.tcp.SelectorKernel.initialize(SelectorKernel.java:98) at com.jme3.network.base.KernelAdapter.initialize(KernelAdapter.java:101) at com.jme3.network.base.DefaultServer.start(DefaultServer.java:158)

And

Caused by: java.net.BindException: Address already in use: bind at sun.nio.ch.Net.bind0(Native Method) at sun.nio.ch.Net.bind(Net.java:344) at sun.nio.ch.Net.bind(Net.java:336) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:199) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67) at com.jme3.network.kernel.tcp.SelectorKernel$SelectorThread.connect(SelectorKernel.java:251) at com.jme3.network.kernel.tcp.SelectorKernel.initialize(SelectorKernel.java:95)

What I didn’t see because of logging turned off is the following (happening when I close the server) :

java.lang.NullPointerException at com.jme3.network.kernel.tcp.SelectorKernel.closeEndpoint(SelectorKernel.java:186) at com.jme3.network.kernel.tcp.NioEndpoint.close(NioEndpoint.java:95) at com.jme3.network.kernel.tcp.NioEndpoint.close(NioEndpoint.java:76) at com.jme3.network.base.DefaultServer$Connection.closeConnection(DefaultServer.java:515) at com.jme3.network.base.DefaultServer.connectionClosed(DefaultServer.java:436) at com.jme3.network.base.KernelAdapter.connectionClosed(KernelAdapter.java:141) at com.jme3.network.base.KernelAdapter.createAndDispatch(KernelAdapter.java:246) at com.jme3.network.base.KernelAdapter.flushEvents(KernelAdapter.java:255) at com.jme3.network.base.KernelAdapter.run(KernelAdapter.java:268) Exception in thread "com.jme3.network.kernel.tcp.SelectorKernel@1dbdec3" nov. 22, 2013 8:43:19 PM de.lessvoid.nifty.effects.EffectProcessor startEffect
And
java.lang.NullPointerException at com.jme3.network.kernel.tcp.SelectorKernel.closeEndpoint(SelectorKernel.java:186) at com.jme3.network.kernel.tcp.NioEndpoint.close(NioEndpoint.java:95) at com.jme3.network.kernel.tcp.NioEndpoint.close(NioEndpoint.java:76) at com.jme3.network.base.KernelAdapter.reportError(KernelAdapter.java:125) at com.jme3.network.base.KernelAdapter.flushEvents(KernelAdapter.java:257) at com.jme3.network.base.KernelAdapter.run(KernelAdapter.java:268)

I guess that the line

Exception in thread "com.jme3.network.kernel.tcp.SelectorKernel@1dbdec3" nov. 22, 2013 8:43:19 PM de.lessvoid.nifty.effects.EffectProcessor startEffect
is the result of a concurrent logging from differents threads, otherwise I would find suspicious a nifty stacktrace from a tcp thread ?

It seems that I can’t reproduce the problem concerning threads not stopping…
I got udp/tcp threads twice but it is due to the client and server ones running at the same time.
What happened until now was my logger on active threads count/list increasing by 2 each time I stopped/started the server, with more and more kernel.udp and kernel.tcp.
I will post more logs if I can reproduce it…
But I still have the failure on multiple start/stop.

@Kevril said: is the result of a concurrent logging from differents threads, otherwise I would find suspicious a nifty stacktrace from a tcp thread ?

Well, if you started the server from a nifty screen or something then it’s understandable if it shows up in the stack trace. But nifty does really funny things with logging, I guess because that stack trace looks messed up for other reasons. Either that or you shortened it on cut/paste.

Actually, the nifty part looks like something else entirely. I can’t see the whole log so I can’t tell.

…or the logging could be the different between stdout and stderr… was one set in red and the other in black or something?

Incidentally, what version of JME are you running?

This line:
com.jme3.network.kernel.tcp.SelectorKernel.closeEndpoint(SelectorKernel.java:186)

…doesn’t correspond to an actual line of code.

But the closest thing I can see to a cause was that the connection was being closed without ever having been initialized. Perhaps as the result of an earlier error.

I use the SDK 3.0RC2.

Everything here is logged to stderr, and I put the whole relevant part of the output… I don’t find any earlier error.

But my bad, I mixed a bit two differents cases and that’s why I wasn’t able to reproduce the multiplication of kernel threads. ^^

First case :
I don’t try to manually force kernels to close after calling server.stop().
Thread list increase each time I start/stop.
An exemple of the thread list right befor it crashes :

[Thread[Finalizer,8,system], Thread[DestroyJavaVM,5,main], Thread[Thread-2,5,main], Thread[com.jme3.network.kernel.tcp.SelectorKernel@447828,5,main], Thread[Reference Handler,10,system], Thread[jME3 Audio Thread,6,main], Thread[com.jme3.network.kernel.tcp.SelectorKernel@1ea6b4a,5,main], Thread[Java2D Disposer,10,system], Thread[com.jme3.network.kernel.tcp.SelectorKernel@1a8402a,5,main], Thread[com.jme3.network.kernel.udp.UdpKernel@10ca60c,5,main], Thread[Signal Dispatcher,9,system], Thread[Attach Listener,5,system], Thread[LWJGL Renderer Thread,5,main], Thread[Dispatcher-1,5,main], Thread[Dispatcher-0,5,main], Thread[com.jme3.network.kernel.udp.UdpKernel@12cbe8c,5,main], Thread[com.jme3.network.kernel.udp.UdpKernel@129d69e,5,main], Thread[Timer-2,5,main]]
There are 3 tcp and 3 udp kernel threads, remaining after 3 sucessful start/stop. The fourth crashed. Clients threads seem to close as expected, but server ones do not. In this case, I only have the following error :
SEVERE: null com.jme3.network.kernel.KernelException: Error hosting:0.0.0.0/0.0.0.0:6143 at com.jme3.network.kernel.tcp.SelectorKernel.initialize(SelectorKernel.java:98) at com.jme3.network.base.KernelAdapter.initialize(KernelAdapter.java:101) at com.jme3.network.base.DefaultServer.start(DefaultServer.java:158)

(stack trace corresponding from here to the call hierarchy before invoking code lines I show on firs post)

Caused by: java.net.BindException: Address already in use: bind
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:344)
at sun.nio.ch.Net.bind(Net.java:336)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:199)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:67)
at com.jme3.network.kernel.tcp.SelectorKernel$SelectorThread.connect(SelectorKernel.java:251)
at com.jme3.network.kernel.tcp.SelectorKernel.initialize(SelectorKernel.java:95)
… 47 more


Nothing more.

Second case :
If I force kernels threads to close, thread count remains stable, and I also get the errors I posted before about closeEndPoint().
I took care of forcing the threads interruption only after calling the close() methods. But there may be internal callbacks ?.. Don’t know.

The ‘random’ behavior of my probleme make me think it’s a threading issue, but… :confused:

How are you “forcing threads to close”? The only safe way to kill a thread in Java is System.exit().

It might help if you trace through the code in the debugger and find out why the server is not getting closed.

I mean… follow me on a trip here…

DefaultServer.close()
http://code.google.com/p/jmonkeyengine/source/browse/trunk/engine/src/networking/com/jme3/network/base/DefaultServer.java#175

Calls KernelAdapter.close() on every channel:
http://code.google.com/p/jmonkeyengine/source/browse/trunk/engine/src/networking/com/jme3/network/base/KernelAdapter.java#109

This in turn calls terminate() on the message kernel for that channel:
http://code.google.com/p/jmonkeyengine/source/browse/trunk/engine/src/networking/com/jme3/network/kernel/tcp/SelectorKernel.java#101

Which calls close() on that thread:
http://code.google.com/p/jmonkeyengine/source/browse/trunk/engine/src/networking/com/jme3/network/kernel/tcp/SelectorKernel.java#259

…which is supposed to wait for the thread. If there is no error then there is no way that I can see it falling through without the thread closing.

…and no relevant errors have been posted so far.

Because this error:
[blockquote]
java.lang.NullPointerException
at com.jme3.network.kernel.tcp.SelectorKernel.closeEndpoint(SelectorKernel.java:186)
at com.jme3.network.kernel.tcp.NioEndpoint.close(NioEndpoint.java:95)
at com.jme3.network.kernel.tcp.NioEndpoint.close(NioEndpoint.java:76)
at com.jme3.network.base.KernelAdapter.reportError(KernelAdapter.java:125)
at com.jme3.network.base.KernelAdapter.flushEvents(KernelAdapter.java:257)
at com.jme3.network.base.KernelAdapter.run(KernelAdapter.java:268)
[/blockquote]

Is not a real line number:
http://code.google.com/p/jmonkeyengine/source/browse/trunk/engine/src/networking/com/jme3/network/kernel/tcp/SelectorKernel.java#186 In fact, I have to go back to Dec 2011 to see that line: http://code.google.com/p/jmonkeyengine/source/browse/trunk/engine/src/networking/com/jme3/network/kernel/tcp/SelectorKernel.java?r=8944#186

…and the closest line (and only line in that method) would indicate that whatever connection threw that error was never started… or had already been closed once.

Just strange.

@Kevril said: I use the SDK 3.0RC2.

Note: I don’t think networking changed much since then but rc2 is pretty old by now and 3.0 final was released a few months ago and we’re already up to 3.0.3 or 3.0.4.

I trust you but I downloaded the latest SDK from the official JME website around 3 months ago (max). Strange…

I am not sure to have much time for the debug you ask until this evening, but I will do it as soon as possible.
Thank you anyway for your help. :slight_smile:

By forcing threads to interrupt I mean, for testing purpose (I know that it isn’t really safe) :

for(Thread t : Thread.getAllStackTraces().keySet()){ if(t.getName().contains("com.jme3.network.kernel.tcp.SelectorKernel")){ t.interrupt(); } else if(t.getName().contains("com.jme3.network.kernel.udp.UdpKernel")){ t.interrupt(); } }
I thought it would maybe free ports that were still in use, but it does not.
@Kevril said: I trust you but I downloaded the latest SDK from the official JME website around 3 months ago (max). Strange...

That might have been before the release. Did you ever update it to the latest stable? Otherwise that version might be like a year old without auto-updates.

I’m still stumped regarding your other issues. Without exceptions being thrown, I can’t see a way that the sockets would still be bound or the threads would still be running.

For:
com.jme3.network.kernel.KernelException: Error hosting:0.0.0.0/0.0.0.0:6143

Is that the main port you launch the server on or does your server setup multiple channels?

I updated to 3.0 stable today.

The problem is a bit harder to reproduce, but still present.
I tried debugging from server.close(), and no error is thrown. But what happen is that when the server loops on channels to close them, it effectively loops on the SelectorKernel and UdpKernel. But threads that are closed are Selector and UDP host, not Kernels. It also wait(0) on UDP Host and Selector to interrupt, not Kernels… So they stay alive if I don’t interrupt them.

I don’t setup multiple channels, only hosting on 6143.

There is something else wrong in my code anyway, my serverMain (my ‘server component’ entry point) should be null at some point and it isn’t before it crashes… So it dodges its stop() method and that is why next start crashes.
I am looking in this direction for now.
I probably look so draft when I post something here x’)

Just wondering, is there a way to enforce ports in use to be free ? What ‘owns’ the ports ? (curiosity speaking, I am not dirty enough to try it on my app ^^ )

@Kevril said: I tried debugging from server.close(), and no error is thrown. But what happen is that when the server loops on channels to close them, it effectively loops on the SelectorKernel and UdpKernel. But threads that are closed are Selector and UDP host, not Kernels. It also wait(0) on UDP Host and Selector to interrupt, not Kernels... So they stay alive if I don't interrupt them.

I don’t understand this exactly. Do you mean that the kernel adapter threads are not closed? The one in KernelAdapter.java?

If that’s true then I guess an EVENTS_PENDING message is never queued. I can fix that… but as you discovered it is not the cause of your bind exception. Those threads are harmless other than the resources they take up.

That’s it, NIO level is shut down as expected, but neither thread SelectorKernel nor UdpKernel are.

Concerning my issue, as it seems to be closely linked to my implementation, I will do some more digging.
I actually can’t isolate where exactly does the issue come from.

I will post the explanation when I find it… someday. x’)
Thank you pspeed. :slight_smile:

I had this same issue which I posted. Unfortunately, I could never figure it out but the code you supplied @Cratere for ending threads is a lot easier to manage than starting up another process (and it works for me at least). Did you ever find anything else on this subject?

I have the same issue, using jME 3.0 stable. I try to close the server via:

server.close();

… and the following 2 threads stay alive:

  • com.jme3.network.kernel.tcp.SelectorKernel@339ec220,5,main
  • com.jme3.network.kernel.udp.UdpKernel@7bdadd59,5,main

Enabling the log doesn’t help as no error or warning is printed. Any ideas on this? I really have no idea what else to call besides the close method. I also searched for an error listener or smth. on the server as there is one on the client.

Do you close all of the connections first or not?

No, I close the server before the connections are closed.