Server.close() doesn't

I’m making a call to com.jme3.network.Server.close() like this:



[java]
@Override
public void destroy() {
    logger.log(Level.FINE, "--- destroy() ---");
    super.destroy();
    server.close();
}
[/java]



But when server.close() is called the server thread doesn’t stop. I ran it in the debugger and traced it to SelectorKernel.java line 269. It’s where SelectorThread.close() makes a call to join(), waiting for the thread to finish. But it doesn’t seem to be finishing.



Anyone else run into this? I feel like I’m missing something obvious…

Are there any active connections at the time?

@pspeed said:
Are there any active connections at the time?


No. I had created a client connection, but I closed it before closing the server. To confirm, I added the following:

[java]
@Override
public void destroy() {
    logger.log(Level.FINE, "--- destroy() ---");
    super.destroy();
    logger.log(Level.FINE, "Connections: {0}", server.getConnections());
    server.close();
}
[/java]

The output of the logging statement was:
"Connections: []"

I’m not sure what’s happening then… normally everything shuts down cleanly.



What OS are you running on? If you are running in a debugger then it would be interesting to know what happens in that thread’s run() loop… as closing the connection should have caused that loop to jump out.

I’m running on Mac OS X 10.6.8 (x86_64). Java 1.6.0_29.



I’ll put a breakpoint in the thread’s run() loop and get back to you.

Okay, in testing this (and running it again and again), the behavior is inconsistent. Sometimes it closes cleanly, but often it doesn’t.



I put breakpoints on the following lines in the while loop:



453 - select()

455 - if statement in the ClosedSelectorException catch block

459 - if statement in the IOException catch block

464 - end of the run() method



When the app fails to close, select() gets called and execution never hits the breakpoints beyond that. Apparently it’s blocking, never returning, and never throwing a ClosedSelectorException or an IOException.



Selector is an abstract class; what you actually get is system-dependent, so it could be that we’re running into a bug (or at least a difference in implementations) in the Selector. Dunno. It’s late and I’ve been looking at code for way too long, so my thinking is fuzzy. I’ll come back to this, later.
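For anyone who wants to check which Selector implementation their own JVM hands back, here is a minimal sketch. The class name SelectorInfo and the method selectorClassName() are made up for this example; only Selector.open() and SelectorProvider.provider() are standard java.nio calls.

```java
import java.io.IOException;
import java.nio.channels.Selector;
import java.nio.channels.spi.SelectorProvider;

// Hypothetical helper, not part of jME3: reports the concrete Selector class.
public class SelectorInfo {

    /** Opens a selector, returns the name of its concrete class, and closes it. */
    static String selectorClassName() {
        try (Selector selector = Selector.open()) {
            return selector.getClass().getName();
        } catch (IOException e) {
            throw new IllegalStateException("Could not open a selector", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("os.name  = " + System.getProperty("os.name"));
        System.out.println("provider = " + SelectorProvider.provider().getClass().getName());
        System.out.println("selector = " + selectorClassName());
    }
}
```

The printed names depend on the platform and JVM, so no expected output is given here.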

I’ve run the app on Windows and am unable to reproduce the behavior there; it only happens on Mac. Mac OS X supports two different selector implementations (you switch them with a system property), but the problem exists with both of them. (See here for the Mac OS X source code.)



On Windows, the polling that takes place inside Selector.select() throws an IOException. On Mac (and also on Sun/Solaris, from the look of things, although I can’t test that) polling does not throw an IOException.



One could consider this a bug in the platform-specific Java implementation, but even that is a bit fuzzy. As a work-around that removes the need to rely on all the platform implementations throwing exceptions the same way, perhaps we could have a system property that allows you to set a timeout for the polling. Instead of selector.select(), call selector.select(getTimeout()). Then have getTimeout() return the default (i.e., 0), or a timeout value that has been overridden with a system property. Do you see any potential problems with something like that?
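A rough sketch of the timeout idea, to make it concrete. The property name example.selector.timeout and the class SelectTimeoutSketch are invented for illustration and are not existing jME3 settings. A timeout of 0 keeps the current behavior, because Selector.select(0) blocks indefinitely just like select().

```java
import java.io.IOException;
import java.nio.channels.Selector;
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch only: the property name below is hypothetical.
public class SelectTimeoutSketch {

    static final String TIMEOUT_PROPERTY = "example.selector.timeout";

    /** Timeout in milliseconds; 0 (the default) means "block indefinitely". */
    static long getTimeout() {
        // Negative values are clamped because select(long) rejects them.
        return Math.max(0L, Long.getLong(TIMEOUT_PROPERTY, 0L));
    }

    public static void main(String[] args) throws Exception {
        final AtomicBoolean go = new AtomicBoolean(true);
        final Selector selector = Selector.open();

        Thread selectorThread = new Thread(() -> {
            try {
                while (go.get()) {
                    // With a non-zero timeout this returns periodically,
                    // so the loop re-checks the flag even if nothing wakes it.
                    selector.select(getTimeout());
                }
            } catch (IOException e) {
                e.printStackTrace();
            }
        });
        selectorThread.start();

        Thread.sleep(200);
        go.set(false);
        selectorThread.join(2000);
        System.out.println("stopped without wakeup: " + !selectorThread.isAlive());

        // Clean up either way: wake the thread in case the timeout was 0.
        selector.wakeup();
        selectorThread.join();
        selector.close();
    }
}
```

Running it with -Dexample.selector.timeout=100 lets the loop notice the cleared flag on its own; with the default of 0 the thread stays blocked until the explicit wakeup.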



Just to document it, here are the classes I get on different platforms:



On Windows, you get:



On Mac (with java.nio.preferSelect set to false, which is the default)

On Mac (with java.nio.preferSelect set to true)


On Linux:

We could also try interrupting the thread… which is a bit like taking a sledge hammer to the problem. A select for a closed connection should fail.



…but if interrupting doesn’t work then those platforms are definitely busted. You shouldn’t have to poll and add ugly delays, etc…
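To try the interrupt approach in isolation, something like the following sketch could be used. It is a standalone test, not jME3 code, and the class and method names are made up. It relies on the documented Selector behavior that interrupting a thread blocked in select() makes the call return; whether a given platform honors that is exactly what the test reports.

```java
import java.io.IOException;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.Selector;

// Hypothetical standalone check, not part of SelectorKernel.
public class InterruptSelectSketch {

    /** Returns true if a thread blocked in select() ends after interrupt(). */
    static boolean interruptUnblocksSelect() {
        try (Selector selector = Selector.open()) {
            Thread blocked = new Thread(() -> {
                try {
                    selector.select(); // nothing is registered, so this blocks
                } catch (IOException | ClosedSelectorException e) {
                    // Closing the selector is the fallback exit path.
                }
            });
            blocked.start();

            Thread.sleep(200);  // give the thread time to reach select()
            blocked.interrupt();
            blocked.join(2000); // wait up to two seconds for it to finish
            return !blocked.isAlive();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("interrupt unblocked select(): " + interruptUnblocksSelect());
    }
}
```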

Well, select() will fail if the connection is closed when select() begins. Unfortunately, select() can check the channel, find it good, and continue on until it hits the internal (native) call that blocks. Then you’re dependent on how that blocking call is natively implemented.



Actually, it looks like the current Server.close() implementation is only working correctly on Windows. I just tested it on Linux 2.6 and I get the same behavior as on Mac OS X (and presumably Sun/Solaris, which I don’t have so can’t actually test).



Here’s the test app:



[java]
package mygame;

import com.jme3.app.SimpleApplication;
import com.jme3.network.Network;
import com.jme3.network.Server;
import com.jme3.system.JmeContext.Type;
import java.util.logging.Level;
import java.util.logging.Logger;

public class Main extends SimpleApplication {

    private static Logger logger = Logger.getLogger(Main.class.getName());

    private Server server;

    public static void main(String[] args) {
        Main app = new Main();
        app.start(Type.Headless);
    }

    @Override
    public void simpleInitApp() {
        try {
            server = Network.createServer(6677);
            server.start();
            logger.log(Level.INFO, "Server started. App Stopping in 5 seconds...");
            Thread.sleep(5000);
            stop();
        } catch (Exception ex) {
            logger.log(Level.SEVERE, null, ex);
        }
    }

    @Override
    public void destroy() {
        logger.log(Level.INFO, "In the destroy() method...");
        super.destroy();
        logger.log(Level.INFO, "Calling server.close()...");
        server.close();
        logger.log(Level.INFO, "Leaving the destroy() method...");
    }
}
[/java]

I was just searching around for any discussions on the implementation of Selectors on different platforms, and ran across this amusing tidbit:



If you look closely at the NIO documentation you'll come across the occasional mention of naive implementations blocking where more efficient implementations might not, usually in the context of altering the state of the selector from another thread. If you plan on writing code against the NIO libraries that must run on multiple platforms you have to assume the worst. This isn't just hypothetical either, a little experimentation should be enough to convince you that Sun's Linux implementation is "naive". If you plan on targeting one platform only feel free to ignore this advice but I'd recommend against it. The thing about code is that it oftens ends up in the oddest of places.

As a result, if you plan to hang onto your sanity don't modify the selector from any thread other than the selecting thread. This includes modifying the interest ops set for a selection key, registering new channels with the selector, and cancelling existing channels.


Heh. Apparently we're not the first to have run into this particular problem. :D

Okay, I think I have an easy fix.



Change SelectorKernel$SelectorThread.close() to the following:



[java]
public void close() throws IOException, InterruptedException
{
    // Set the thread to stop
    go.set(false);

    // Make sure the channel is closed
    serverChannel.close();

    // Force the selector to stop blocking
    wakeupSelector();

    // And wait for it
    join();
}
[/java]



EDIT: I built a new jMonkeyEngine3.jar with this change and tested it. Multiple tests on Windows, Mac, and Linux are all working and closing as expected. 8)
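For anyone who wants to see the same shutdown order outside of jME3, here is a self-contained sketch. It is not the SelectorKernel code: the class WakeupCloseSketch is made up, and it assumes that wakeupSelector() ultimately amounts to calling Selector.wakeup(), which is used directly here.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.util.concurrent.atomic.AtomicBoolean;

// Standalone illustration of "flag, close channel, wakeup, join".
public class WakeupCloseSketch extends Thread {

    private final AtomicBoolean go = new AtomicBoolean(true);
    private final Selector selector;
    private final ServerSocketChannel serverChannel;

    WakeupCloseSketch() throws IOException {
        selector = Selector.open();
        serverChannel = ServerSocketChannel.open();
        serverChannel.configureBlocking(false);
        serverChannel.socket().bind(new InetSocketAddress(0)); // any free port
        serverChannel.register(selector, SelectionKey.OP_ACCEPT);
    }

    @Override
    public void run() {
        try {
            while (go.get()) {
                selector.select();               // may block in native code
                selector.selectedKeys().clear(); // sketch: ready keys are ignored
            }
        } catch (IOException | ClosedSelectorException e) {
            // Either exception is treated as a shutdown signal in this sketch.
        }
    }

    void close() throws IOException, InterruptedException {
        go.set(false);         // set the thread to stop
        serverChannel.close(); // make sure the channel is closed
        selector.wakeup();     // force select() to return on every platform
        join();                // and wait for the thread
        selector.close();
    }

    /** Starts the thread, closes it, and reports whether it really stopped. */
    static boolean startAndClose() {
        try {
            WakeupCloseSketch thread = new WakeupCloseSketch();
            thread.start();
            Thread.sleep(200); // let the thread block in select()
            thread.close();
            return !thread.isAlive();
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("thread stopped: " + startAndClose());
    }
}
```

The wakeup() call is what removes the dependence on the platform's native blocking call: even if closing the channel does not make select() throw, the selector returns, the loop sees the cleared flag, and join() completes.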


Yay… I’ll get this one in right away.

Fixed in SVN.