Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java process on Mac OSX does not release socket

I am experiencing an odd problem every now and then (too often actually).

I am running a server application, which is binding a socket for itself.

But once in a while, the socket is not released. The process dies, although Eclipse reports that Terminate failed, however it disappears properly from 'ps' and JConsole/JVisualVM. 'lsof' also displays nothing for the port anymore. But still, I get this error when I try to start the server again to the same port:

Caused by: java.net.BindException: Address already in use
    at sun.nio.ch.Net.bind(Native Method)
    at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126)
    at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59)

The problem is worst in my unit tests, which never run fully, because this will for sure occur after one of the tests (which all recreate the server).

I am running MacOSX 10.7.3

Java(TM) SE Runtime Environment (build 1.6.0_31-b04-415-11M3635) Java HotSpot(TM) 64-Bit Server VM (build 20.6-b01-415, mixed mode)

I have also Parallels, and often the problem looks like it's caused by the Parallels network adapter, but I am not sure if it has anything to do with this problem after all (I have contacted their support without any help so far).

The only thing that helps to resolve the situation is to reboot OSX.

Any ideas?

--

This is the relevant code to open the socket:

channel = (ServerSocketChannel) ServerSocketChannel.open().configureBlocking(false);
 channel.socket().bind( addr, 0 );

and it is closed by

  channel.close();

But I assume that the process gets stuck here and then Eclipse kills it.

--

netstat -an (for port 6007):

tcp4      73      0  127.0.0.1.6007         127.0.0.1.51549        ESTABLISHED
tcp4       0      0  127.0.0.1.51549        127.0.0.1.6007         ESTABLISHED
tcp4      73      0  127.0.0.1.6007         127.0.0.1.51544        CLOSE_WAIT 
tcp4       0      0  127.0.0.1.6007         127.0.0.1.51543        CLOSE_WAIT 
tcp4       0      0  10.37.129.2.6007       *.*                    LISTEN     
tcp4       0      0  10.211.55.2.6007       *.*                    LISTEN     
tcp4       0      0  127.0.0.1.6007         *.*                    LISTEN     
tcp4       0      0  10.50.100.236.6007     *.*                    LISTEN     

--

And now I get this exception after the socket is opened for every test (netstat output from this situation):

Caused by: java.net.SocketTimeoutException: Read timed out
    at java.net.SocketInputStream.socketRead0(Native Method)
    at java.net.SocketInputStream.read(SocketInputStream.java:129)
    at java.net.SocketInputStream.read(SocketInputStream.java:182)

--

Stopping the process from eclipse I got "Terminate failed", but lsof -i TCP:6007 is displaying nothing and the process is no longer found by 'ps'. netstat output did not change...

Can I somehow kill the socket without rebooting (that would help a litte bit already)?

--

UPDATE 5.5.12:

I ran the tests now in Eclipse debugger. This time the tests got stuck after 18 methods. I stopped the main thread after it was stuck around 15 minutes. This is the stack:

Thread [main] (Suspended)   
    FileDispatcher.preClose0(FileDescriptor) line: not available [native method]    
    SocketDispatcher.preClose(FileDescriptor) line: 41  
    ServerSocketChannelImpl.implCloseSelectableChannel() line: 208 [local variables unavailable]    
    ServerSocketChannelImpl(AbstractSelectableChannel).implCloseChannel() line: 201 
    ServerSocketChannelImpl(AbstractInterruptibleChannel).close() line: 97  
...

--

Hmm, it looks like the process is not killed, after all - and does not die to kill -9 either (I noticed that process 712 and probably also 710 are the TestNG processes):

$ kill -9 712
$ ps xa | grep java
  700   ??  ?E     0:00.00 (java)
  712   ??  ?E     0:00.00 (java)
  797 s005  S+     0:00.00 grep java

-- Edit: 10.5.12:

?E in the ps output above means that the process is exiting. I could not find any means to kill such a process fully without rebooting. The same issue has been noticed with some other applications. No solutions found:

http://www.google.com/search?q=ps+process+is+exiting+osx

like image 552
Jouni Aro Avatar asked Apr 25 '12 09:04

Jouni Aro


3 Answers

try closing the socket with http://docs.oracle.com/javase/1.4.2/docs/api/java/net/ServerSocket.html#close() after each test, in the teardown, if you're not already.

like image 126
grahaminn Avatar answered Nov 12 '22 10:11

grahaminn


Just a shot in the dark here, but make sure that any thread that is waiting on Selector.select() has been woken up, and has exited.

like image 29
Sam Goldberg Avatar answered Nov 12 '22 10:11

Sam Goldberg


So it seems that the problem lies in the implementation of Selector in the Mac version of JDK 6. Installing the new Oracle JDK 7u4 fixes the issue, independent of how the Selector is used.

like image 2
Jouni Aro Avatar answered Nov 12 '22 08:11

Jouni Aro