
Why am I seeing lots of sockets in CLOSE_WAIT status when my web service stops working?

My Java web service running on Jetty falls over after a few hours, and investigation indicates many sockets in CLOSE_WAIT status. While it is working OK there seem to be no sockets in CLOSE_WAIT status, but when it goes wrong there are loads.

I found this definition

CLOSE-WAIT: The local end-point has received a connection termination request and acknowledged it e.g. a passive close has been performed and the local end-point needs to perform an active close to leave this state.

With netstat on my server I see a list of TCP sockets in CLOSE_WAIT status; the local address is my server and the foreign address is my load balancer machine. So I assume this means the client (the load balancer) has terminated the connection at its end in some improper way, and my server has not properly closed the connection at its end.

But how do I do that? My Java code doesn't deal with low-level sockets.

Or is the load balancer terminating the connection because of an earlier problem caused by something my server code is doing wrong?

Paul Taylor asked Mar 05 '15 10:03



2 Answers

This sounds like a bug in Jetty or the JVM. Maybe this workaround will work for you: http://www.tux.hk/index.php?entry=entry090521-111844

Add the following lines to /etc/sysctl.conf:

net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_intvl = 2
net.ipv4.tcp_keepalive_probes = 2
net.ipv4.tcp_keepalive_time = 1800

Then execute

sysctl -p

or reboot.

Eirenliel answered Sep 27 '22 20:09


I suspect something in your server code is causing a long or infinite loop, or an infinite wait, so that Jetty never gets a chance to close the connection (unless some timeout forcibly closes the socket after a certain period). Consider the following example:

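import java.io.IOException;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;

public class TestSocketClosedWaitState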
{
    private static class SocketResponder implements Runnable
    {
        private final Socket socket;

        //Using a static variable to control the infinite/waiting loop for testing purposes; with while(true), Eclipse would complain of dead code on the writer.close() line
        private static boolean infinite = true;

        public SocketResponder(Socket socket)
        {
            this.socket = socket;
        }       

        @Override
        public void run()
        {
            try
            {               
                PrintWriter writer = new PrintWriter(socket.getOutputStream()); 
                writer.write("Hello");              

                //Simulating slow response/getting stuck in an infinite loop/waiting something that never happens etc.
                do
                {
                    Thread.sleep(5000);
                }
                while(infinite);

                writer.close(); //The socket will stay in CLOSE_WAIT from server side until this line is reached
            }
            catch(Exception e)
            {
                e.printStackTrace();
            }           

            System.out.println("DONE");
        }
    }

    public static void main(String[] args) throws IOException
    {
        ServerSocket serverSocket = new ServerSocket(12345);

        while(true)
        {
            Socket socket = serverSocket.accept();
            Thread t = new Thread(new SocketResponder(socket));
            t.start();
        }       
    }
}

With the infinite variable set to true, the PrintWriter (and the underlying socket) never gets closed because of the infinite loop. If I run this, connect to the socket with telnet, and then quit the telnet client, netstat shows the server-side socket still in the CLOSE_WAIT state (I could also see the client-side socket in the FIN_WAIT2 state for a while, but it disappears):

~$ netstat -anp | grep 12345
tcp6       0      0 :::12345        :::*            LISTEN      6460/java       
tcp6       1      0 ::1:12345       ::1:34606       CLOSE_WAIT  6460/java   

The server-side accepted socket is stuck in the CLOSE_WAIT state. If I check the thread stacks for the process, I can see the thread waiting inside the do...while loop:

~$ jstack 6460

<OTHER THREADS>

"Thread-0" prio=10 tid=0x00007f424013d800 nid=0x194f waiting on condition [0x00007f423c50e000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
    at java.lang.Thread.sleep(Native Method)
    at TestSocketClosedWaitState$SocketResponder.run(TestSocketClosedWaitState.java:32)
    at java.lang.Thread.run(Thread.java:701)

<OTHER THREADS...>

If I set the infinite variable to false and do the same (connect a client and disconnect), the socket shows up in the CLOSE_WAIT state until the writer is closed (which closes the underlying socket), and then disappears. If the writer or socket is never closed, the server-side socket again gets stuck in CLOSE_WAIT, even if the thread terminates. I don't think this should occur in Jetty: if your method returns at some point, Jetty should probably take care of closing the socket.
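
For a hand-rolled handler like the SocketResponder above, one way to guarantee that the socket is always closed is try-with-resources combined with a read timeout. This is only a minimal sketch of that idea, not what Jetty does internally; the class name ClosingResponder, the closedCleanly flag and the 5-second timeout are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical variant of SocketResponder that cannot leave its socket open.
final class ClosingResponder implements Runnable
{
    private final Socket socket;
    private volatile boolean closedCleanly = false;

    ClosingResponder(Socket socket)
    {
        this.socket = socket;
    }

    boolean closedCleanly()
    {
        return closedCleanly;
    }

    @Override
    public void run()
    {
        // try-with-resources closes the socket on every exit path, including exceptions
        try (Socket s = socket)
        {
            s.setSoTimeout(5000); // assumed limit: a blocked read gives up after 5 seconds
            PrintWriter writer = new PrintWriter(s.getOutputStream());
            writer.write("Hello");
            writer.flush();

            InputStream in = s.getInputStream();
            // read() returns -1 once the peer has closed its end, which is the
            // point at which this socket enters CLOSE_WAIT
            while (in.read() != -1)
            {
                // this sketch simply discards client input
            }
        }
        catch (SocketTimeoutException e)
        {
            System.err.println("Client idle for too long, closing: " + e.getMessage());
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
        closedCleanly = socket.isClosed();
    }

    public static void main(String[] args) throws IOException
    {
        try (ServerSocket serverSocket = new ServerSocket(12345))
        {
            while (true)
            {
                new Thread(new ClosingResponder(serverSocket.accept())).start();
            }
        }
    }
}
```

With this variant, quitting the telnet client makes read() return -1, the try block ends, and the socket is closed, so it should not linger in CLOSE_WAIT. The timeout only bounds blocking reads; it does not help if the handler is stuck in its own loop, which is the case the original example simulates.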

So, the steps I'd suggest to find the culprit are:

  • Add logging to your methods to see where they are going and what they are doing
  • Check your code: are there any places where execution could get stuck in an infinite loop or take a really long time, preventing the underlying socket from being closed?
  • If it still occurs, take a thread dump of the running Jetty process with jstack the next time the problem occurs and try to identify any "stuck" threads
  • Is there a chance something might throw something (OutOfMemoryError or such) that is not caught by the underlying Jetty architecture calling your method? I've never peeked inside Jetty's internals, and it could very well be catching Throwables, so this is probably not the issue, but it may be worth checking if all else fails
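
If running jstack on the server is awkward, a similar dump can be produced from inside the JVM with Thread.getAllStackTraces(). The following is a rough sketch of that approach, not a replacement for jstack; the class name StuckThreadReport, the dump method and the "worker-" name prefix are made up for this example.

```java
import java.util.Map;

// Hypothetical helper that prints a jstack-like report for selected threads.
final class StuckThreadReport
{
    // Returns the name, state and stack of every live thread whose name starts with namePrefix.
    static String dump(String namePrefix)
    {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet())
        {
            Thread thread = entry.getKey();
            if (!thread.getName().startsWith(namePrefix))
            {
                continue;
            }
            out.append('"').append(thread.getName()).append("\" state=")
               .append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue())
            {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) throws InterruptedException
    {
        // Demo thread that just sleeps, standing in for a stuck request handler
        Thread sleeper = new Thread(() -> {
            try { Thread.sleep(60000); } catch (InterruptedException ignored) { }
        }, "worker-demo");
        sleeper.setDaemon(true);
        sleeper.start();
        Thread.sleep(200);
        System.out.print(dump("worker-"));
    }
}
```

Calling dump from a scheduled task or an admin endpoint when the number of CLOSE_WAIT sockets starts to grow would show which request threads are still busy and where.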

You could also name the threads when they enter and exit your methods with something like

        String originalName = Thread.currentThread().getName();
        Thread.currentThread().setName("myMethod");

        //Your code...

        Thread.currentThread().setName(originalName);

to spot them more easily if there are a lot of threads running.
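
The renaming snippet above does not restore the original name if your code throws. A small wrapper with try/finally avoids that. This is a sketch under my own naming; ThreadNames and callNamed are hypothetical and not part of Jetty or the JDK.

```java
import java.util.concurrent.Callable;

// Hypothetical helper that renames the current thread for the duration of a call.
final class ThreadNames
{
    static <T> T callNamed(String name, Callable<T> body) throws Exception
    {
        Thread current = Thread.currentThread();
        String originalName = current.getName();
        current.setName(name);
        try
        {
            return body.call();
        }
        finally
        {
            // Runs on normal return and on exceptions alike
            current.setName(originalName);
        }
    }

    public static void main(String[] args) throws Exception
    {
        String seen = callNamed("myMethod", () -> Thread.currentThread().getName());
        System.out.println("inside: " + seen + ", after: " + Thread.currentThread().getName());
    }
}
```

In a thread dump, any thread still carrying the name "myMethod" is then one that entered the method and has not left it yet.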

esaj answered Sep 27 '22 19:09