
SSH Server Identification never received - Handshake Deadlock [SSHJ]

We're having trouble implementing a pool of SFTP connections for our application.

We're currently using SSHJ (Schmizz) as the transport library, and we are facing an issue that we simply cannot reproduce in our development environment (but the error keeps showing up randomly in production, sometimes after three days, sometimes after just 10 minutes).

The problem is that, when trying to send a file via SFTP, the thread gets stuck in the init method of SSHJ's TransportImpl class:

@Override
public void init(String remoteHost, int remotePort, InputStream in, OutputStream out)
        throws TransportException {
    connInfo = new ConnInfo(remoteHost, remotePort, in, out);

    try {
        if (config.isWaitForServerIdentBeforeSendingClientIdent()) {
            receiveServerIdent();
            sendClientIdent();
        } else {
            sendClientIdent();
            receiveServerIdent();
        }

        log.info("Server identity string: {}", serverID);
    } catch (IOException e) {
        throw new TransportException(e);
    }

    reader.start();
}

isWaitForServerIdentBeforeSendingClientIdent is false for us, so the client (us) sends its identification first, as the logs show:

"Client identity String: blabla"

Then it is receiveServerIdent's turn:

private void receiveServerIdent() throws IOException {
    final Buffer.PlainBuffer buf = new Buffer.PlainBuffer();
    while ((serverID = readIdentification(buf)).isEmpty()) {
        int b = connInfo.in.read();
        if (b == -1)
            throw new TransportException("Server closed connection during identification exchange");
        buf.putByte((byte) b);
    }
}

The thread never gets control back, because the server never replies with its identity. The code seems to be stuck in this while loop. No timeout or SSH exception is thrown; the client just keeps waiting forever, and the thread is effectively deadlocked.

This is the implementation of the readIdentification method:

private String readIdentification(Buffer.PlainBuffer buffer)
        throws IOException {
    String ident = new IdentificationStringParser(buffer, loggerFactory).parseIdentificationString();
    if (ident.isEmpty()) {
        return ident;
    }

    if (!ident.startsWith("SSH-2.0-") && !ident.startsWith("SSH-1.99-"))
        throw new TransportException(DisconnectReason.PROTOCOL_VERSION_NOT_SUPPORTED,
                                     "Server does not support SSHv2, identified as: " + ident);

    return ident;
}

It seems that ConnInfo's input stream never gets any data to read, as if the server had closed the connection (even though, as said earlier, no exception is thrown).

I've tried to reproduce this error by saturating the negotiation, closing sockets while connecting, and using conntrack to kill established connections while the handshake is in progress, but with no luck at all, so any help would be HIGHLY appreciated.

: )

aran asked Jan 13 '17 08:01


2 Answers

I bet the following code is causing the problem:

String ident = new IdentificationStringParser(buffer, loggerFactory).parseIdentificationString();
if (ident.isEmpty()) {
    return ident;
}

If IdentificationStringParser.parseIdentificationString() returns an empty string, that empty string is returned to the caller. The caller keeps evaluating while ((serverID = readIdentification(buf)).isEmpty()) for as long as the string stays empty. The only way out of the loop is for int b = connInfo.in.read(); to return -1, but if the server keeps sending (or resending) data, that condition is never met.

If this is the case, I would add some artificial way to detect it, for example:

// The caller creates one AtomicInteger per handshake and passes it on every loop iteration.
private String readIdentification(Buffer.PlainBuffer buffer, AtomicInteger numberOfAttempts)
        throws IOException {
    String ident = new IdentificationStringParser(buffer, loggerFactory).parseIdentificationString();

    if (ident.isEmpty()) {
        // Give up after 1000 attempts (arbitrary limit) instead of looping forever.
        if (numberOfAttempts.incrementAndGet() >= 1000) {
            throw new TransportException("Too many attempts to read the server ident");
        }
        return ident;
    }

    if (!ident.startsWith("SSH-2.0-") && !ident.startsWith("SSH-1.99-"))
        throw new TransportException(DisconnectReason.PROTOCOL_VERSION_NOT_SUPPORTED,
                                     "Server does not support SSHv2, identified as: " + ident);

    return ident;
}

This way you would at least confirm that this is the case, and you could then dig further into why parseIdentificationString() returns an empty string.
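The attempt counter only helps if the loop is actually spinning. If the thread is instead parked inside connInfo.in.read() because no byte ever arrives, a socket read timeout is the usual way to turn the silent hang into an exception. Below is a minimal sketch using plain java.net sockets, not SSHJ itself: IdentReadTimeoutDemo and readTimesOut are hypothetical names, and the "server" is just a local listener that completes the TCP handshake but never sends an identification string. As far as I know, SSHJ exposes a similar setting through SSHClient.setTimeout(int), but treat that as an assumption to verify against your version.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; not part of SSHJ.
class IdentReadTimeoutDemo {

    /**
     * Connects to a local listener that never sends anything, then tries to
     * read one byte. Returns true if the read was ended by the socket timeout.
     */
    static boolean readTimesOut(int soTimeoutMillis) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket silentServer = new ServerSocket(0, 1, loopback);
             Socket client = new Socket(loopback, silentServer.getLocalPort())) {
            // A value of 0 would mean "block forever", which matches the hang in the question.
            client.setSoTimeout(soTimeoutMillis);
            InputStream in = client.getInputStream();
            try {
                in.read(); // same kind of blocking call as connInfo.in.read()
                return false;
            } catch (SocketTimeoutException e) {
                return true;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("read timed out: " + readTimesOut(200));
    }
}
```

With a timeout in place, the pooled thread gets an IOException it can log and retry on, rather than waiting indefinitely for a server ident that never comes.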

walkeros answered Nov 14 '22 06:11


Faced a similar issue where we would see:

INFO [net.schmizz.sshj.transport.TransportImpl : pool-6-thread-2] - Client identity string: blablabla

INFO [net.schmizz.sshj.transport.TransportImpl : pool-6-thread-2] - Server identity string: blablabla

But on some occasions, there was no server response. Our service would typically wake up and transfer several files simultaneously, one file per connection / thread.

The issue was in the sshd server config: we increased MaxStartups from its default start value of 10 (we noticed that the problems started shortly after batch sizes grew above 10).

Default in /etc/ssh/sshd_config:

MaxStartups 10:30:100

Changed to:

MaxStartups 30:30:100

MaxStartups

Specifies the maximum number of concurrent unauthenticated connections to the SSH daemon. Additional connections will be dropped until authentication succeeds or the LoginGraceTime expires for a connection. The default is 10:30:100. Alternatively, random early drop can be enabled by specifying the three colon separated values start:rate:full (e.g. "10:30:60"). sshd will refuse connection attempts with a probability of rate/100 (30%) if there are currently start (10) unauthenticated connections. The probability increases linearly and all connection attempts are refused if the number of unauthenticated connections reaches full (60).
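To make the start:rate:full arithmetic concrete, here is a small sketch of the refusal probability as the description above states it. It is a model of the documented behaviour, not sshd's actual code, and MaxStartupsModel and dropPercent are hypothetical names.

```java
// Hypothetical helper that mirrors the man page description; not sshd's code.
class MaxStartupsModel {

    /** Refusal probability in percent for a given number of unauthenticated connections. */
    static int dropPercent(int start, int rate, int full, int unauthenticated) {
        if (unauthenticated < start) {
            return 0;   // below "start" nothing is refused
        }
        if (unauthenticated >= full) {
            return 100; // at "full" every new attempt is refused
        }
        // Linear interpolation from rate% at "start" up to 100% at "full".
        return rate + (100 - rate) * (unauthenticated - start) / (full - start);
    }

    public static void main(String[] args) {
        // With the default 10:30:100, a batch of 12 simultaneous connections
        // already puts the later attempts into the random-drop range.
        for (int n : new int[] {9, 10, 12, 55, 100}) {
            System.out.println(n + " unauthenticated -> " + dropPercent(10, 30, 100, n) + "%");
        }
    }
}
```

Under this model, raising start from 10 to 30 means a batch of up to 29 simultaneous handshakes is never subject to random drops, which matches what we observed after the change.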

If you cannot control the server, you might have to find a way to limit your concurrent connection attempts in your client code instead.
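One way to do that is to put a semaphore around the connect-and-authenticate step, so that no more than a fixed number of handshakes are in flight at once. The sketch below uses only java.util.concurrent; ConnectThrottle and peakConcurrency are hypothetical names, and the "handshake" is simulated with a short sleep where the real SSHJ connect and authentication calls would go.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical wrapper; the real connect/auth code would run inside call().
class ConnectThrottle {
    private final Semaphore permits;

    ConnectThrottle(int maxConcurrentHandshakes) {
        this.permits = new Semaphore(maxConcurrentHandshakes);
    }

    /** Runs the handshake while holding a permit, releasing it even on failure. */
    <T> T call(Callable<T> connectAndAuthenticate) throws Exception {
        permits.acquire();
        try {
            return connectAndAuthenticate.call();
        } finally {
            permits.release();
        }
    }

    /** Runs simulated handshakes and returns the peak number in flight at once. */
    static int peakConcurrency(int limit, int threads, int tasks) throws Exception {
        ConnectThrottle throttle = new ConnectThrottle(limit);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();

        Callable<Void> fakeHandshake = () -> {
            peak.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
            Thread.sleep(20); // stands in for connect + authenticate
            inFlight.decrementAndGet();
            return null;
        };
        Callable<Void> throttled = () -> throttle.call(fakeHandshake);

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(pool.submit(throttled));
            }
            for (Future<Void> f : futures) {
                f.get(); // propagate any failure
            }
        } finally {
            pool.shutdown();
        }
        return peak.get();
    }
}
```

The permit only needs to be held until authentication completes, since MaxStartups counts unauthenticated connections; the file transfer itself can run outside the throttle. Keeping the limit below the server's start value should avoid the random drops.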

Moby answered Nov 14 '22 08:11