Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cassandra nodes cannot see each other when internode encryption is enabled

I had set up a 6-node Cassandra cluster spanning two AWS regions / datacenters (3 in each) and everything was working fine. After getting that much working I attempted to enable internode encryption which I cannot get to work properly, despite reading innumerable documents on the subject and fiddling endlessly.

I don't see any errors or anything out of the ordinary in the logs. I do see the following line in the logs which indicates it has started the encrypted messaging service, as expected:

MessagingService.java:482 - Starting Encrypted Messaging Service on SSL port 7001

I have enabled verbose logging for SSL in cassandra-env.sh, however this does not produce any errors or additional information about SSL internode connections that I can see (update below):

JVM_OPTS="$JVM_OPTS -Djavax.net.debug=ssl"

I can connect to from one node to all the others on the encrypted messaging port 7001 using nc, so there's no firewall issue.

ubuntu@ip-5-6-7-8:~$ nc -v 1.2.3.4 7001
Connection to 1.2.3.4 7001 port [tcp/afs3-callback] succeeded!

I can connect to each node locally using cqlsh (I haven't enabled client-server encryption) and can query the system keyspace, etc.

However, if I run nodetool status I see that the nodes cannot see each other. Only the node that I'm querying the cluster on is present in the list. This was not the case before internode encryption was enabled, they could all see each other just fine then.

ubuntu@ip-5-6-7-8:~$ nodetool status
Datacenter: us-east_A
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens       Owns    Host ID                               Rack
UN  1.2.3.4        144.75 KB  256          ?       992ae1bc-77e4-4ab1-a18f-4db62bb0ce6f  1b

My process was this:

  • Created a certificate authority for my cluster
  • Created a keystore and truststore for each node and added my CA certificate chain to both
  • Generated a key pair and CSR for each node, signed it with my CA, and added the resulting certificate to each node's keystore
  • Updated each node's configuration as reads below
  • Restarted all nodes

The server encryption configuration I'm using is this, with the appropriate values in the $variables.

server_encryption_options:
    internode_encryption: all
    keystore: $keystore_path
    keystore_password: $keystore_passwd
    truststore: $truststore_path
    truststore_password: $truststore_passwd
    require_client_auth: true
    protocol: TLS
    algorithm: SunX509
    store_type: JKS
    cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]

If anybody could offer some insight or a direction to look in it would be greatly appreciated.

Update: Cipher Suite Agreement

Apparently SSL debug logging prints to stdout, which is not logged to Cassandra's logfiles, so I didn't see that output before. Running Cassandra in the foreground I can see a ton of SSL errors tracing out, all of which complain of handshake failure, because:

javax.net.ssl.SSLHandshakeException: no cipher suites in common

In an attempt to solve this problem I have switched to the Oracle JRE (I was being lazy and using OpenJDK before) and installed the JCE unlimited strength cryptography policy files to ensure all possible ciphers would be supported.

It didn't fix anything.

This is especially confusing given that all these nodes are exactly identical: hardware, OS vendor and version, Java vendor and version, Cassandra version, and configuration file. I cannot imagine why they cannot agree on a cipher suite under these circumstances.

The following is the full error that is traced:

*** ClientHello, TLSv1.2
RandomCookie:  GMT: 1449074039 bytes = { 205, 93, 27, 38, 184, 219, 250, 8, 232, 46, 117, 84, 69, 53, 225, 16, 27, 31, 3, 7, 203, 16, 133, 156, 137, 231, 238, 39 }
Session ID:  {}
Cipher Suites: [TLS_RSA_WITH_AES_256_CBC_SHA, TLS_DHE_RSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_RSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA, TLS_EMPTY_RENEGOTIATION_INFO_SCSV]
Compression Methods:  { 0 }
***
%% Initialized:  [Session-3, SSL_NULL_WITH_NULL_NULL]
%% Invalidated:  [Session-3, SSL_NULL_WITH_NULL_NULL]
ACCEPT-/1.2.3.4, SEND TLSv1.2 ALERT:  fatal, description = handshake_failure
ACCEPT-/1.2.3.4, WRITE: TLSv1.2 Alert, length = 2
ACCEPT-/1.2.3.4, called closeSocket()
ACCEPT-/1.2.3.4, handling exception: javax.net.ssl.SSLHandshakeException: no cipher suites in common
ACCEPT-/1.2.3.4, called close()
ACCEPT-/1.2.3.4, called closeInternal(true)
INFO  16:33:59 Waiting for gossip to settle before accepting client requests...
Allow unsafe renegotiation: false
Allow legacy hello messages: true
Is initial handshake: true
Is secure renegotiation: false
ACCEPT-/1.2.3.4, setSoTimeout(10000) called
ACCEPT-/1.2.3.4, READ:  SSL v2, contentType = Handshake, translated length = 57
like image 975
BWW Avatar asked Dec 02 '15 15:12

BWW


Video Answer


1 Answers

After a great deal more poking and prodding I've finally managed to get this to work. The problem was related to certificates and the keystore.

As a result of these problems the SSL handshake would fail either due to certificate chain problems or cipher suite agreement problems. Cassandra rather unhelpfully discards errors related to SSL and logs nothing.

In any case, I managed to get things working by doing the following:

  • Ensure that the CA generates node certificates with both client and server key usage attributes. Failing to include one or the other will prevent nodes from authenticating to each other properly. This presents itself as the cipher suite agreement error. If you're using OpenSSL to manage your CA, I've included the -extensions configuration I used below.
  • Ensure that both the root and any intermediate CA certificates you are using (if you're using an intermediary CA) are imported into both the keystore and truststore.
  • Ensure that the node certificate imported into the keystore includes the full trust chain from the primary certificate down to the CA root, including any intermediaries – even though you have already imported these CA certificates separately into the keystore. Failing to do this presents itself as an invalid certificate chain errors.

OpenSSL CA Config

Here's my extensions section for dual-role client/server certificates. You can include this in your OpenSSL config file and reference it when signing by specifying -extensions dual_cert.

[ dual_cert ]
# Extensions for dual-role user/server certificates (`man x509v3_config`).
basicConstraints = CA:FALSE
nsCertType = client, server
nsComment = "Client/Server Dual-role Certificate"
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid,issuer:always
keyUsage = critical, nonRepudiation, digitalSignature, keyEncipherment
extendedKeyUsage = clientAuth, serverAuth

Creating a PEM containing the full trust chain

To create a single PEM file which contains the full trust chain for your node certificate, simply cat all the certificate files in reverse order from the node certificate down to the CA root.

cat node1.crt ca-intermediate.crt ca-root.crt > node1-full-chain.crt
like image 115
BWW Avatar answered Sep 19 '22 12:09

BWW