I had set up a 6-node Cassandra cluster spanning two AWS regions / datacenters (3 in each) and everything was working fine. After getting that much working I attempted to enable internode encryption which I cannot get to work properly, despite reading innumerable documents on the subject and fiddling endlessly.
I don't see any errors or anything out of the ordinary in the logs. I do see the following line in the logs which indicates it has started the encrypted messaging service, as expected:
MessagingService.java:482 - Starting Encrypted Messaging Service on SSL port 7001
I have enabled verbose logging for SSL in cassandra-env.sh
, however this does not produce any errors or additional information about SSL internode connections that I can see (update below):
JVM_OPTS="$JVM_OPTS -Djavax.net.debug=ssl"
I can connect to from one node to all the others on the encrypted messaging port 7001 using nc
, so there's no firewall issue.
ubuntu@ip-5-6-7-8:~$ nc -v 1.2.3.4 7001
Connection to 1.2.3.4 7001 port [tcp/afs3-callback] succeeded!
I can connect to each node locally using cqlsh
(I haven't enabled client-server encryption) and can query the system keyspace, etc.
However, if I run nodetool status
I see that the nodes cannot see each other. Only the node that I'm querying the cluster on is present in the list. This was not the case before internode encryption was enabled, they could all see each other just fine then.
ubuntu@ip-5-6-7-8:~$ nodetool status
Datacenter: us-east_A
=====================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
-- Address Load Tokens Owns Host ID Rack
UN 1.2.3.4 144.75 KB 256 ? 992ae1bc-77e4-4ab1-a18f-4db62bb0ce6f 1b
My process was this:
The server encryption configuration I'm using is this, with the appropriate values in the $variables
.
server_encryption_options:
internode_encryption: all
keystore: $keystore_path
keystore_password: $keystore_passwd
truststore: $truststore_path
truststore_password: $truststore_passwd
require_client_auth: true
protocol: TLS
algorithm: SunX509
store_type: JKS
cipher_suites: [TLS_RSA_WITH_AES_128_CBC_SHA,TLS_RSA_WITH_AES_256_CBC_SHA,TLS_DHE_RSA_WITH_AES_128_CBC_SHA,TLS_DHE_RSA_WITH_AES_256_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA]
If anybody could offer some insight or a direction to look in it would be greatly appreciated.
Apparently SSL debug logging prints to stdout, which is not logged to Cassandra's logfiles, so I didn't see that output before. Running Cassandra in the foreground I can see a ton of SSL errors tracing out, all of which complain of handshake failure, because:
javax.net.ssl.SSLHandshakeException: no cipher suites in common
In an attempt to solve this problem I have switched to the Oracle JRE (I was being lazy and using OpenJDK before) and installed the JCE unlimited strength cryptography policy files to ensure all possible ciphers would be supported.
It didn't fix anything.
This is especially confusing given that all these nodes are exactly identical: hardware, OS vendor and version, Java vendor and version, Cassandra version, and configuration file. I cannot imagine why they cannot agree on a cipher suite under these circumstances.
The following is the full error that is traced:
*** ClientHello, TLSv1.2
RandomCookie: GMT: 1449074039 bytes = { 205, 93, 27, 38, 184, 219, 250, 8, 232, 46, 117, 84, 69, 53, 225, 16, 27, 31, 3, 7, 203, 16, 133, 156, 137, 231, 238, 39 }
Session ID: {}
Cipher Suites: [TLS_RSA_WITH_AES_256_CBC_SHA, TLS_DHE_RSA_WITH_AES_128_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA, TLS_RSA_WITH_AES_128_CBC_SHA, TLS_DHE_RSA_WITH_AES_256_CBC_SHA, TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA, TLS_EMPTY_RENEGOTIATION_INFO_SCSV]
Compression Methods: { 0 }
***
%% Initialized: [Session-3, SSL_NULL_WITH_NULL_NULL]
%% Invalidated: [Session-3, SSL_NULL_WITH_NULL_NULL]
ACCEPT-/1.2.3.4, SEND TLSv1.2 ALERT: fatal, description = handshake_failure
ACCEPT-/1.2.3.4, WRITE: TLSv1.2 Alert, length = 2
ACCEPT-/1.2.3.4, called closeSocket()
ACCEPT-/1.2.3.4, handling exception: javax.net.ssl.SSLHandshakeException: no cipher suites in common
ACCEPT-/1.2.3.4, called close()
ACCEPT-/1.2.3.4, called closeInternal(true)
INFO 16:33:59 Waiting for gossip to settle before accepting client requests...
Allow unsafe renegotiation: false
Allow legacy hello messages: true
Is initial handshake: true
Is secure renegotiation: false
ACCEPT-/1.2.3.4, setSoTimeout(10000) called
ACCEPT-/1.2.3.4, READ: SSL v2, contentType = Handshake, translated length = 57
After a great deal more poking and prodding I've finally managed to get this to work. The problem was related to certificates and the keystore.
As a result of these problems the SSL handshake would fail either due to certificate chain problems or cipher suite agreement problems. Cassandra rather unhelpfully discards errors related to SSL and logs nothing.
In any case, I managed to get things working by doing the following:
-extensions
configuration I used below.Here's my extensions section for dual-role client/server certificates. You can include this in your OpenSSL config file and reference it when signing by specifying -extensions dual_cert
.
[ dual_cert ]
# Extensions for dual-role user/server certificates (`man x509v3_config`).
basicConstraints = CA:FALSE
nsCertType = client, server
nsComment = "Client/Server Dual-role Certificate"
subjectKeyIdentifier = hash
authorityKeyIdentifier = keyid,issuer:always
keyUsage = critical, nonRepudiation, digitalSignature, keyEncipherment
extendedKeyUsage = clientAuth, serverAuth
To create a single PEM file which contains the full trust chain for your node certificate, simply cat
all the certificate files in reverse order from the node certificate down to the CA root.
cat node1.crt ca-intermediate.crt ca-root.crt > node1-full-chain.crt
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With