
Why is there a handshake failure when trying to run TLS over TLS with this code?

I tried to implement a protocol that can run TLS over TLS using twisted.protocols.tls, an interface to OpenSSL using a memory BIO.

I implemented this as a protocol wrapper that mostly looks like a regular TCP transport, but which has startTLS and stopTLS methods for adding and removing a layer of TLS respectively. This works fine for the first layer of TLS. It also works fine if I run it over a "native" Twisted TLS transport. However, if I try to add a second TLS layer using the startTLS method provided by this wrapper, there's immediately a handshake error and the connection ends up in some unknown unusable state.

The wrapper and the two helpers that let it work look like this:

    from twisted.python.components import proxyForInterface
    from twisted.internet.error import ConnectionDone
    from twisted.internet.interfaces import ITCPTransport, IProtocol
    from twisted.protocols.tls import TLSMemoryBIOFactory, TLSMemoryBIOProtocol
    from twisted.protocols.policies import ProtocolWrapper, WrappingFactory


    class TransportWithoutDisconnection(proxyForInterface(ITCPTransport)):
        """
        A proxy for a normal transport that disables actually closing the connection.
        This is necessary so that when TLSMemoryBIOProtocol notices the SSL EOF it
        doesn't actually close the underlying connection.

        All methods except loseConnection are proxied directly to the real transport.
        """
        def loseConnection(self):
            pass


    class ProtocolWithoutConnectionLost(proxyForInterface(IProtocol)):
        """
        A proxy for a normal protocol which captures clean connection shutdown
        notification and sends it to the TLS stacking code instead of the protocol.
        When TLS is shutdown cleanly, this notification will arrive.  Instead of telling
        the protocol that the entire connection is gone, the notification is used to
        unstack the TLS code in OnionProtocol and hidden from the wrapped protocol.  Any
        other kind of connection shutdown (SSL handshake error, network hiccups, etc) are
        treated as real problems and propagated to the wrapped protocol.
        """
        def connectionLost(self, reason):
            if reason.check(ConnectionDone):
                self.onion._stopped()
            else:
                super(ProtocolWithoutConnectionLost, self).connectionLost(reason)


    class OnionProtocol(ProtocolWrapper):
        """
        OnionProtocol is both a transport and a protocol.  As a protocol, it can run over
        any other ITransport.  As a transport, it implements stackable TLS.  That is,
        whatever application traffic is generated by the protocol running on top of
        OnionProtocol can be encapsulated in a TLS conversation.  Or, that TLS conversation
        can be encapsulated in another TLS conversation.  Or **that** TLS conversation can
        be encapsulated in yet *another* TLS conversation.

        Each layer of TLS can use different connection parameters, such as keys, ciphers,
        certificate requirements, etc.  At the remote end of this connection, each has to
        be decrypted separately, starting at the outermost and working in.  OnionProtocol
        can do this itself, of course, just as it can encrypt each layer starting with the
        innermost.
        """
        def makeConnection(self, transport):
            self._tlsStack = []
            ProtocolWrapper.makeConnection(self, transport)


        def startTLS(self, contextFactory, client, bytes=None):
            """
            Add a layer of TLS, with SSL parameters defined by the given contextFactory.

            If *client* is True, this side of the connection will be an SSL client.
            Otherwise it will be an SSL server.

            If extra bytes which may be (or almost certainly are) part of the SSL handshake
            were received by the protocol running on top of OnionProtocol, they must be
            passed here as the **bytes** parameter.
            """
            # First, create a wrapper around the application-level protocol
            # (wrappedProtocol) which can catch connectionLost and tell this OnionProtocol
            # about it.  This is necessary to pop from _tlsStack when the outermost TLS
            # layer stops.
            connLost = ProtocolWithoutConnectionLost(self.wrappedProtocol)
            connLost.onion = self
            # Construct a new TLS layer, delivering events and application data to the
            # wrapper just created.
            tlsProtocol = TLSMemoryBIOProtocol(None, connLost, False)
            tlsProtocol.factory = TLSMemoryBIOFactory(contextFactory, client, None)

            # Push the previous transport and protocol onto the stack so they can be
            # retrieved when this new TLS layer stops.
            self._tlsStack.append((self.transport, self.wrappedProtocol))

            # Create a transport for the new TLS layer to talk to.  This is a passthrough
            # to the OnionProtocol's current transport, except for capturing loseConnection
            # to avoid really closing the underlying connection.
            transport = TransportWithoutDisconnection(self.transport)

            # Make the new TLS layer the current protocol and transport.
            self.wrappedProtocol = self.transport = tlsProtocol

            # And connect the new TLS layer to the previous outermost transport.
            self.transport.makeConnection(transport)

            # If the application accidentally got some bytes from the TLS handshake, deliver
            # them to the new TLS layer.
            if bytes is not None:
                self.wrappedProtocol.dataReceived(bytes)


        def stopTLS(self):
            """
            Remove a layer of TLS.
            """
            # Just tell the current TLS layer to shut down.  When it has done so, we'll get
            # notification in *_stopped*.
            self.transport.loseConnection()


        def _stopped(self):
            # A TLS layer has completely shut down.  Throw it away and move back to the
            # TLS layer it was wrapping (or possibly back to the original non-TLS
            # transport).
            self.transport, self.wrappedProtocol = self._tlsStack.pop()

I have simple client and server programs for exercising this, available from launchpad (bzr branch lp:~exarkun/+junk/onion). When I use it to call the startTLS method above twice, with no intervening call to stopTLS, this OpenSSL error comes up:

OpenSSL.SSL.Error: [('SSL routines', 'SSL23_GET_SERVER_HELLO', 'unknown protocol')] 

Why do things go wrong?

asked Feb 26 '11 22:02 by Jean-Paul Calderone



2 Answers

There are at least two problems with OnionProtocol:

  1. The innermost TLSMemoryBIOProtocol becomes the wrappedProtocol, when it should be the outermost;
  2. ProtocolWithoutConnectionLost does not pop any TLSMemoryBIOProtocols off OnionProtocol's stack, because connectionLost is only called after a FileDescriptor's doRead or doWrite method returns a reason for disconnection.

We can't solve the first problem without changing the way OnionProtocol manages its stack, and we can't solve the second until we figure out the new stack implementation. Unsurprisingly, the correct design is a direct consequence of how data flows within Twisted, so we'll start with some data flow analysis.

Twisted represents an established connection with an instance of either twisted.internet.tcp.Server or twisted.internet.tcp.Client. Since the only interactivity in our program happens in stoptls_client, we'll only consider the data flow to and from a Client instance.

Let's warm up with a minimal LineReceiver client that echoes back lines received from a local server on port 9999:

    from twisted.protocols import basic
    from twisted.internet import defer, endpoints, protocol, task

    class LineReceiver(basic.LineReceiver):
        def lineReceived(self, line):
            self.sendLine(line)

    def main(reactor):
        clientEndpoint = endpoints.clientFromString(
            reactor, "tcp:localhost:9999")
        connected = clientEndpoint.connect(
            protocol.ClientFactory.forProtocol(LineReceiver))
        def waitForever(_):
            return defer.Deferred()
        return connected.addCallback(waitForever)

    task.react(main)

Once the connection is established, a Client becomes our LineReceiver protocol's transport and mediates input and output:

[Figure: Client and LineReceiver]

New data from the server causes the reactor to call the Client's doRead method, which in turn passes what it's received to LineReceiver's dataReceived method. Finally, LineReceiver.dataReceived calls LineReceiver.lineReceived when at least one line is available.

Our application sends a line of data back to the server by calling LineReceiver.sendLine. This calls write on the transport bound to the protocol instance, which is the same Client instance that handled incoming data. Client.write arranges for the data to be sent by the reactor, while Client.doWrite actually sends the data over the socket.
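The read and write paths described above can be mimicked without a reactor or a socket. This is a minimal sketch, not Twisted code: ToyTransport and ToyLineEcho are hypothetical stand-ins for Client and LineReceiver, and the `sent` list stands in for the socket.

```python
class ToyTransport:
    """Hypothetical stand-in for twisted.internet.tcp.Client."""

    def __init__(self, protocol):
        self.protocol = protocol
        self.sent = []  # stands in for bytes written to the socket
        protocol.transport = self

    def doRead(self, data):
        # In Twisted the reactor calls doRead when the socket is readable;
        # here the caller hands in the bytes directly.
        self.protocol.dataReceived(data)

    def write(self, data):
        self.sent.append(data)


class ToyLineEcho:
    """Hypothetical stand-in for the echoing LineReceiver subclass."""

    delimiter = b"\r\n"

    def __init__(self):
        self._buffer = b""
        self.transport = None

    def dataReceived(self, data):
        # Buffer incoming bytes and dispatch every complete line.
        self._buffer += data
        while self.delimiter in self._buffer:
            line, self._buffer = self._buffer.split(self.delimiter, 1)
            self.lineReceived(line)

    def lineReceived(self, line):
        self.sendLine(line)

    def sendLine(self, line):
        self.transport.write(line + self.delimiter)


transport = ToyTransport(ToyLineEcho())
transport.doRead(b"hel")
transport.doRead(b"lo\r\nwor")
transport.doRead(b"ld\r\n")
```

Lines are echoed only once they are complete, regardless of how the bytes were split across reads.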

We're ready to look at the behaviors of an OnionClient that never calls startTLS:

[Figure: OnionClient without startTLS]

OnionClients are wrapped in OnionProtocols, which are the crux of our attempt at nested TLS. As a subclass of twisted.protocols.policies.ProtocolWrapper, an instance of OnionProtocol is a kind of protocol-transport sandwich; it presents itself as a protocol to a lower-level transport and as a transport to the protocol it wraps, through a masquerade established at connection time by a WrappingFactory.

Now, Client.doRead calls OnionProtocol.dataReceived, which proxies the data through to OnionClient. As OnionClient's transport, OnionProtocol.write accepts lines to send from OnionClient.sendLine and proxies them down to Client, its own transport. This is the normal interaction between a ProtocolWrapper, its wrapped protocol, and its own transport, so naturally data flows to and from each without any trouble.
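The sandwich behavior can be sketched in a few lines. The classes below are hypothetical stand-ins (ToyWire for Client, ToyWrapper for ProtocolWrapper, ToyApp for OnionClient) and model only the proxying, not Twisted's actual implementation.

```python
class ToyWire:
    """Hypothetical stand-in for Client: records what would reach the socket."""

    def __init__(self):
        self.written = []

    def write(self, data):
        self.written.append(data)


class ToyApp:
    """Hypothetical stand-in for the application protocol (OnionClient)."""

    def __init__(self):
        self.received = []
        self.transport = None

    def makeConnection(self, transport):
        self.transport = transport

    def dataReceived(self, data):
        self.received.append(data)


class ToyWrapper:
    """Hypothetical stand-in for ProtocolWrapper."""

    def __init__(self, wrappedProtocol):
        self.wrappedProtocol = wrappedProtocol
        self.transport = None

    def makeConnection(self, transport):
        # Remember the real transport, then pose as the transport
        # of the wrapped protocol.
        self.transport = transport
        self.wrappedProtocol.makeConnection(self)

    def dataReceived(self, data):
        # Protocol side: pass incoming data up to the wrapped protocol.
        self.wrappedProtocol.dataReceived(data)

    def write(self, data):
        # Transport side: pass outgoing data down to the real transport.
        self.transport.write(data)


wire = ToyWire()
app = ToyApp()
wrapper = ToyWrapper(app)
wrapper.makeConnection(wire)

wrapper.dataReceived(b"from server")
app.transport.write(b"from app")
```

The application never sees the real transport; everything it writes passes through the wrapper first.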

OnionProtocol.startTLS does something different. It attempts to interpose a new ProtocolWrapper — which happens to be a TLSMemoryBIOProtocol — between an established protocol-transport pair. This seems easy enough: a ProtocolWrapper stores the upper-level protocol as its wrappedProtocol attribute, and proxies write and other attributes down to its own transport. startTLS should be able to inject a new TLSMemoryBIOProtocol that wraps OnionClient into the connection by patching that instance over its own wrappedProtocol and transport:

    def startTLS(self):
        ...
        connLost = ProtocolWithoutConnectionLost(self.wrappedProtocol)
        connLost.onion = self
        # Construct a new TLS layer, delivering events and application data to the
        # wrapper just created.
        tlsProtocol = TLSMemoryBIOProtocol(None, connLost, False)

        # Push the previous transport and protocol onto the stack so they can be
        # retrieved when this new TLS layer stops.
        self._tlsStack.append((self.transport, self.wrappedProtocol))
        ...
        # Make the new TLS layer the current protocol and transport.
        self.wrappedProtocol = self.transport = tlsProtocol

Here's the flow of data after the first call to startTLS:

[Figure: startTLS with one TLSMemoryBIOProtocol, working]

As expected, new data delivered to OnionProtocol.dataReceived is routed to the TLSMemoryBIOProtocol stored on the _tlsStack, which passes the decrypted plaintext to OnionClient.dataReceived. OnionClient.sendLine also passes its data to TLSMemoryBIOProtocol.write, which encrypts it and sends the resulting ciphertext to OnionProtocol.write and then Client.write.

Unfortunately this scheme fails after a second call to startTLS. The root cause is this line:

    self.wrappedProtocol = self.transport = tlsProtocol 

Each call to startTLS replaces the wrappedProtocol with the innermost TLSMemoryBIOProtocol, even though the data received by Client.doRead was encrypted by the outermost:

[Figure: startTLS with two TLSMemoryBIOProtocols, broken]

The transports, however, are nested correctly. OnionClient.sendLine can only call its transport's write — that is, OnionProtocol.write — so OnionProtocol should replace its transport with the innermost TLSMemoryBIOProtocol to ensure writes are successively nested inside additional layers of encryption.

The solution, then, is to ensure that data flows through the first TLSMemoryBIOProtocol on the _tlsStack to the next one in turn, so that each layer of encryption is peeled off in the reverse order it was applied:

[Figure: startTLS with two TLSMemoryBIOProtocols, working]
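The ordering problem can be reproduced without TLS at all. This is a minimal sketch, assuming a hypothetical ToyLayer that "encrypts" by prepending a tag and refuses input that lacks it; it models only the nesting order, not cryptography or the TLS handshake.

```python
class ToyLayer:
    """Hypothetical stand-in for one TLS session."""

    def __init__(self, name):
        self.tag = name.encode("ascii") + b":"

    def encrypt(self, plaintext):
        return self.tag + plaintext

    def decrypt(self, ciphertext):
        # A real TLS session fails in a similar spirit when handed bytes
        # it did not produce.
        if not ciphertext.startswith(self.tag):
            raise ValueError("unknown protocol")
        return ciphertext[len(self.tag):]


outer = ToyLayer("outer")  # created by the first startTLS call
inner = ToyLayer("inner")  # created by the second startTLS call

# Writes nest correctly: the innermost layer encrypts first.
wire = outer.encrypt(inner.encrypt(b"hello"))

# Buggy read path: incoming bytes go straight to the innermost layer.
try:
    inner.decrypt(wire)
except ValueError as error:
    buggy_result = str(error)
else:
    buggy_result = "ok"

# Correct read path: peel the layers off in the reverse order they were applied.
correct_result = inner.decrypt(outer.decrypt(wire))
```

The buggy path fails immediately because the innermost layer is handed bytes still wrapped by the outer layer, while the correct path recovers the plaintext.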

Representing _tlsStack as a list seems less natural given this new requirement. Fortunately, representing the incoming data flow linearly suggests a new data structure:

[Figure: Incoming data as a linked list traversal]

Both the buggy and correct flows of incoming data resemble a singly-linked list, with wrappedProtocol serving as the ProtocolWrappers' next link and protocol serving as Client's. The list should grow downward from OnionProtocol and always end with OnionClient. The bug occurs because that ordering invariant is violated.

A singly-linked list is fine for pushing protocols onto the stack but awkward for popping them off, because it requires traversal downwards from its head to the node to remove. Of course, this traversal happens every time data's received, so the concern is the complexity implied by an additional traversal rather than worst-case time complexity. Fortunately, the list is actually doubly linked:

[Figure: Doubly linked list with protocols and transports]

The transport attribute links each nested protocol with its predecessor, so that transport.write can layer on successively lower levels of encryption before finally sending the data across the network. We have two sentinels to aid in managing the list: Client must always be at the top and OnionClient must always be at the bottom.
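The list manipulation can be sketched on its own before it is tangled up with TLS. In this toy model, Node and ToyOnion are hypothetical names, the nodes carry only a label, and OnionProtocol itself is left out so that the two sentinels are the only fixed points.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.wrappedProtocol = None  # next link, toward the application
        self.transport = None        # previous link, toward the socket


class ToyOnion:
    def __init__(self):
        # Two sentinels: the socket side at the head and the
        # application protocol at the tail.
        self.head = Node("Client")
        self.tail = Node("OnionClient")
        self.head.wrappedProtocol = self.tail
        self.tail.transport = self.head

    def push(self, name):
        # Splice a new layer in just above the tail sentinel.
        node = Node(name)
        previous = self.tail.transport
        previous.wrappedProtocol = node
        node.transport = previous
        node.wrappedProtocol = self.tail
        self.tail.transport = node
        return node

    def pop(self):
        # Remove the innermost layer; no traversal from the head is needed.
        node = self.tail.transport
        if node is self.head:
            raise IndexError("no TLS layers to pop")
        node.transport.wrappedProtocol = self.tail
        self.tail.transport = node.transport
        return node

    def incoming(self):
        # Order in which received data visits each node.
        names, node = [], self.head
        while node is not None:
            names.append(node.name)
            node = node.wrappedProtocol
        return names

    def outgoing(self):
        # Order in which written data visits each node.
        names, node = [], self.tail
        while node is not None:
            names.append(node.name)
            node = node.transport
        return names


onion = ToyOnion()
onion.push("tls1")
onion.push("tls2")
flow_in = onion.incoming()
flow_out = onion.outgoing()
popped = onion.pop().name
flow_in_after_pop = onion.incoming()
```

Because each node keeps both links, popping the innermost layer touches only its two neighbors.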

Putting the two together, we end up with this:

    from twisted.python.components import proxyForInterface
    from twisted.internet.interfaces import ITCPTransport
    from twisted.protocols.tls import TLSMemoryBIOFactory, TLSMemoryBIOProtocol
    from twisted.protocols.policies import ProtocolWrapper, WrappingFactory


    class PopOnDisconnectTransport(proxyForInterface(ITCPTransport)):
        """
        L{TLSMemoryBIOProtocol.loseConnection} shuts down the TLS session
        and calls its own transport's C{loseConnection}.  A zero-length
        read also calls the transport's C{loseConnection}.  This proxy
        uses that behavior to invoke a C{pop} callback when a session has
        ended.  The callback is invoked exactly once because
        C{loseConnection} must be idempotent.
        """
        def __init__(self, pop, **kwargs):
            super(PopOnDisconnectTransport, self).__init__(**kwargs)
            self._pop = pop

        def loseConnection(self):
            self._pop()
            self._pop = lambda: None


    class OnionProtocol(ProtocolWrapper):
        """
        OnionProtocol is both a transport and a protocol.  As a protocol,
        it can run over any other ITransport.  As a transport, it
        implements stackable TLS.  That is, whatever application traffic
        is generated by the protocol running on top of OnionProtocol can
        be encapsulated in a TLS conversation.  Or, that TLS conversation
        can be encapsulated in another TLS conversation.  Or **that** TLS
        conversation can be encapsulated in yet *another* TLS
        conversation.

        Each layer of TLS can use different connection parameters, such as
        keys, ciphers, certificate requirements, etc.  At the remote end
        of this connection, each has to be decrypted separately, starting
        at the outermost and working in.  OnionProtocol can do this
        itself, of course, just as it can encrypt each layer starting with
        the innermost.
        """

        def __init__(self, *args, **kwargs):
            ProtocolWrapper.__init__(self, *args, **kwargs)
            # The application level protocol is the sentinel at the tail
            # of the linked list stack of protocol wrappers.  The stack
            # begins at this sentinel.
            self._tailProtocol = self._currentProtocol = self.wrappedProtocol


        def startTLS(self, contextFactory, client, bytes=None):
            """
            Add a layer of TLS, with SSL parameters defined by the given
            contextFactory.

            If *client* is True, this side of the connection will be an
            SSL client.  Otherwise it will be an SSL server.

            If extra bytes which may be (or almost certainly are) part of
            the SSL handshake were received by the protocol running on top
            of OnionProtocol, they must be passed here as the **bytes**
            parameter.
            """
            # The newest TLS session is spliced in between the previous
            # and the application protocol at the tail end of the list.
            tlsProtocol = TLSMemoryBIOProtocol(None, self._tailProtocol, False)
            tlsProtocol.factory = TLSMemoryBIOFactory(contextFactory, client, None)

            if self._currentProtocol is self._tailProtocol:
                # This is the first and thus outermost TLS session.  The
                # transport is the immutable sentinel that no startTLS or
                # stopTLS call will move within the linked list stack.
                # The wrappedProtocol will remain this outermost session
                # until it's terminated.
                self.wrappedProtocol = tlsProtocol
                nextTransport = PopOnDisconnectTransport(
                    original=self.transport,
                    pop=self._pop
                )
                # Store the proxied transport as the list's head sentinel
                # to enable an easy identity check in _pop.
                self._headTransport = nextTransport
            else:
                # This is a later TLS session within the stack.  The previous
                # TLS session becomes its transport.
                nextTransport = PopOnDisconnectTransport(
                    original=self._currentProtocol,
                    pop=self._pop
                )

            # Splice the new TLS session into the linked list stack.
            # wrappedProtocol serves as the link, so the protocol at the
            # current position takes our new TLS session as its
            # wrappedProtocol.
            self._currentProtocol.wrappedProtocol = tlsProtocol
            # Move down one position in the linked list.
            self._currentProtocol = tlsProtocol
            # Expose the new, innermost TLS session as the transport to
            # the application protocol.
            self.transport = self._currentProtocol
            # Connect the new TLS session to the previous transport.  The
            # transport attribute also serves as the previous link.
            tlsProtocol.makeConnection(nextTransport)

            # Left over bytes are part of the latest handshake.  Pass them
            # on to the innermost TLS session.
            if bytes is not None:
                tlsProtocol.dataReceived(bytes)


        def stopTLS(self):
            self.transport.loseConnection()


        def _pop(self):
            pop = self._currentProtocol
            previous = pop.transport
            # If the previous link is the head sentinel, we've run out of
            # linked list.  Ensure that the application protocol, stored
            # as the tail sentinel, becomes the wrappedProtocol, and the
            # head sentinel, which is the underlying transport, becomes
            # the transport.
            if previous is self._headTransport:
                self._currentProtocol = self.wrappedProtocol = self._tailProtocol
                self.transport = previous
            else:
                # Splice out a protocol from the linked list stack.  The
                # previous transport is a PopOnDisconnectTransport proxy,
                # so first retrieve the proxied object off its original
                # attribute.
                previousProtocol = previous.original
                # The previous protocol's next link becomes the popped
                # protocol's next link.
                previousProtocol.wrappedProtocol = pop.wrappedProtocol
                # Move up one position in the linked list.
                self._currentProtocol = previousProtocol
                # Expose the new, innermost TLS session as the transport
                # to the application protocol.
                self.transport = self._currentProtocol


    class OnionFactory(WrappingFactory):
        """
        A L{WrappingFactory} that overrides
        L{WrappingFactory.registerProtocol} and
        L{WrappingFactory.unregisterProtocol}.  These methods store in and
        remove from a dictionary L{ProtocolWrapper} instances.  The
        C{transport} patching done as part of the linked-list management
        above causes the instances' hash to change, because the
        C{__hash__} is proxied through to the wrapped transport.  They're
        not essential to this program, so the easiest solution is to make
        them do nothing.
        """
        protocol = OnionProtocol

        def registerProtocol(self, protocol):
            pass


        def unregisterProtocol(self, protocol):
            pass

(This is also available on GitHub.)

The solution to the second problem lies in PopOnDisconnectTransport. The original code attempted to pop off a TLS session from the stack via connectionLost, but because only a closed file descriptor causes connectionLost to be called, it failed to remove stopped TLS sessions that didn't close the underlying socket.

At the time of this writing, TLSMemoryBIOProtocol calls its transport's loseConnection in exactly two places: _shutdownTLS and _tlsShutdownFinished. _shutdownTLS is called on active closes (loseConnection, abortConnection, unregisterProducer and after loseConnection and all pending writes have been flushed), while _tlsShutdownFinished is called on passive closes (handshake failures, empty reads, read errors, and write errors). This all means that both sides of a closed connection can pop stopped TLS sessions off the stack during loseConnection. PopOnDisconnectTransport does this idempotently because loseConnection is generally idempotent, and TLSMemoryBIOProtocol certainly expects it to be.
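The call-once behavior is easy to check in isolation. A minimal sketch, assuming a hypothetical PopOnce class that mirrors only the loseConnection override and leaves out the ITCPTransport proxying:

```python
class PopOnce:
    """Hypothetical reduction of the pop-on-disconnect idea."""

    def __init__(self, pop):
        self._pop = pop

    def loseConnection(self):
        # Disarm before calling, so that a repeated or re-entrant
        # loseConnection cannot pop a second layer.
        pop, self._pop = self._pop, lambda: None
        pop()


layers = ["tls1", "tls2"]
transport = PopOnce(layers.pop)

transport.loseConnection()  # for example, from an active close
transport.loseConnection()  # for example, from a later passive close
```

Swapping in the no-op before invoking the callback is a small variation on the ordering in the answer's code. It is shown as a defensive option, not as a fix for an observed problem.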

The downside to putting stack management logic in loseConnection is that it depends on the particulars of TLSMemoryBIOProtocol's implementation. A generalized solution would require new APIs across many levels of Twisted.

Until then, we're stuck with another example of Hyrum's Law.

answered Oct 20 '22 17:10 by Mark Williams


You may need to inform the remote device that you wish to start an environment and allocate resources for the second layer before you start it up, if that device has the capabilities.

answered Oct 20 '22 19:10 by developer