
Why does client.recv(1024) return an empty byte literal in this bare-bones WebSocket Server implementation?

I need a WebSocket client-server exchange between Python and JavaScript on an air-gapped network, so I'm limited to what I can read and type up (believe me, I'd love to be able to run pip install websockets). Here's a bare-bones RFC 6455 WebSocket client-server pair in Python and JavaScript. Below the code, I'll pinpoint a specific issue: client.recv(1024) returns an empty byte literal, which causes the WebSocket server implementation to abort the connection.

Client:

<script>
    const message = { 
        name: "ping",
        data: 0
    }
    const socket = new WebSocket("ws://localhost:8000")
    socket.addEventListener("open", (event) => {
        console.log("socket connected to server")
        socket.send(JSON.stringify(message))
    })
    socket.addEventListener("message", (event) => {
        // the payload is on event.data; the event object itself is not a JSON string
        console.log("message from socket server:", JSON.parse(event.data))
    })
</script>

Server, found here (minimal implementation of RFC 6455):

import array
import time
import socket
import hashlib
import sys
from select import select
import re
import logging
from threading import Thread
import signal
from base64 import b64encode

class WebSocket(object):
    handshake = (
        "HTTP/1.1 101 Web Socket Protocol Handshake\r\n"
        "Upgrade: WebSocket\r\n"
        "Connection: Upgrade\r\n"
        "WebSocket-Origin: %(origin)s\r\n"
        "WebSocket-Location: ws://%(bind)s:%(port)s/\r\n"
        "Sec-Websocket-Accept: %(accept)s\r\n"
        "Sec-Websocket-Origin: %(origin)s\r\n"
        "Sec-Websocket-Location: ws://%(bind)s:%(port)s/\r\n"
        "\r\n"
    )
    def __init__(self, client, server):
        self.client = client
        self.server = server
        self.handshaken = False
        self.header = ""
        self.data = ""

    def feed(self, data):
        if not self.handshaken:
            self.header += str(data)
            if self.header.find('\\r\\n\\r\\n') != -1:
                parts = self.header.split('\\r\\n\\r\\n', 1)
                self.header = parts[0]
                if self.dohandshake(self.header, parts[1]):
                    logging.info("Handshake successful")
                    self.handshaken = True
        else:
            self.data += data.decode("utf-8", "ignore")
            playloadData = data[6:]
            mask = data[2:6]
            unmasked = array.array("B", playloadData)
            for i in range(len(playloadData)):
                unmasked[i] = unmasked[i] ^ mask[i % 4]
            self.onmessage(bytes(unmasked).decode("utf-8", "ignore"))

    def dohandshake(self, header, key=None):
        logging.debug("Begin handshake: %s" % header)
        digitRe = re.compile(r'[^0-9]')
        spacesRe = re.compile(r'\s')
        part = part_1 = part_2 = origin = None
        for line in header.split('\\r\\n')[1:]:
            name, value = line.split(': ', 1)
            if name.lower() == "sec-websocket-key1":
                key_number_1 = int(digitRe.sub('', value))
                spaces_1 = len(spacesRe.findall(value))
                if spaces_1 == 0:
                    return False
                if key_number_1 % spaces_1 != 0:
                    return False
                part_1 = key_number_1 / spaces_1
            elif name.lower() == "sec-websocket-key2":
                key_number_2 = int(digitRe.sub('', value))
                spaces_2 = len(spacesRe.findall(value))
                if spaces_2 == 0:
                    return False
                if key_number_2 % spaces_2 != 0:
                    return False
                part_2 = key_number_2 / spaces_2
            elif name.lower() == "sec-websocket-key":
                part = bytes(value, 'UTF-8')
            elif name.lower() == "origin":
                origin = value
        if part:
            sha1 = hashlib.sha1()
            sha1.update(part)
            sha1.update("258EAFA5-E914-47DA-95CA-C5AB0DC85B11".encode('utf-8'))
            accept = (b64encode(sha1.digest())).decode("utf-8", "ignore")
            handshake = WebSocket.handshake % {
                'accept': accept,
                'origin': origin,
                'port': self.server.port,
                'bind': self.server.bind
            }
            #handshake += response
        else:
            logging.warning("Not using challenge + response")
            handshake = WebSocket.handshake % {
                'origin': origin,
                'port': self.server.port,
                'bind': self.server.bind
            }
        logging.debug("Sending handshake %s" % handshake)
        self.client.send(bytes(handshake, 'UTF-8'))
        return True

    def onmessage(self, data):
        logging.info("Got message: %s" % data)

    def send(self, data):
        logging.info("Sent message: %s" % data)
        self.client.send("\x00%s\xff" % data)

    def close(self):
        self.client.close()

class WebSocketServer(object):
    def __init__(self, bind, port, cls):
        self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        self.socket.bind((bind, port))
        self.bind = bind
        self.port = port
        self.cls = cls
        self.connections = {}
        self.listeners = [self.socket]

    def listen(self, backlog=5):
        self.socket.listen(backlog)
        logging.info("Listening on %s" % self.port)
        self.running = True
        while self.running:
            # upon first connection rList = [784] and the other two are empty
            rList, wList, xList = select(self.listeners, [], self.listeners, 1)
            for ready in rList:
                if ready == self.socket:
                    logging.debug("New client connection")
                    client, address = self.socket.accept()
                    fileno = client.fileno()
                    self.listeners.append(fileno)
                    self.connections[fileno] = self.cls(client, self)
                else:
                    logging.debug("Client ready for reading %s" % ready)
                    client = self.connections[ready].client
                    data = client.recv(1024) # currently, this results in: b''
                    fileno = client.fileno()
                    if data: # data = b''
                        self.connections[fileno].feed(data)
                    else:
                        logging.debug("Closing client %s" % ready)
                        self.connections[fileno].close()
                        del self.connections[fileno]
                        self.listeners.remove(ready)
            for failed in xList:
                if failed == self.socket:
                    logging.error("Socket broke")
                    for fileno, conn in self.connections.items():
                        conn.close()
                    self.running = False

if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG, 
        format="%(asctime)s - %(levelname)s - %(message)s")
    server = WebSocketServer("localhost", 8000, WebSocket)
    server_thread = Thread(target=server.listen, args=[5])
    server_thread.start()
    # Add SIGINT handler for killing the threads
    def signal_handler(signal, frame):
        logging.info("Caught Ctrl+C, shutting down...")
        server.running = False
        sys.exit()
    signal.signal(signal.SIGINT, signal_handler)
    while True:
        time.sleep(100)

server side logs:

INFO - Handshake successful
DEBUG - Client ready for reading 664
DEBUG - Closing client 664

and on the client side I get

WebSocket connection to 'ws://localhost:8000' failed: Unknown Reason

The problem is traced here:

if data:
    self.connections[fileno].feed(data)
else: # this is being triggered on the server side 
    logging.debug("Closing client %s" % ready)

Researching this, I found a potential problem in the Python documentation for select, which is used to retrieve rlist, wlist and xlist:

select.select(rlist, wlist, xlist[, timeout]) This is a straightforward interface to the Unix select() system call. The first three arguments are iterables of ‘waitable objects’: either integers representing file descriptors or objects with a parameterless method named fileno() returning such an integer:

rlist: wait until ready for reading

wlist: wait until ready for writing

xlist: wait for an “exceptional condition” (see the manual page for what your system considers such a condition)

Seeing that the feature is based on the Unix system call, I realized this code might not support Windows, which is my environment. I checked the values of rlist, wlist, xlist and found that on the first iteration rList = [784] (or another number, such as 664) while the other two lists are empty, after which the connection is closed.

The documentation goes on to note:

Note: File objects on Windows are not acceptable, but sockets are. On Windows, the underlying select() function is provided by the WinSock library, and does not handle file descriptors that don’t originate from WinSock.

But I'm not clear on the exact meaning of this.

So in the code logic, I did some logging and traced the issue here:

rList, wList, xList = select(self.listeners, [], self.listeners, 1)
    for ready in rList: # rList = [836] or some other number
        # and then we check if ready (so the 836 int) == self.socket
        # but if we log self.socket we get this:
        # <socket.socket fd=772, family=AddressFamily.AF_INET, 
        # type=SocketKind.SOCK_STREAM, proto=0, laddr=('127.0.0.1', 8000)>
        # so of course an integer isn't going to be equivalent to that
        if ready == self.socket:
            logging.debug("New client connection")
            #so lets skip this code and see what the other condition does
        else:
            logging.debug("Client ready for reading %s" % ready)
            client = self.connections[ready].client
            data = client.recv(1024) # currently, this results in: b''
            fileno = client.fileno()
            if data: # data = b'', so this is handled as falsy
                self.connections[fileno].feed(data)
            else:
                logging.debug("Closing client %s" % ready)
            

As to why client.recv(1024) returns an empty byte string, I have no idea. I don't know if rList was supposed to contain more than an integer, or if the protocol is working as intended up until recv.

Can anyone explain what's causing the broken .recv call here? Is the client side JavaScript WebSocket protocol not sending whatever data should be expected? Or is the WebSocket Server at fault, and what's wrong with it?

J.Todd asked Sep 16 '20 12:09



1 Answer

I tried running your example and it seems to work as expected. At least, the server logs end with the following line:

INFO - Got message: {"name":"ping","data":0}

My environment:

  • OS: Arch Linux;
  • WebSocket client: Chromium/85.0.4183.121 running the JS-code you provided;
  • WebSocket server: Python/3.8.5 running the Python code you provided;

select.select docstring indeed states that

On Windows, only sockets are supported

but most likely the OS is irrelevant since the server code uses only sockets as select.select arguments.
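As a rough illustration of why the integer in rList is not a problem by itself, here is a minimal sketch showing that select.select hands back the same objects it was given: a socket object if you passed a socket, an int if you passed a descriptor number. The loopback address, the ephemeral port and the variable names are assumptions made for this example, not part of the server code.

```python
import select
import socket

# Listening socket on an ephemeral port (port 0 lets the OS pick one).
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)

# A client connects, which makes the listening socket readable.
client = socket.create_connection(listener.getsockname())

readable, _, _ = select.select([listener], [], [], 1)
# The listener was passed as a socket object, so it comes back as that object.
listener_is_same_object = readable[0] is listener

# Accept the connection and watch it by descriptor number, as the server does.
conn, _ = listener.accept()
client.sendall(b"hello")
readable, _, _ = select.select([conn.fileno()], [], [], 1)
# The client was passed as an int, so it comes back as that int.
client_ready_as_int = readable == [conn.fileno()]

print(listener_is_same_object, client_ready_as_int)

for s in (client, conn, listener):
    s.close()
```

So the `ready == self.socket` comparison matches for the listening socket and falls through to the `else` branch for client descriptors, which is the intended flow.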

recv returns an empty byte string when the peer has closed its end of the connection. From the recv(3) man page:

If no messages are available to be received and the peer has performed an orderly shutdown, recv() shall return 0.
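This behaviour is easy to reproduce in isolation. The sketch below uses socket.socketpair() to get two connected sockets in one process; it is only a demonstration of the recv semantics, not a reconstruction of what happens between your browser and the server.

```python
import socket

# socketpair() returns two sockets that are already connected to each other.
a, b = socket.socketpair()

a.sendall(b"last words")
a.close()  # orderly shutdown of one end

first = b.recv(1024)   # data sent before the close is still delivered
second = b.recv(1024)  # nothing left and the peer is gone: empty bytes
b.close()

print(first, second)
```

An empty result therefore says "the other side hung up", not "the other side sent an empty message".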

An interesting thing is a message about a successful handshake in server logs you got:

INFO - Handshake successful

It means that in your case the connection between the client and the server was established and some data flowed in both directions. After that, the socket got closed. Looking at the server code, I see no reason for the server to stop the connection, so I assume that the client you are using is to blame.
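One thing worth checking in a capture is whether the Sec-WebSocket-Accept value the server sends matches what the client expects, since a client will drop the connection on a mismatch. A minimal sketch of the RFC 6455 computation follows; the function name expected_accept is made up for this example, and the key shown is the sample key from the RFC, not one taken from your traffic.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 (the same string the server code uses).
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(sec_websocket_key: str) -> str:
    """Return the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key.strip() + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455, section 1.3.
print(expected_accept("dGhlIHNhbXBsZSBub25jZQ=="))
```

Feeding it the key from the browser's request and comparing the result with the server's response header is a quick way to rule the handshake in or out.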

To find out exactly what is going wrong, try intercepting the network traffic using tcpdump or Wireshark and running the following Python WebSocket client script, which reproduces the actions my browser performed when I was testing:

import socket

SERVER = ("localhost", 8000)
HANDSHAKE = (
    b"GET /chat HTTP/1.1\r\n"
    b"Host: server.example.com\r\n"
    b"Upgrade: websocket\r\n"
    b"Connection: Upgrade\r\n"
    b"Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==\r\n"
    b"Sec-WebSocket-Protocol: chat, superchat\r\n"
    b"Sec-WebSocket-Version: 13\r\n"
    b"Origin: http://example.com\r\n"
    # a single blank line terminates the HTTP headers
    b"\r\n"
)
# a frame with `{"name":"ping","data":0}` payload
MESSAGE = b"\x81\x983\x81\xde\x04H\xa3\xb0e^\xe4\xfc>\x11\xf1\xb7jT\xa3\xf2&W\xe0\xaae\x11\xbb\xeey"

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.connect(SERVER)

    n = s.send(HANDSHAKE)
    assert n != 0

    data = s.recv(1024)
    print(data.decode())

    n = s.send(MESSAGE)
    assert n != 0
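If you want to check by hand what a captured client frame contains, the MESSAGE bytes above can be decoded with a few lines of Python. This is a sketch that assumes a single, unfragmented, masked text frame; decode_client_frame is a hypothetical helper written for this answer, not something from the server code.

```python
def decode_client_frame(frame: bytes) -> str:
    """Decode one unfragmented, masked text frame sent by a WebSocket client."""
    opcode = frame[0] & 0x0F          # low 4 bits of byte 0
    masked = bool(frame[1] & 0x80)    # high bit of byte 1
    length = frame[1] & 0x7F          # low 7 bits of byte 1
    offset = 2
    if length == 126:                 # 16-bit extended payload length
        length = int.from_bytes(frame[offset:offset + 2], "big")
        offset += 2
    elif length == 127:               # 64-bit extended payload length
        length = int.from_bytes(frame[offset:offset + 8], "big")
        offset += 8
    if opcode != 0x1 or not masked:
        raise ValueError("expected a masked text frame")
    mask = frame[offset:offset + 4]
    payload = frame[offset + 4:offset + 4 + length]
    # Each payload byte is XORed with the mask byte at the same position mod 4.
    return bytes(b ^ mask[i % 4] for i, b in enumerate(payload)).decode("utf-8")


MESSAGE = b"\x81\x983\x81\xde\x04H\xa3\xb0e^\xe4\xfc>\x11\xf1\xb7jT\xa3\xf2&W\xe0\xaae\x11\xbb\xeey"
print(decode_client_frame(MESSAGE))
```

Unlike the feed method in the question, this handles the extended length fields, so it does not assume the mask always sits at bytes 2 to 5.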
rnovatorov answered Sep 19 '22 09:09