
Simple HTTP server in Ruby using TCPServer

For a school assignment, I am trying to create a simple HTTP server using Ruby and the sockets library.

Right now, I can get it to respond to any connection with a simple hello:

require 'socket'

server = TCPServer.open 2000
puts "Listening on port 2000"

loop {
  client = server.accept()
  resp = "Hello?"
  headers = ["HTTP/1.1 200 OK",
             "Date: Tue, 14 Dec 2010 10:48:45 GMT",
             "Server: Ruby",
             "Content-Type: text/html; charset=iso-8859-1",
             "Content-Length: #{resp.length}\r\n\r\n"].join("\r\n")
  client.puts headers
  client.puts resp
  client.close
}

This works as expected. However, when I have the server tell me who just connected with

puts "Client: #{client.addr[2]}"

and use Chromium (browser) to connect to localhost:2000/ (just once), I get:

Client: 127.0.0.1
Client: 127.0.0.1
Client: 127.0.0.1
Client: 127.0.0.1

I assume this is Chromium requesting auxiliary files, like favicon.ico, and not my script doing something weird, so I wanted to investigate the incoming request. I replaced the resp = "Hello?" line with

resp = client.read()

and restarted the server. I sent the request again in Chromium, and instead of coming back right away, it just hung. Meanwhile, I got the output Client: 127.0.0.1 in my server output. I hit the "stop" button in Chromium, and then the server crashed with

server.rb:16:in `write': Broken pipe (Errno::EPIPE)
    from server.rb:16:in `puts'
    from server.rb:16:in `block in <main>'
    from server.rb:6:in `loop'
    from server.rb:6:in `<main>'

Obviously, I'm doing something wrong, as the expected behavior was sending the incoming request back as the response.

What am I missing?

Asked by Austin Hyde, Sep 24 '11 15:09



1 Answer

I don't really know about Chrome and the four connections, but I'll try to answer your question about how to read the request properly.

First of all, IO#read won't work in this case. According to the documentation, read without any arguments reads until it encounters EOF, and on a socket EOF only arrives when the other side closes the connection. The browser keeps the connection open while it waits for your response, so that never happens. A socket is an open-ended stream: from its point of view there is no "entire" message, so you can't use read to pull in the whole request. You could call read with a length, like read(100), but that will block as soon as fewer than that many bytes are left to read.
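To see the EOF behaviour in isolation, here is a minimal sketch. It uses UNIXSocket.pair as a stand-in for the browser and the server (my assumption, not part of the original script, and it needs a Unix-like system). read only returns once the writing side signals EOF, which close_write does here:

```ruby
require 'socket'

# A connected pair of sockets standing in for browser and server (assumption).
browser, server_side = UNIXSocket.pair

browser.write "GET / HTTP/1.1\r\nHost: localhost\r\n\r\n"

# Calling server_side.read here would block: the data has arrived, but the
# browser side is still open, so there is no EOF yet.
browser.close_write            # half-close: signals EOF to the reader

request = server_side.read     # returns now, with everything that was sent
puts request.lines.first

browser.close
server_side.close
```

A real browser does not half-close after sending a request, which is why the server in the question hangs.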

Basically, reading a socket is very different from reading a file. Data arrives on a socket asynchronously, completely independent of when you try to read it. If you request 10 bytes, it's possible that, at this point in the code, only 5 bytes are available. With blocking IO, the read(10) call will then hang and wait until 5 more bytes arrive, or until the connection is closed. This means that if you repeatedly read chunks of 10 bytes, the call will still hang at some point. Another way to read a socket is using non-blocking IO, but that's not very important in your case, and it's a long topic by itself.
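For the curious, a small sketch of the difference, again using UNIXSocket.pair as a stand-in for a real connection (an assumption for illustration). readpartial and read_nonblock are standard IO methods; the variable names are made up:

```ruby
require 'socket'

a, b = UNIXSocket.pair
a.write "hello"                    # only 5 bytes are available

# readpartial returns as soon as any data is available, up to the given maximum.
chunk = b.readpartial(10)          # the 5 bytes, without waiting for 10

# read_nonblock never waits: with nothing buffered it raises IO::WaitReadable.
begin
  b.read_nonblock(10)
  would_block = false
rescue IO::WaitReadable
  would_block = true
end

a.close
rest = b.read(10)                  # peer closed, nothing left: read returns nil
b.close
```

A plain b.read(10) in place of readpartial would have waited for 5 more bytes or for the connection to close.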

So here's an example of how you might access the data by using blocking IO:

loop {
  client = server.accept

  while line = client.gets
    puts line.chomp
    break if line =~ /^\s*$/
  end

  # rest of loop ...
}

The gets method reads from the socket until it encounters a newline. An HTTP request is guaranteed to contain one at some point, so even if the message arrives piece by piece, gets will return a single complete line of the input. The line.chomp call strips the trailing line terminator ("\r\n" in HTTP) if it's present. If the line read is empty, the HTTP headers have all been transferred and we can safely break out of the loop (you can put that check in the while condition, of course). This dumps the request to the console the server was started from. If you really want to send it back to the browser, the idea is the same; you just need to handle the lines differently:
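The same loop can be wrapped in a small helper, which also makes it easy to see which path each connection asks for. This is only a sketch: read_request_head is a hypothetical name, and StringIO stands in for the client socket so the example runs on its own:

```ruby
require 'stringio'

# Hypothetical helper: reads lines from any IO-like object until the blank
# line that ends the HTTP request head, and returns them without terminators.
def read_request_head(io)
  lines = []
  while (line = io.gets)
    line = line.chomp              # strips "\r\n" as well as "\n"
    break if line.empty?
    lines << line
  end
  lines
end

# StringIO stands in for the client socket here (assumption).
io = StringIO.new("GET /favicon.ico HTTP/1.1\r\nHost: localhost:2000\r\n\r\n")
head = read_request_head(io)

# The first line is the request line: method, path, protocol version.
verb, path, version = head.first.split(' ', 3)
puts "#{verb} #{path}"
```

In the server, you would pass client instead of the StringIO and print the path for every accepted connection.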

loop {
  client = server.accept

  lines = []
  while line = client.gets and line !~ /^\s*$/
    lines << line.chomp
  end

  resp = lines.join("<br />")
  headers = ["HTTP/1.1 200 OK",
             "Date: Tue, 14 Dec 2010 10:48:45 GMT",
             "Server: Ruby",
             "Content-Type: text/html; charset=iso-8859-1",
             "Content-Length: #{resp.bytesize}\r\n\r\n"].join("\r\n")
  client.print headers         # status line and headers, ending with a blank line
  client.print resp            # body; print adds no newline, so Content-Length matches
  client.close
}

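As for the broken pipe: when you press "stop", the browser closes the connection. That gives read the EOF it was waiting for, so it returns, and the server then tries to write the response to a socket the browser has already closed. That write is what raises Errno::EPIPE, as the write frame at the top of the backtrace shows.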
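One way to keep the server running when a client disconnects early is to rescue the error around the write. The following is a sketch under assumptions: send_response is a hypothetical helper, the headers are trimmed to a minimum, and UNIXSocket.pair again stands in for a real connection:

```ruby
require 'socket'

# Hypothetical helper: sends a minimal response and reports whether it was
# delivered. The socket is closed either way.
def send_response(client, body)
  client.write "HTTP/1.1 200 OK\r\n" \
               "Content-Type: text/plain\r\n" \
               "Content-Length: #{body.bytesize}\r\n" \
               "Connection: close\r\n\r\n"
  client.write body
  true
rescue Errno::EPIPE, Errno::ECONNRESET
  # The client went away before we finished writing; give up on this one.
  false
ensure
  client.close unless client.closed?
end

# Stand-in for an accepted client connection (assumption).
a, b = UNIXSocket.pair
delivered = send_response(a, "hi")
received = b.read                  # returns because send_response closed a
b.close
```

Inside the accept loop, a false return value simply means moving on to the next connection instead of crashing.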

Answered by Andrew Radev, Sep 30 '22 20:09
