I'm trying to understand and recreate a simple preforking server along the lines of unicorn where the server on start forks 4 processes which all wait (to accept) on the controlling socket.
The controlling socket @control_socket binds to 9799 and spawns 4 workers which wait to accept a connection. The work done on each worker is as follows:
def spawn_child
  fork do
    $STDOUT.puts "Forking child #{Process.pid}"
    loop do
      @client = @control_socket.accept
      loop do
        request = gets
        if request
          respond(@inner_app.call(request))
        else
          $STDOUT.puts("No Request")
          @client.close
        end
      end
    end
  end
end
I've used a very simple rack app which simply returns a string with the status code 200 and a Content-Type of text/html.
The problem I face is that my server works as it should when I read incoming requests (by hitting the URL "http://localhost:9799") using gets instead of something like read, readpartial, or read_nonblock. When I use non-blocking reads it never seems to raise the EOFError, which according to my understanding means it does not receive the EOF state.
This causes the read loop to never complete. Here is the code snippet which does this bit of work.
# Reads a file using IO.read_nonblock
# Returns end of file when using get but doesn't seem to return
# while using read_nonblock or readpartial
# The fact that the method is named gets is just bad naming, please ignore
def gets
  buffer = ""
  i = 0
  loop do
    puts "loop #{i}"
    i += 1
    begin
      buffer << @client.read_nonblock(READ_CHUNK)
      puts "buffer is #{buffer}"
    rescue Errno::EAGAIN => e
      puts "#{e.message}"
      puts "#{e.backtrace}"
      IO.select([@client])
      retry
    rescue EOFError
      $STDOUT.puts "-" * 50
      puts "request data is #{buffer}"
      $STDOUT.puts "-" * 50
      break
    end
  end
  puts "returning buffer"
  buffer
end
However, the code works perfectly if I use a simple gets instead of read or read_nonblock, or if I replace the IO.select([@client]) with a break.
Here is the version that works and returns the response. The reason I intend to use read_nonblock is that unicorn uses an equivalent from the kgio library, which implements a non-blocking read.
def gets
  @client.gets
end
The entire code is pasted next.
module Server
  class Prefork
    # line break
    CRLF = "\r\n"
    # number of workers process to fork
    CONCURRENCY = 4
    # size of each non_blocking read
    READ_CHUNK = 1024
    $STDOUT = STDOUT
    $STDOUT.sync

    # creates a control socket which listens to port 9799
    def initialize(port = 21)
      @control_socket = TCPServer.new(9799)
      puts "Starting server..."
      trap(:INT) {
        exit
      }
    end

    # Reads a file using IO.read_nonblock
    # Returns end of file when using get but doesn't seem to return
    # while using read_nonblock or readpartial
    def gets
      buffer = ""
      i = 0
      loop do
        puts "loop #{i}"
        i += 1
        begin
          buffer << @client.read_nonblock(READ_CHUNK)
          puts "buffer is #{buffer}"
        rescue Errno::EAGAIN => e
          puts "#{e.message}"
          puts "#{e.backtrace}"
          IO.select([@client])
          retry
        rescue EOFError
          $STDOUT.puts "-" * 50
          puts "request data is #{buffer}"
          $STDOUT.puts "-" * 50
          break
        end
      end
      puts "returning buffer"
      buffer
    end

    # responds with the data and closes the connection
    def respond(data)
      puts "request 2 Data is #{data.inspect}"
      status, headers, body = data
      puts "message is #{body}"
      buffer = "HTTP/1.1 #{status}\r\n" \
               "Date: #{Time.now.utc}\r\n" \
               "Status: #{status}\r\n" \
               "Connection: close\r\n"
      headers.each {|key, value| buffer << "#{key}: #{value}\r\n"}
      @client.write(buffer << CRLF)
      body.each {|chunk| @client.write(chunk)}
    ensure
      $STDOUT.puts "*" * 50
      $STDOUT.puts "Closing..."
      @client.respond_to?(:close) and @client.close
    end

    # The main method which triggers the creation of workers processes
    # The workers processes all wait to accept the socket on the same
    # control socket allowing the kernel to do the load balancing.
    #
    # Working with a dummy rack app which returns a simple text message
    # hence the config.ru file read.
    def run
      # copied from unicorn-4.2.1
      # refer unicorn.rb and lib/unicorn/http_server.rb
      raw_data = File.read("config.ru")
      app = "::Rack::Builder.new {\n#{raw_data}\n}.to_app"
      @inner_app = eval(app, TOPLEVEL_BINDING)
      child_pids = []
      CONCURRENCY.times do
        child_pids << spawn_child
      end
      trap(:INT) {
        child_pids.each do |cpid|
          begin
            Process.kill(:INT, cpid)
          rescue Errno::ESRCH
          end
        end
        exit
      }
      loop do
        pid = Process.wait
        puts "Process quit unexpectedly #{pid}"
        child_pids.delete(pid)
        child_pids << spawn_child
      end
    end

    # This is where the real work is done.
    def spawn_child
      fork do
        $STDOUT.puts "Forking child #{Process.pid}"
        loop do
          @client = @control_socket.accept
          loop do
            request = gets
            if request
              respond(@inner_app.call(request))
            else
              $STDOUT.puts("No Request")
              @client.close
            end
          end
        end
      end
    end
  end
end

p = Server::Prefork.new(9799)
p.run
Could somebody explain to me why the reads fail with 'readpartial', 'read_nonblock', or 'read'? I would really appreciate some help on this.
Thanks.
First I want to cover some basics. EOF means end of file: it is a signal delivered to the caller when no more data can be read from the data source. For example, you receive an EOF after reading an entire file, or when the IO stream is closed.
Then there are several differences between these 4 methods:
gets reads a line from the stream. In Ruby it uses $/ as the default line delimiter, but you can pass a different delimiter as an argument, which matters because the client and server may run on operating systems with different line endings. It is a blocking method: it blocks until it meets a line delimiter or EOF, and it returns nil when it receives an EOF, so gets never raises an EOFError.
read(length) reads length bytes from the stream. It is a blocking method. If length is omitted it blocks until EOF; if length is given it returns only once it has read that many bytes or met EOF. At EOF it returns an empty string when length is omitted and nil when a positive length is given, so read never raises an EOFError.
readpartial(maxlen) reads at most maxlen bytes from the stream. It returns whatever data is available immediately, a bit like an eager version of read. If the data is large you can use readpartial instead of read to avoid blocking until all of it has arrived, but it is still a blocking method: it blocks if no data is available yet. readpartial raises an EOFError when it receives an EOF.
read_nonblock(maxlen) is much like readpartial, but as the name says it is a non-blocking method: if no data is available it raises Errno::EAGAIN immediately, which means there is no data right now. You need to handle this error. Normally the Errno::EAGAIN rescue clause should call IO.select([conn]) first to avoid spinning through unnecessary cycles, since it blocks until conn becomes readable, and then retry. read_nonblock raises an EOFError when it receives an EOF.
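To make these differences concrete, here is a small sketch that uses a pipe as a stand-in for the socket. The helper names with_closed_pipe and raised are made up for this illustration and are not part of Ruby or of the question's code.

```ruby
# Hypothetical helper: feeds "hello\n" (6 bytes) into a pipe, closes the
# write end so the reader can reach EOF, and yields the read end.
def with_closed_pipe
  reader, writer = IO.pipe
  writer.write("hello\n")
  writer.close
  yield reader
ensure
  reader.close if reader && !reader.closed?
end

# Hypothetical helper: returns the class of the exception raised by the
# block, or nil if the block raised nothing.
def raised
  yield
  nil
rescue StandardError => e
  e.class
end

# In each case the first call consumes the data and the second call hits EOF.
gets_at_eof        = with_closed_pipe { |r| r.gets; r.gets }            # nil
read_all_at_eof    = with_closed_pipe { |r| r.read; r.read }            # ""
read_len_at_eof    = with_closed_pipe { |r| r.read(6); r.read(6) }      # nil
readpartial_at_eof = with_closed_pipe { |r| r.readpartial(6); raised { r.readpartial(6) } }     # EOFError
nonblock_at_eof    = with_closed_pipe { |r| r.read_nonblock(6); raised { r.read_nonblock(6) } } # EOFError

# An open but idle pipe: read_nonblock reports "no data yet", not EOF.
reader, writer = IO.pipe
idle = begin
  reader.read_nonblock(6)
rescue IO::WaitReadable # the Errno::EAGAIN raised here includes this module
  :would_block
end

# Only once the write end is closed does the reader see EOF.
writer.close
after_close = raised { reader.read_nonblock(6) } # EOFError
reader.close
```

The last part mirrors the situation in the question: as long as the other end stays open, read_nonblock keeps reporting "try again" and the EOFError branch is never reached.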
Now let's look at your example. As I see it, you are reading data by "hitting the URL", which is just an HTTP GET request, some text like "GET / HTTP/1.1\r\n". Connections are keep-alive by default in HTTP/1.1, so the client keeps the socket open after sending the request and readpartial or read_nonblock never receives an EOF. Either put a Connection: close header in your request, or change your gets method as below:
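def gets
  buffer = ""
  loop do
    if m = @client.gets
      buffer << m
      # a blank line marks the end of the request headers
      break if m.strip == ""
    else
      # nil means the client closed the connection (EOF)
      break
    end
  end
  buffer
end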
You can't use read here, because you don't know the exact length of the request: passing a large length or omitting it altogether will block.
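If you would rather keep read_nonblock, as the question intends, one option is to stop at the blank line that ends the HTTP header instead of waiting for an EOF. The following is only a sketch under assumptions: read_request_head, HEADER_END and the 5-second timeout are hypothetical names and values that do not appear in the original code, and UNIXSocket.pair merely stands in for a connection returned by accept.

```ruby
require "socket"

HEADER_END = "\r\n\r\n" # blank line that terminates an HTTP request header
READ_CHUNK = 1024       # same chunk size as in the question

# Hypothetical helper: reads from io until the header terminator has been
# seen, the peer closes the connection, or no data arrives within timeout.
def read_request_head(io, timeout = 5)
  buffer = +""
  until buffer.include?(HEADER_END)
    begin
      buffer << io.read_nonblock(READ_CHUNK)
    rescue IO::WaitReadable
      # Nothing to read yet: wait until the socket is readable, then retry.
      # IO.select returns nil on timeout, in which case we give up.
      break unless IO.select([io], nil, nil, timeout)
      retry
    rescue EOFError
      # The peer closed its end; return whatever was received so far.
      break
    end
  end
  buffer
end

# Usage: the client sends a request and keeps the connection open,
# just like a keep-alive browser would.
client, server = UNIXSocket.pair
client.write("GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
head = read_request_head(server)

# A peer that closes early still terminates the loop through EOFError.
client2, server2 = UNIXSocket.pair
client2.write("partial")
client2.close
partial = read_request_head(server2)
```

This only delimits the request header. A request with a body (a POST, for example) would additionally need its Content-Length handled, and the buffer may contain bytes past the terminator if the client sends more data in the same chunk.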