I started with Haskell a while ago and am now focusing on networking. I followed some tutorials and source samples to put together a very simple echo server:

-- Imports (assumed; BS is presumably the lazy ByteString, given the streaming use of hGetContents)
import Control.Concurrent (forkIO)
import Network.Socket
import qualified Data.ByteString.Lazy as BS
import qualified System.IO as SIO

main = withSocketsDo $ do
    forkIO $ acceptor 8080
    print "Server running ... " >> getLine >>= print

tcpSock :: IO Socket
tcpSock = socket AF_INET Stream 0

acceptor :: PortNumber -> IO ()
acceptor port = do
    -- Setup server socket
    sock <- tcpSock
    setSocketOption sock ReuseAddr 1
    bindSocket sock (SockAddrInet port iNADDR_ANY)
    listen sock 50
    -- Start with zero index
    loop sock 0
  where
    loop sock sockId = do
        -- Accept socket
        (nextSock, addr) <- accept sock
        -- Setup the socket for performance
        (_, handle) <- setupClient nextSock
        -- Run client in own thread
        forkIO $ do
            -- Get a stream of bytes
            stream <- BS.hGetContents handle
            -- Echo the first received char
            BS.hPut handle $ BS.take 1 stream
            -- Kill the socket
            SIO.hClose handle
        -- Accept next client
        loop sock (sockId + 1)

setupClient :: Socket -> IO (Socket, SIO.Handle)
setupClient sock = do
    -- Disable nagle
    setSocketOption sock NoDelay 1
    -- Disable buffering
    hdl <- socketToHandle sock SIO.ReadWriteMode
    SIO.hSetBuffering hdl SIO.NoBuffering
    return (sock, hdl)

I tested the code with the ab tool (ApacheBench) to benchmark the server. The code is compiled with -O2 and -threaded, and the program is started with +RTS -N to use multiple OS threads.
The code creates a new lightweight thread per client. As far as I remember, these threads are pretty cheap because they are scheduled onto a small pool of real OS threads.
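
To illustrate how cheap these lightweight threads are, here is a minimal sketch that is separate from the server code. It forks many threads with forkIO and collects one result from each through an MVar. The name runWorkers and the thread count are made up for this illustration. It also prints the number of capabilities the runtime is using, which is a quick way to check whether +RTS -N took effect.

```haskell
import Control.Concurrent (forkIO, getNumCapabilities)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM)

-- Hypothetical helper: fork n lightweight threads, each of which
-- reports its own index through a dedicated MVar.
-- Returns the sum of all reported indices once every thread is done.
runWorkers :: Int -> IO Int
runWorkers n = do
  boxes <- forM [1 .. n] $ \i -> do
    box <- newEmptyMVar
    _ <- forkIO (putMVar box i)
    return box
  results <- mapM takeMVar boxes   -- blocks until each worker has reported
  return (sum results)

main :: IO ()
main = do
  caps <- getNumCapabilities       -- capabilities in use by the runtime
  total <- runWorkers 10000
  putStrLn ("capabilities: " ++ show caps)
  putStrLn ("sum of worker results: " ++ show total)
```

Running the same binary with and without +RTS -N should only change the reported capability count, not the sum.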
After running the tool, the results are:
ab -n 10000 -c 1000 http://localhost:8080/
~ 500 - 1600 req/sec
Yes, it really does vary between 500 and 1600!
At first I thought, well, not bad. Then I ran the program without "+RTS -N", and the results are almost always ~20000 req/sec.
Obviously the threading kills the performance pretty badly, but why? My guess is that the IO manager does a poor job when dealing with a lot of connections.
BTW: I use Ubuntu 13.04 and GHC 7.6. I also tested the code under Windows 8, which gave me far worse results, but I assume the IO manager is tuned for Linux, which would make sense.
Am I doing something really stupid here? I know the example is quite trivial, but something is obviously going wrong.
Regards, Chris
Okay, I think I semi-solved the problem, though I'm still not sure where the error is/was.
I'm now using the Network module (from the network package), so the accept routine is handle-based. I tried this because I noticed a memory leak after a couple of tests.
This magically solved two problems at once, because now the threading makes NO difference. I really don't know why this is happening, but the handle-based implementation is simpler and evidently faster and safer.
Maybe this helps other people experiencing the same problem.
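
For readers who want to try it, a handle-based version might look roughly like the sketch below. This is not the exact code from my tests, just a minimal reconstruction under some assumptions: it uses listenOn and accept from the legacy Network module, which only exists in versions of the network package before 3.0, and the helper names echoFirstByte and serve are made up here. It reads a single strict byte instead of a lazy stream, which is a simplification of the original.

```haskell
-- Sketch only: requires the legacy "network" package (< 3.0),
-- which still ships the high-level Network module.
import Control.Concurrent (forkIO)
import Control.Exception (finally)
import Control.Monad (forever, void)
import qualified Data.ByteString as BS
import Network (PortID (PortNumber), accept, listenOn, withSocketsDo)
import System.IO (BufferMode (NoBuffering), Handle, hClose, hSetBuffering)

-- Hypothetical helper: echo the first received byte back to the client.
echoFirstByte :: Handle -> IO ()
echoFirstByte h = do
  hSetBuffering h NoBuffering
  firstByte <- BS.hGet h 1   -- blocks until one byte arrives or EOF
  BS.hPut h firstByte

-- Hypothetical helper: accept clients forever, one lightweight
-- thread per connection. accept here already returns a Handle.
serve :: PortID -> IO ()
serve port = do
  sock <- listenOn port
  forever $ do
    (h, _host, _clientPort) <- accept sock
    -- Always close the handle, even if the client misbehaves.
    void $ forkIO (echoFirstByte h `finally` hClose h)

main :: IO ()
main = withSocketsDo $ serve (PortNumber 8080)
```

The main difference from the first version is that there is no manual socket, bindSocket, listen, or socketToHandle step. Whether that alone explains the performance difference is something I can't confirm.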