poll() in Ruby?

I am currently porting a self-written network application from C++ to Ruby. This application often needs to manage around 10,000 sockets at the same time, so it needs quick access to any sockets that have readable data available, incoming connections, etc.

I already found while writing it in C++ that select() does not work for this case, because internally it uses 32 DWORDs (128 bytes) as bitmasks to manage at most 1024 sockets. Since I sometimes have to work with more than 10,000 sockets, this function did not suffice. Thus I had to switch to poll(), which also made the code more elegant because I did not have to add and remove all the file descriptors again on every call.

According to the Ruby documentation, Ruby offers IO.select(), which as far as I know is basically a wrapper around the C API. Unfortunately, there seems to be no IO.poll(), which I would need for this particular application.

Does IO.select() have the same limitations as select() on Winsock and Berkeley sockets? If yes, is there a way to work around that?

Patrick Glandien asked Dec 30 '25 07:12


2 Answers

select cannot safely be used by programs that have more than 1024 file descriptors on a Linux system. This is because the underlying fd_set that the select system call uses is a fixed-size buffer, i.e. its size is set at compile time, not at run time.

From man 2 select:

int select(int nfds, fd_set *readfds, fd_set *writefds,
           fd_set *exceptfds, struct timeval *timeout);

An fd_set is a fixed size buffer. Executing FD_CLR() or FD_SET() with a value of fd that is negative or is equal to or larger than FD_SETSIZE will result in undefined behavior. Moreover, POSIX requires fd to be a valid file descriptor.

This means that if your program has more than 1024 file descriptors open and passes one numbered 1024 or higher to the select system call, you will end up with memory corruption.
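To make the arithmetic concrete, here is a minimal sketch in Ruby that models which word and bit of an fd_set a given descriptor would map to. It is only an illustration: FD_SETSIZE = 1024 and 32-bit words are assumptions matching the figures quoted in this thread, and fd_slot and fits_in_fd_set? are hypothetical helper names, not part of any Ruby or C API.

```ruby
# Illustrative model of a select()-style fd_set: a fixed array of 32-bit words.
# FD_SETSIZE = 1024 is an assumption (the usual Linux value); platforms differ.
FD_SETSIZE = 1024
BITS_PER_WORD = 32
WORDS = FD_SETSIZE / BITS_PER_WORD # 32 words, i.e. 128 bytes

# Hypothetical helper: the [word_index, bit_index] that setting `fd` would touch.
def fd_slot(fd)
  [fd / BITS_PER_WORD, fd % BITS_PER_WORD]
end

# Hypothetical helper: true when the descriptor fits inside the fixed buffer.
def fits_in_fd_set?(fd)
  fd >= 0 && fd < FD_SETSIZE
end

[0, 1023, 1024, 10_000].each do |fd|
  word, bit = fd_slot(fd)
  status = fits_in_fd_set?(fd) ? "inside the buffer" : "past the end of the #{WORDS}-word buffer"
  puts "fd #{fd}: word #{word}, bit #{bit} (#{status})"
end
```

Descriptor 1024 already maps to word 32, one past the last valid word (index 31), which is where the out-of-bounds write would land in C.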

If you want to use more than 1024 file descriptors in your program, you must use poll or epoll, and ensure that you never use select, or you will get random memory corruption. Changing the size of the file descriptor table through ulimit is very dangerous if you are using select. Don't do it.

Ruby's select does seem to be actually implemented with the select system call, so while it may look like increasing ulimit works, under the hood corruption is happening: https://github.com/ruby/ruby/blob/trunk/thread.c

Furthermore, some unrelated APIs in Ruby seem to use select (see thread_pthread.c), so it's probably also unsafe to use those, or any code that uses those APIs, within a Ruby program running with a file descriptor table larger than 1024.

catphive answered Jan 01 '26 23:01


The limits on IO.select(), and in fact on the number of open connections you can have per process, appear to be determined primarily by the underlying operating system. There is definitely no fixed 1024-socket limit.

For example, under WinXP, I max out at 69 open sockets (even before I get to select). I'm sure that is tunable; I just don't know how.

Under Linux, the limitation is the number of open files allowed. By default, the limit is usually 1024 (run ulimit -a to check).
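The same limit can also be read from inside Ruby. The following is a small sketch, assuming a Unix-like system where Process.getrlimit is available (it is not implemented on Windows); FD_SETSIZE_ASSUMED is a hypothetical constant introduced here only for the comparison.

```ruby
# Process.getrlimit returns [soft_limit, hard_limit] for the given resource.
soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
puts "open-file limit: soft=#{soft}, hard=#{hard}"

# Assumption: 1024 is the usual FD_SETSIZE on Linux; it is not queried here.
FD_SETSIZE_ASSUMED = 1024
if soft > FD_SETSIZE_ASSUMED
  puts "descriptors numbered #{FD_SETSIZE_ASSUMED} or higher are possible in this process"
else
  puts "all descriptors in this process stay below #{FD_SETSIZE_ASSUMED}"
end
```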

However, you can easily change this, e.g. with ulimit -n 10000. I just ran a test and happily went well over 1024 active sockets created with TCPSocket.new, using IO.select to test for ready data.

NB: there is a good example of IO.select usage in this GServer article.
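For reference, a scaled-down version of that kind of test might look like the sketch below. It is not the original test code: it uses only three loopback connections, and the variable names and the 2-second timeout are arbitrary choices for illustration.

```ruby
require "socket"

# Listen on an ephemeral loopback port (port 0 lets the OS pick one).
server = TCPServer.new("127.0.0.1", 0)
port = server.addr[1]

# Open a few client connections and accept each one on the server side.
clients = []
accepted = []
3.times do
  clients << TCPSocket.new("127.0.0.1", port)
  accepted << server.accept
end

# Only one client sends data, so only one accepted socket becomes readable.
clients[1].write("hello")

# IO.select returns [readable, writable, errored], or nil if the timeout expires.
readable, = IO.select(accepted, nil, nil, 2)
readable ||= []
received = readable.map { |sock| sock.readpartial(1024) }
puts "ready sockets: #{readable.size}, data: #{received.inspect}"

(clients + accepted + [server]).each(&:close)
```

Scaling the loop up past 1024 connections is what exercises the limit discussed in this thread, and it also requires raising the open-file limit first.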

tardate answered Jan 01 '26 22:01


