select on fds higher then 255 do not check if the fd is open. Here is my example code:
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <sys/select.h>
int main()
{
fd_set set;
for(int i = 5;i<FD_SETSIZE;i++)
{
printf("--> i is %d\n", i);
FD_ZERO(&set);
FD_SET(i, &set);
close(i);
int retval = select(FD_SETSIZE, &set, NULL, NULL, NULL);
if(-1 == retval)
{
perror("select");
}
}
}
This results in:
--> i is 5
select: Bad file descriptor
...
--> i is 255
select: Bad file descriptor
--> i is 256
Then the application blocks.
Why does this not create a EBADF
on 256 till FD_SETSIZE?
Requested Information from comments:
The result of prlimit
is:
NOFILE max number of open files 1024 1048576
This is the result of strace ./test_select
:
select(1024, [127], NULL, NULL, NULL) = -1 EBADF (Bad file descriptor)
dup(2) = 3
fcntl(3, F_GETFL) = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
fstat(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 2), ...}) = 0
write(3, "select: Bad file descriptor\n", 28select: Bad file descriptor
) = 28
close(3) = 0
write(1, "--> i is 128\n", 13--> i is 128
) = 13
close(128) = -1 EBADF (Bad file descriptor)
select(1024, [128], NULL, NULL, NULL
Debunking thoughts from the comments:
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <sys/select.h>
#include <fcntl.h>
int main()
{
char filename[80];
int fd;
for(int i = 5;i<500;i++)
{
snprintf(filename, 80, "/tmp/file%d", i);
fd = open(filename, O_RDWR | O_APPEND | O_CREAT);
}
printf("--> fd is %d, FD_SETSIZE is %d\n", fd, FD_SETSIZE);
fd_set set;
FD_ZERO(&set);
FD_SET(fd, &set);
int retval = select(FD_SETSIZE, NULL, &set, NULL, NULL);
if(-1 == retval)
{
perror("select");
}
}
Results in:
$ ./test_select
--> fd is 523, FD_SETSIZE is 1024
Process exits normally, no blocking.
The select function returns the total number of socket handles that are ready and contained in the fd_set structures, zero if the time limit expired, or SOCKET_ERROR if an error occurred. If the return value is SOCKET_ERROR, WSAGetLastError can be used to retrieve a specific error code.
The select() function tests file descriptors in the range of 0 to nfds-1. If the readfds argument is not a null pointer, it points to an object of type fd_set that on input specifies the file descriptors to be checked for being ready to read, and on output indicates which file descriptors are ready to read.
On success, select() and pselect() return the number of file descriptors contained in the three returned descriptor sets (that is, the total number of bits that are set in readfds, writefds, exceptfds).
The fd_set structure is used by various Windows Sockets functions and service providers, such as the select function, to place sockets into a "set" for various purposes, such as testing a given socket for readability using the readfds parameter of the select function.
Something very strange is going on here. You may have found a bug in the Linux kernel.
I modified your test program to make it more precise and also to not get stuck when it hits the problem:
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/select.h>
#include <sys/time.h>
int main(void)
{
fd_set set;
struct timeval tv;
int i;
for(i = 5; i < FD_SETSIZE; i++)
{
FD_ZERO(&set);
FD_SET(i, &set);
tv.tv_sec = 0;
tv.tv_usec = 1000;
close(i);
int retval = select(FD_SETSIZE, &set, 0, 0, &tv);
if (retval == -1 && errno == EBADF)
;
else
{
if (retval > 0)
printf("fd %d: select returned success (%d)\n", i, retval);
else if (retval == 0)
printf("fd %d: select timed out\n", i);
else
printf("fd %d: select failed (%d; %s)\n", i, retval, strerror(errno));
return 1;
}
}
return 0;
}
My understanding of POSIX says that, whatever FD_SETSIZE
is, this program should produce no output and exit successfully. And that is what it does on FreeBSD 11.1 and NetBSD 7.1 (both running on x86 processors of some description). But on Linux (x86-64, kernel 4.13), it prints
fd 256: select timed out
and exits unsuccessfully. Even stranger, if I run the same binary under strace
, that changes the output:
$ strace -o /dev/null ./a.out
fd 64: select timed out
The same thing happens if I run it under gdb
, even if I don't tell gdb
to do anything other than just run the program.
Reading symbols from ./a.out...done.
(gdb) r
Starting program: /tmp/a.out
fd 64: select timed out
[Inferior 1 (process 8209) exited with code 01]
So something is changing just because the process is subject to ptrace
monitoring. That can only be caused by the kernel.
I have filed a bug report on the Linux kernel and will report what they say about it.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With