Are there any platforms where using structure copy on an fd_set (for select() or pselect()) causes problems?

Tags: c, linux, unix, posix

The select() and pselect() system calls modify their arguments (the 'fd_set *' arguments), so the input value tells the system which file descriptors to check and the return values tell the programmer which file descriptors are currently usable.

If you are going to call them repeatedly for the same set of file descriptors, you need to ensure that you have a fresh copy of the descriptors for each call. The obvious way to do that is to use a structure copy:

fd_set ref_set_rd;
fd_set ref_set_wr;
fd_set ref_set_er;
...
...code to set the reference fd_set_xx values...
...
while (!done)
{
    fd_set act_set_rd = ref_set_rd;
    fd_set act_set_wr = ref_set_wr;
    fd_set act_set_er = ref_set_er;
    int bits_set = select(max_fd, &act_set_rd, &act_set_wr,
                          &act_set_er, &timeout);
    if (bits_set > 0)
    {
        ...process the output values of act_set_xx...
    }
}

(Edited to remove incorrect struct fd_set references - as pointed out by 'R..'.)
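For anyone who wants to try the pattern on their own platform, here is a minimal sketch of the structure-copy loop body, using two pipes as stand-in descriptors. The names wait_readable and demo_structure_copy are made up for illustration, only the read set is handled, and the one-second timeout is an arbitrary choice.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: wait for any descriptor in *ref_set to become
 * readable. The reference set is copied by plain assignment, so select()
 * only overwrites the working copy. Returns select()'s result and, on
 * success, stores the ready descriptors in *ready. */
static int wait_readable(const fd_set *ref_set, int max_fd, long timeout_ms,
                         fd_set *ready)
{
    assert(max_fd >= 0 && max_fd < FD_SETSIZE);

    fd_set act_set = *ref_set;      /* the structure copy under discussion */
    struct timeval tv;              /* rebuilt per call: select() may modify it */
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* The first argument is the highest descriptor number plus one. */
    int n = select(max_fd + 1, &act_set, NULL, NULL, &tv);
    if (n >= 0)
        *ready = act_set;
    return n;
}

/* Hypothetical demo: returns 1 if only the written-to pipe is reported
 * ready and the reference set still holds both descriptors afterwards. */
static int demo_structure_copy(void)
{
    int idle[2], busy[2];
    if (pipe(idle) != 0)
        return -1;
    if (pipe(busy) != 0) {
        close(idle[0]);
        close(idle[1]);
        return -1;
    }

    fd_set ref_set;
    FD_ZERO(&ref_set);
    FD_SET(idle[0], &ref_set);
    FD_SET(busy[0], &ref_set);
    int max_fd = idle[0] > busy[0] ? idle[0] : busy[0];

    int wrote = (write(busy[1], "x", 1) == 1);

    fd_set ready;
    FD_ZERO(&ready);
    int n = wait_readable(&ref_set, max_fd, 1000, &ready);

    int ok = wrote && n == 1
             && FD_ISSET(busy[0], &ready) && !FD_ISSET(idle[0], &ready)
             && FD_ISSET(idle[0], &ref_set) && FD_ISSET(busy[0], &ref_set);

    close(idle[0]); close(idle[1]);
    close(busy[0]); close(busy[1]);
    return ok;
}
```

On a system where plain assignment of an fd_set behaves as expected, demo_structure_copy() should return 1. That is only evidence for the platform it runs on, not a portability guarantee.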

My question:

Are there any platforms where it is not safe to do a structure copy of the fd_set values as shown?

I'm concerned lest there be hidden memory allocation or anything unexpected like that. (There are macros/functions FD_SET(), FD_CLR(), FD_ZERO() and FD_ISSET() to mask the internals from the application.)

I can see that MacOS X (Darwin) is safe, so other BSD-based systems are likely to be safe too. You can help by documenting in your answers any other systems that you know are safe.

(I do have minor concerns about how well the fd_set would work with more than 8192 open file descriptors - the default maximum number of open files is only 256, but the maximum number is 'unlimited'. Also, since the structures are 1 KB, the copying code is not dreadfully efficient, but then running through a list of file descriptors to recreate the input mask on each cycle is not necessarily efficient either. Maybe you can't do select() when you have that many file descriptors open, though that is when you are most likely to need the functionality.)

There's a related SO question asking about 'poll() vs select()', which addresses a different set of issues from this question.

Note that on MacOS X - and presumably BSD more generally - there is an FD_COPY() macro or function, with the effective prototype:

extern void FD_COPY(const fd_set *restrict from, fd_set *restrict to);

It might be worth emulating on platforms where it is not already available.


asked Mar 11 '10 00:03

Jonathan Leffler

3 Answers

Since struct fd_set is just a regular C structure, that should always be fine. I personally don't like doing structure copying via the = operator, since I've worked on plenty of platforms that didn't have access to the normal set of compiler intrinsics. Using memcpy() explicitly rather than having the compiler insert a function call is a better way to go, in my book.

From the C spec, section 6.5.16.1 Simple assignment (edited here for brevity):

One of the following shall hold:

...

the left operand has a qualified or unqualified version of a structure or union type compatible with the type of the right;

...

In simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.

If the value being stored in an object is read from another object that overlaps in any way the storage of the first object, then the overlap shall be exact and the two objects shall have qualified or unqualified versions of a compatible type; otherwise, the behavior is undefined.

So there you go: as long as struct fd_set is actually a regular C struct, you're guaranteed success. It does depend, however, on your compiler emitting some kind of code to do it, or relying on whatever memcpy() intrinsic it uses for structure assignment. If your platform can't link against the compiler's intrinsic libraries for some reason, it may not work.

You will have to play some tricks if you have more open file descriptors than will fit into struct fd_set. The Linux man page says:

An fd_set is a fixed size buffer. Executing FD_CLR() or FD_SET() with a value of fd that is negative or is equal to or larger than FD_SETSIZE will result in undefined behavior. Moreover, POSIX requires fd to be a valid file descriptor.
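One way to stay clear of that undefined behavior is to range-check the descriptor before touching the set. The wrapper below is only a sketch: the name fd_set_checked is hypothetical, and reporting the failure through errno as EINVAL is an assumption of this example rather than anything the man page prescribes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical wrapper: add fd to *set only when it lies in the range
 * [0, FD_SETSIZE). Returns 0 on success, or -1 with errno set to EINVAL
 * (an arbitrary choice for this sketch) when fd is out of range. */
static int fd_set_checked(int fd, fd_set *set)
{
    assert(set != NULL);

    if (fd < 0 || fd >= FD_SETSIZE) {
        errno = EINVAL;
        return -1;
    }
    FD_SET(fd, set);
    return 0;
}
```

A caller that gets -1 back knows the descriptor cannot be watched with select() at all and can fall back to another interface for it.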

As mentioned below, it might not be worth the effort to prove that your code is safe on all systems. FD_COPY() is provided for just such a use, and is, presumably, always guaranteed:

FD_COPY(&fdset_orig, &fdset_copy) replaces an already allocated &fdset_copy file descriptor set with a copy of &fdset_orig.


answered Oct 22 '22 13:10

Carl Norum

First of all, there is no struct fd_set. It's simply called fd_set. However, POSIX does require it to be a struct type, so copying is well-defined.

Secondly, there is no way under standard C in which the fd_set object could contain dynamically allocated memory, since there is no requirement to use any function/macro to free it before returning. Even if the compiler has alloca (a pre-VLA extension for stack-based allocation), fd_set could not use memory allocated on the stack, because a program might pass a pointer to the fd_set to another function which uses FD_SET, etc., and the allocated memory would cease to be valid as soon as that function returns to its caller. Only if the C compiler offered some extension for destructors could fd_set use dynamic allocation.

In conclusion, it seems to be safe just to assign/memcpy fd_set objects, but to be sure, I would do something like:

#include <string.h>

/* Argument order matches the BSD macro: FD_COPY(from, to). */
#ifndef FD_COPY
#define FD_COPY(src,dest) memcpy((dest),(src),sizeof *(dest))
#endif

or alternatively just:

#ifndef FD_COPY
#define FD_COPY(src,dest) (*(dest)=*(src))
#endif

Then you'll use the system's provided FD_COPY macro if it exists, and only fall back to the theoretically-potentially-unsafe version if it's missing.

answered Oct 22 '22 14:10

R.. GitHub STOP HELPING ICE

You are correct that POSIX doesn't guarantee that copying an fd_set has to "work". I'm not personally aware of anywhere that it doesn't, but then I've never done the experiment.

You can use the poll() alternative (which is also POSIX). It works in a very similar way to select(), except that the input/output parameter is not opaque (and contains no pointers, so a bare memcpy will work), and its design also entirely removes the need to make a copy of the "requested file descriptors" structure (because the "requested events" and "returned events" are stored in different fields).
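To illustrate the point about separate fields, here is a rough sketch that polls the same struct pollfd array twice without resetting or copying it. The helper names count_readable and demo_poll_reuse are invented for this example, and a pipe stands in for whatever descriptors you actually watch.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: poll the array and count entries reporting POLLIN.
 * Returns poll()'s own result when it is 0 (timeout) or -1 (error). */
static int count_readable(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    assert(fds != NULL || nfds == 0);

    int n = poll(fds, nfds, timeout_ms);
    if (n <= 0)
        return n;

    int readable = 0;
    for (nfds_t i = 0; i < nfds; i++)
        if (fds[i].revents & POLLIN)
            readable++;
    return readable;
}

/* Hypothetical demo: returns 1 if the same array can be reused across
 * calls and the requested events field is left untouched. */
static int demo_poll_reuse(void)
{
    int p[2];
    if (pipe(p) != 0)
        return -1;

    /* events holds the request; poll() writes its answer to revents. */
    struct pollfd fds[1] = { { .fd = p[0], .events = POLLIN, .revents = 0 } };

    int before = count_readable(fds, 1, 0);      /* nothing written yet */
    int wrote = (write(p[1], "x", 1) == 1);
    int after = count_readable(fds, 1, 1000);    /* same array, no reset */

    int ok = wrote && before == 0 && after == 1 && fds[0].events == POLLIN;

    close(p[0]);
    close(p[1]);
    return ok;
}
```

Because the request lives in events and the result in revents, nothing has to be rebuilt between calls, which is the copy that the select() version of the loop cannot avoid.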

You are also correct to surmise that select() (and poll()) don't scale particularly well to large numbers of file descriptors. This is because every time the function returns, you must loop through every file descriptor to test whether there was activity on it. The solutions to this are various non-standard interfaces (e.g. Linux's epoll, FreeBSD's kqueue), which you may need to look into if you find you are having latency problems.

answered Oct 22 '22 14:10

caf

