
Why does the C readdir man page say not to call free on the statically allocated result struct?

Tags: c, unix, dirent.h

$ uname -a

Linux crowsnest 2.6.32-28-generic #55-Ubuntu SMP Mon Jan 10 23:42:43 UTC 2011 x86_64 GNU/Linux

$ man readdir:

DESCRIPTION

The readdir() function returns a pointer to a dirent structure representing the next directory entry in the directory stream pointed to by dirp...

..[snip]...

The readdir_r() function is a reentrant version of readdir()...

...[snip]...

RETURN VALUE

On success, readdir() returns a pointer to a dirent structure. (This structure may be statically allocated; do not attempt to free(3) it.) If the end of the directory stream is reached, NULL is returned and errno is not changed. If an error occurs, NULL is returned and errno is set appropriately.

The readdir_r() function returns 0 on success. On error, it returns a positive error number. If the end of the directory stream is reached, readdir_r() returns 0, and returns NULL in *result.

I'm confused about what this means. My application of this function is to collect a dynamically allocated array of pointers to structs with data about the directory entries, and I'm wondering if I can dynamically allocate dirent structs and set the pointers to them. But this line seems to say that free should never be called on the result, so I'm wondering if I should allocate a separate dirent struct which will be part of the list and memcpy it over the returned result.

I'm also confused by the terminology of "may" in the above man page. Does this mean that sometimes it's statically allocated, and sometimes it's not?

I'm (vaguely) familiar with what static variables mean in C, but not sure about all the rules and possible gotchas around them. Because I want to pass around the dirent structs for a directory, I would rather they be dynamically allocated. Is this what readdir_r is for? Or will the double pointer be set to point to another statically allocated dirent struct?

And I'm not entirely sure what reentrant means in this context for readdir_r. My understanding of reentrant comes only from Scheme coroutines, and I'm not sure how that would apply to reading Unix directories.

Fire Crow asked Aug 23 '11 09:08


2 Answers

The structure might be statically-allocated, it might be thread-local, it might be dynamically allocated. That's up to the implementation. But no matter what, it's not yours to free, which is why you must not free it.

readdir_r doesn't allocate anything for you: you give it a dirent, allocated however you like, and it fills it in. It therefore saves you a little effort compared with calling readdir and copying the data out. That's not the main purpose of readdir_r, though. What it's actually for is making calls from different threads at the same time, which you can't safely do with readdir.
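As a rough illustration of "you give it a dirent and it fills it in", here is a minimal sketch. The helper name count_entries_r is made up for this example. It assumes that struct dirent is large enough to hold the longest file name, which holds on Linux with glibc but not on every system. Note also that newer glibc versions (2.24 and later) mark readdir_r as deprecated, so the sketch silences that warning.

```c
#include <dirent.h>
#include <stddef.h>

/* readdir_r() is marked deprecated in newer glibc; silence the warning
   for this sketch only. */
#if defined(__GNUC__)
#pragma GCC diagnostic ignored "-Wdeprecated-declarations"
#endif

/* Hypothetical helper (not part of any library): counts the entries in
   `path` using a dirent that the caller owns. Returns -1 on error. */
long count_entries_r(const char *path)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;

    struct dirent entry;           /* our storage: stack here, could be malloc'd */
    struct dirent *result = NULL;  /* set to &entry on success, NULL at the end */
    long count = 0;

    for (;;) {
        int err = readdir_r(dir, &entry, &result);
        if (err != 0) {            /* positive error number on failure */
            closedir(dir);
            return -1;
        }
        if (result == NULL)        /* end of the directory stream */
            break;
        count++;                   /* entry.d_name holds this entry's name */
    }

    closedir(dir);
    return count;
}
```

Because entry belongs to the caller, nothing returned here needs to be (or may be) freed on behalf of the library. A malloc'd dirent would be freed by the caller in the usual way.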

What "reentrant" actually means is that the function can be called again before a previous call to it has returned. In general, this might mean from a different thread (which is what most people mean by "thread-safe"), from a handler for a signal that occurred during the first call, or due to recursion. But the C standard (prior to C11) has no concept of threads, so it mentions "reentrant" meaning only the latter two. POSIX defines "thread-safe" to require this form of reentrancy and, in addition, the thing that most people mean by thread-safe.
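The difference between a function that hands back shared static storage and one that writes into caller-supplied storage can be sketched with a toy example. Both function names below are invented for illustration and have nothing to do with dirent.h. The pattern mirrors the readdir / readdir_r split.

```c
#include <stdio.h>

/* Non-reentrant style: every call shares one static buffer. */
const char *label_static(int n)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "entry-%d", n);
    return buf;  /* not the caller's to free; overwritten by the next call */
}

/* Reentrant style: the caller supplies the storage. */
char *label_r(int n, char *buf, size_t len)
{
    snprintf(buf, len, "entry-%d", n);
    return buf;
}
```

Two calls to label_static return the same pointer, so a second call (from another thread, a signal handler, or just later in the same code) replaces what the first caller was looking at. Two calls to label_r with separate buffers do not interfere with each other.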

In POSIX, every function required to be thread-safe is required to be reentrant, and readdir_r is required to be thread-safe. I think reentrancy in the weaker sense is irrelevant to readdir_r, since it doesn't call any user code that could result in recursion, and it's not async-signal-safe, so it must not be called from a signal handler either.

Beware, because when some people (Java programmers) say "thread-safe", they mean that the function can be called by different threads on the same arguments at the same time, and will use locks to work correctly. POSIX APIs do not mean this by thread-safe; they only mean that the function can be called on different data at the same time. Any global data that the function uses is protected by locks or otherwise, but the arguments need not be.

Steve Jessop answered Nov 15 '22 04:11


The rule here is really simple -- you're free to make a copy of the data readdir() returns; however, you don't own the buffer it puts that data in, so you cannot take actions that suggest you do. (I.e., copy the data out to your own buffer; don't store a pointer into the readdir-owned buffer.)

"so I'm wondering if I should allocate a separate dirent struct which will be part of the list and memcpy it over the returned result" - that's exactly what you should do.
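A minimal sketch of the copy-out approach follows. The helper names collect_names and free_names are hypothetical. It copies only d_name rather than the whole struct: a memcpy of sizeof(struct dirent) works on Linux with glibc, where d_name is a fixed-size array, but POSIX leaves the size of d_name unspecified, so copying the string is the more portable assumption.

```c
#include <dirent.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: frees a list returned by collect_names(). */
void free_names(char **names, size_t count)
{
    for (size_t i = 0; i < count; i++)
        free(names[i]);
    free(names);
}

/* Hypothetical helper: returns a malloc'd array of malloc'd copies of the
   entry names in `path` and stores its length in *count.
   Returns NULL on error. */
char **collect_names(const char *path, size_t *count)
{
    DIR *dir = opendir(path);
    if (dir == NULL)
        return NULL;

    size_t used = 0, cap = 16;
    char **names = malloc(cap * sizeof *names);
    if (names == NULL) {
        closedir(dir);
        return NULL;
    }

    struct dirent *ent;
    errno = 0;  /* lets us tell end-of-stream from error when NULL comes back */
    while ((ent = readdir(dir)) != NULL) {
        if (used == cap) {
            char **tmp = realloc(names, cap * 2 * sizeof *tmp);
            if (tmp == NULL)
                goto fail;
            names = tmp;
            cap *= 2;
        }
        /* Copy the data out; never keep `ent` itself, and never free it. */
        size_t len = strlen(ent->d_name) + 1;
        char *copy = malloc(len);
        if (copy == NULL)
            goto fail;
        memcpy(copy, ent->d_name, len);
        names[used++] = copy;
        errno = 0;
    }
    if (errno != 0)  /* readdir() returned NULL because of an error */
        goto fail;

    closedir(dir);
    *count = used;
    return names;

fail:
    free_names(names, used);
    closedir(dir);
    return NULL;
}
```

The caller owns everything collect_names returns and releases it with free_names. The struct dirent that readdir hands back is only read from, and stays owned by the library.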

"I'm also confused by the terminology of "may" in the above man page. Does this mean that sometimes it's statically allocated, and sometimes it's not?" - it means you cannot count on how it will be managed, but it will be managed for you. The details could vary from one system to the next.

In this context, reentrant effectively means thread-safe. readdir() may use a static entry, making it unsafe for multiple threads to use as if each of them controlled the multi-call process. readdir_r() uses space provided by the caller, letting multiple threads act independently.

mah answered Nov 15 '22 04:11