 

Can someone explain this definition of the 'dirent' struct in Solaris?

Recently I was looking at the 'dirent' structure (in dirent.h) and was a little puzzled by its definition.

NOTE: This header file is from a Solaris machine at my school.


typedef struct dirent {
    ino_t       d_ino;
    off_t       d_off;
    unsigned short  d_reclen;
    char        d_name[1];
} dirent_t;

I am particularly puzzled by the d_name field. How does this work in the operating system? If you need to store a null-terminated string, what good is an array of a single char? I know that an array's address is the address of its first element, but I am still confused. Obviously something is happening, but I don't know what. On my Fedora Linux system at home this field is simply defined as:

char d_name[256];

Now that makes a lot more sense for obvious reasons. Can someone explain why the Solaris header file defines the struct as it does?

Mr. Shickadance asked Feb 18 '09 22:02



2 Answers

As others have pointed out, the last member of the struct doesn't have any set size. The array is however long the implementation decides it needs to be to accommodate the characters it wants to put in it. It does this by dynamically allocating the memory for the struct, such as with malloc.

It's convenient to declare the member as having size 1, though, because it's easy to determine how much memory is occupied by any dirent variable d:

sizeof(struct dirent) + strlen(d.d_name)

Using size 1 also discourages the recipient of such struct values from trying to store their own names in it instead of allocating their own dirent values. Using the Linux definition, it's reasonable to assume that any dirent value you have will accept a 255-character string, but Solaris makes no guarantee that its dirent values will store any more characters than they need to.
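To make the size-1 trick concrete, here is a minimal sketch of how an implementation might allocate such a record. The entry_t type and the entry_new function are hypothetical stand-ins invented for illustration; they are not the real Solaris struct dirent or any libc function. Writing past the declared one-byte array is the traditional pre-C99 idiom and relies on the compiler tolerating it, which is an assumption here.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the Solaris layout; NOT the real struct dirent. */
typedef struct entry {
    unsigned long  e_ino;
    unsigned short e_reclen;   /* total bytes allocated for this record */
    char           e_name[1];  /* first byte of the name; the rest follows */
} entry_t;

/* Allocate one record large enough to hold `name`.
 * The single declared byte of e_name already pays for the terminating
 * '\0', so only strlen(name) extra bytes are requested.
 * Returns NULL if the record would not fit in e_reclen or malloc fails. */
entry_t *entry_new(unsigned long ino, const char *name)
{
    size_t len, size;
    entry_t *e;

    assert(name != NULL);
    len = strlen(name);
    size = sizeof(entry_t) + len;
    if (size > USHRT_MAX)
        return NULL;

    e = malloc(size);
    if (e == NULL)
        return NULL;

    e->e_ino = ino;
    e->e_reclen = (unsigned short)size;
    memcpy(e->e_name, name, len + 1);  /* copies the '\0' as well */
    return e;
}
```

A caller would do something like entry_t *e = entry_new(42, "hello.txt"); and later free(e);. Copying *e by value would carry along only sizeof(entry_t) bytes, which is exactly why the recipient should not treat the struct as self-contained.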

C99 introduced a special case for the last member of a struct. The struct could be declared like this instead:

typedef struct dirent {
  ino_t d_ino;
  off_t d_off;
  unsigned short d_reclen;
  char d_name[];
} dirent_t;

The array has no declared size. This is known as a flexible array member. It accomplishes the same thing as the Solaris version, except that there's no illusion that the struct by itself could hold any name. You know by looking at it that there's more to it.

Using the "flexible" declaration, the amount of memory occupied would be adjusted like so:

sizeof(struct dirent) + strlen(d.d_name) + 1

That's because the flexible array member does not count toward the size of the struct, so the terminating null byte has to be added explicitly.
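The same allocation with a flexible array member might look like the sketch below. Again, flex_entry_t and flex_entry_new are hypothetical names made up for this example, not part of any system header. The only real difference from the size-1 version is the explicit + 1 for the terminator.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type using a C99 flexible array member. */
typedef struct flex_entry {
    unsigned long  e_ino;
    unsigned short e_reclen;   /* total bytes allocated for this record */
    char           e_name[];   /* contributes nothing to sizeof */
} flex_entry_t;

/* Allocate one record large enough to hold `name`.
 * sizeof(flex_entry_t) covers only the fixed members (plus any padding),
 * so the name AND its terminating '\0' must both be added.
 * Returns NULL if the record would not fit in e_reclen or malloc fails. */
flex_entry_t *flex_entry_new(unsigned long ino, const char *name)
{
    size_t len, size;
    flex_entry_t *e;

    assert(name != NULL);
    len = strlen(name);
    size = sizeof(flex_entry_t) + len + 1;
    if (size > USHRT_MAX)
        return NULL;

    e = malloc(size);
    if (e == NULL)
        return NULL;

    e->e_ino = ino;
    e->e_reclen = (unsigned short)size;
    memcpy(e->e_name, name, len + 1);  /* copies the '\0' as well */
    return e;
}
```

Because sizeof may include trailing padding, this formula can request a few more bytes than strictly necessary. That is harmless for a sketch like this one.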

The reason you don't see flexible declarations like that more often, especially in OS library code, is likely for the sake of compatibility with older compilers that don't support that facility. It's also for compatibility with code written to target the current definition, which would break if the size of the struct changed like that.

Rob Kennedy answered Nov 15 '22 10:11



The dirent struct will be immediately followed in memory by a block of memory that contains the rest of the name, and that memory is accessible through the d_name field.
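From the caller's side, the practical consequence is to read d_name through the pointer that readdir returns and to treat it as an ordinary null-terminated string, rather than copying the struct by value. A minimal sketch using the POSIX opendir/readdir/closedir calls follows; the helper name longest_name_length is made up for this example.

```c
#include <assert.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/* Return the length of the longest entry name in `path`,
 * or -1 if the directory cannot be opened.
 * The name is read in place through the pointer readdir() hands back,
 * so it works whether d_name is declared as [1], [256], or []. */
long longest_name_length(const char *path)
{
    DIR *dir;
    struct dirent *ent;
    size_t longest = 0;

    assert(path != NULL);
    dir = opendir(path);
    if (dir == NULL)
        return -1;

    while ((ent = readdir(dir)) != NULL) {
        size_t len = strlen(ent->d_name);
        if (len > longest)
            longest = len;
    }

    closedir(dir);
    return (long)longest;
}
```

Calling longest_name_length(".") scans the current directory. Note that something like struct dirent copy = *ent; would copy only sizeof(struct dirent) bytes, which on the Solaris definition could cut the name off.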

Rob K answered Nov 15 '22 11:11
