
Perl's 'readdir' Function Result Order?

Tags: perl, readdir

I am running Perl in Windows and I am getting a list of all the files in a directory using readdir and storing the result in an array. The first two elements in the array seem to always be "." and "..". Is this order guaranteed (assuming the operating system does not change)?

I would like to do the following to remove these values:

my $directory = 'C:\\foo\\bar';

opendir my $directory_handle, $directory 
    or die "Could not open '$directory' for reading: $!\n";

my @files = readdir $directory_handle;
splice ( @files, 0, 2 ); # Remove the "." and ".." elements from the array

But I am worried that it might not be safe to do so. All the solutions I have seen use regular expressions or if statements for each element in the array and I would rather not use either of those approaches if I don't have to. Thoughts?

Hans Goldman asked Dec 01 '22 16:12


2 Answers

There is no guarantee on the order of readdir. The docs state it...

Returns the next directory entry for a directory opened by opendir.

readdir simply steps through the directory's entries in whatever order the filesystem provides them, and nothing guarantees what that order will be.

The usual way to work around this is with a regex or string equality.

my @dirs = grep { !/^\.{1,2}\z/ } readdir $dh;

my @dirs = grep { $_ ne '.' && $_ ne '..' } readdir $dh;
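For a complete picture, here is a minimal, self-contained sketch of the name-based filtering, using only core modules. The scratch directory and the file names a.txt and b.txt are made up for illustration. It also shows File::Spec->no_upwards, a core helper that drops the "." and ".." entries from a list, which may suit you if you would rather not write the regex or the comparison yourself.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical scratch directory with two empty files, so the sketch
# runs anywhere without touching real data.
my $directory = tempdir( CLEANUP => 1 );
for my $name (qw(a.txt b.txt)) {
    my $file = File::Spec->catfile( $directory, $name );
    open my $fh, '>', $file
        or die "Could not create '$file': $!\n";
    close $fh;
}

opendir my $dh, $directory
    or die "Could not open '$directory' for reading: $!\n";
my @entries = readdir $dh;
closedir $dh;

# Filter by name, never by position in the list.
my @by_grep = grep { $_ ne '.' && $_ ne '..' } @entries;
@by_grep = sort @by_grep;

# File::Spec->no_upwards removes the current- and parent-directory entries.
my @by_spec = File::Spec->no_upwards(@entries);
@by_spec = sort @by_spec;

print "@by_grep\n";
print "@by_spec\n";
```

Both lists are sorted explicitly because readdir makes no promise about order; if you need a stable order, impose it yourself.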

Because this is such a common issue, I'd recommend using Path::Tiny->children instead of rolling your own. Its authors will have figured out the fastest and safest way to do it, which is to use grep to filter out "." and "..". Path::Tiny fixes a lot of things about Perl file and directory handling.
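A rough sketch of what that might look like, assuming Path::Tiny is installed from CPAN (it is not a core module). The temporary directory and the file names are hypothetical, there only so the example is self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Path::Tiny;    # CPAN module, assumed to be installed

# Hypothetical scratch directory with two empty files.
my $directory = tempdir( CLEANUP => 1 );
path($directory)->child($_)->touch for qw(a.txt b.txt);

# children returns Path::Tiny objects and already excludes "." and "..".
my @names = map { $_->basename } path($directory)->children;
@names = sort @names;    # the order is still unspecified, so sort if it matters

print "@names\n";
```

Each element returned by children is an object rather than a bare name, so call basename (as above) when you only need the file name.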

Schwern answered Dec 06 '22 11:12


This PerlMonks thread from 2001 investigated this very issue, and Perl wizard Randal Schwartz concluded:

readdir on Unix returns the underlying raw directory order. Additions and deletions to the directory use and free-up slots. The first two entries to any directory are always created as "dot" and "dotdot", and these entries are never deleted under normal operation.

However, if a directory entry for either of these gets incorrectly deleted (through corruption, or using the perl -U option and letting the superuser unlink it, for example), the next fsck run has to recreate the entry, and it will simply add it. Oops, dot and dotdot are no longer the first two entries!

So, defensive programming mandates that you do not count on the slot order. And there's no promise that dot and dotdot are the first two entries, because Perl can't control that, and the underlying OS doesn't promise it either.

mob answered Dec 06 '22 10:12