Perl has a lovely little utility called find2perl that will translate (quite faithfully) a command line for the Unix find
utility into a Perl script to do the same.
If you have a find command like this:
find /usr -xdev -type d -name '*share'
                        ^^^^^^^^^^^^^^ => name matches the shell-style pattern '*share'
                ^^^^^^^ => directory (not a file)
          ^^^^^ => do not go to external file systems
     ^^^^ => the /usr directory (could be multiple directories)
It finds all the directories below /usr whose names end in share.
Now run find2perl /usr -xdev -type d -name '*share' and it will emit a Perl script to do the same. You can then modify the script to your use.
Python has os.walk(), which certainly has the needed functionality (recursive directory listing), but there are big differences.
Take the simple case of find . -type f -print to find and print all files under the current directory. A naïve implementation using os.walk() would be:
import os

root = '.'  # starting directory, as in "find ."
for path, dirs, files in os.walk(root):
    for file in files:
        print(os.path.join(path, file))
However, this will produce different results than typing find . -type f -print in the shell.
I have also been testing various os.walk() loops against:
import subprocess

# run 'find' on 'root' and capture its output as text
args = ['find', root, '-type', 'f']  # an argument list avoids quoting problems if root contains spaces
p = subprocess.Popen(args, stdout=subprocess.PIPE, universal_newlines=True)
out, err = p.communicate()
for line in out.splitlines():  # splitlines() also drops the terminating \n
    print(line)
The difference is that os.walk() lists symbolic links to files among the files; find . -type f skips them, because it does not follow symbolic links by default.
So a correct implementation that gives the same results as find . -type f -print becomes:
for path, dirs, files in os.walk(root):
    for file in files:
        p = os.path.join(path, file)
        # isfile() follows symlinks, so links must be excluded explicitly
        if os.path.isfile(p) and not os.path.islink(p):
            print(p)
Since there are hundreds of permutations of find primaries, each with different side effects, testing every variant becomes time consuming. Since find is the gold standard in the POSIX world for how to count files in a tree, doing it the same way in Python is important to me.
So is there an equivalent of find2perl that can be used for Python? So far I have just been using find2perl and then manually translating the Perl code. This is hard because the Perl file test operators sometimes behave differently from the Python file tests in os.path.
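One way to keep that testing manageable is to diff a Python walk against the output of find itself. The following is only a minimal sketch in Python 3: the helper names walk_files, find_files and compare are made up for illustration, and it assumes a find that supports -print0 (GNU and BSD find both do).

```python
import os
import subprocess


def walk_files(root):
    """Python candidate for `find ROOT -type f`: regular files, no symlinks."""
    found = set()
    for path, dirs, files in os.walk(root):
        for name in files:
            p = os.path.join(path, name)
            if os.path.isfile(p) and not os.path.islink(p):
                found.add(p)
    return found


def find_files(root):
    """Reference result from the real find; -print0 keeps odd file names intact."""
    out = subprocess.check_output(['find', root, '-type', 'f', '-print0'])
    return {os.fsdecode(p) for p in out.split(b'\0') if p}


def compare(root):
    """Return (only_in_python, only_in_find); two empty sets mean the results agree."""
    py, ref = walk_files(root), find_files(root)
    return py - ref, ref - py
```

Calling compare('.') and printing both sets shows exactly which paths one side reports and the other does not, which is usually quicker than reading two long listings side by side.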
If you're trying to reimplement all of find, then yes, your code is going to get hairy. find is pretty hairy all by itself.
In most cases, though, you're not trying to replicate the complete behavior of find; you're performing a much simpler task (e.g., "find all files that end in .txt"). If you really need all of find, just run find and read the output. As you say, it's the gold standard; you might as well just use it.
I often write code that reads paths on stdin just so I can do this:
find ...a bunch of filters... | my_python_code.py
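A minimal sketch of what the reading side of such a script could look like, in Python 3. The names read_paths and main are placeholders, and printing each path stands in for whatever per-file work the script really does. It assumes one path per line, so file names containing newlines would need find -print0 and a different splitter.

```python
import sys


def read_paths(stream):
    """Yield one path per input line, dropping the newline and skipping blank lines."""
    for line in stream:
        path = line.rstrip('\n')
        if path:
            yield path


def main(stream):
    for path in read_paths(stream):
        # Placeholder action: replace with the real per-file work.
        print(path)
```

In the actual script, main(sys.stdin) would be called under the usual if __name__ == '__main__': guard, so that find does all the filtering and the Python code only consumes the result.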
There are a couple of observations and several pieces of code to help you on your way.
First, Python can execute code in this form just like Perl:
cat code.py | python | the rest of the pipe story...
find2perl is a clever code template that emits a Perl function based on a template of find. Therefore, replicate this template and you will not have the "hundreds of permutations" that you are perceiving.
Second, the results from find2perl are not perfect, just as there are potential differences between versions of find, such as GNU or BSD.
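Third, os.walk is top down by default (topdown=True), as is find; pass topdown=False to os.walk, or -depth to find, to process a directory's contents before the directory itself. The order makes for different results if your underlying directory tree is changing while you recurse it.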
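The traversal order of os.walk is controlled by its topdown parameter, and it is easy to check on a throwaway tree. This is just a small sketch in Python 3; visit_order is a made-up helper name.

```python
import os
import tempfile


def visit_order(root, topdown):
    """Return the directories in the order os.walk reports them, relative to root."""
    return [os.path.relpath(path, root)
            for path, dirs, files in os.walk(root, topdown=topdown)]


# Throwaway tree: root/a/b
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'a', 'b'))

print(visit_order(root, topdown=True))   # parents before children, like plain find
print(visit_order(root, topdown=False))  # children before parents, like find -depth
```

Only the top-down mode lets you prune the walk by editing the dirs list in place, which is what a find-style -prune or -xdev needs.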
There are two projects in Python that may help you: twander and dupfinder. Each strives to be OS independent and each recurses the file system like find.
If you template a general find-like function in Python, have os.walk recurse top down (topdown=True), use glob to replicate shell expansion, and use some of the code that you find in those two projects, you can replicate find2perl without too much difficulty.
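As a rough illustration of that idea, here is a sketch in Python 3 of a find-like generator covering only -name, -type and -xdev. Everything in it (find_like, TYPE_TESTS, the parameter names) is hypothetical and not taken from twander, dupfinder or find2perl. It uses fnmatch instead of glob, because fnmatch matches a single name against a pattern while glob expands a pattern against the file system. The output order is not guaranteed to match find.

```python
import fnmatch
import os
import stat

# Hypothetical mapping from find's -type letters to stat-mode tests.
TYPE_TESTS = {'f': stat.S_ISREG, 'd': stat.S_ISDIR, 'l': stat.S_ISLNK}


def find_like(root, name=None, ftype=None, xdev=False):
    """Yield paths under root, roughly like: find ROOT [-xdev] [-type T] [-name PAT]."""

    def matches(path, st):
        if name is not None and not fnmatch.fnmatchcase(os.path.basename(path), name):
            return False
        if ftype is not None and not TYPE_TESTS[ftype](st.st_mode):
            return False
        return True

    root_st = os.lstat(root)  # lstat: never follow symlinks, like find's default
    if matches(root, root_st):
        yield root
    for path, dirs, files in os.walk(root, topdown=True):
        same_device = []
        for entry in dirs + files:
            full = os.path.join(path, entry)
            st = os.lstat(full)
            if matches(full, st):
                yield full
            if st.st_dev == root_st.st_dev:
                same_device.append(entry)
        if xdev:
            # Prune in place (only works with topdown=True): skip other file systems.
            dirs[:] = [d for d in dirs if d in same_device]
```

With this sketch, the command from the question would be approximated by iterating over find_like('/usr', name='*share', ftype='d', xdev=True) and printing each path. Each further find primary would become another keyword argument and another check in matches.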
Sorry I could not point to something ready to go for your needs...
I think glob could help in your implementation of this.