I have several Python modules that I've written. On a whim, I ran file on this directory, and I was really surprised by what I saw. Here's the resulting count of what it thought the files were:
1 ASCII Java program text, with very long lines
1 a /bin/env python script text executable
1 a python script text executable
2 ASCII C++ program text
4 ASCII English text
18 ASCII Java program text
That's strange! Any idea what's going on, or why it so often thinks Python modules are Java files?
I'm using CentOS 5.2.
Edit: The question is more about my curiosity as to why obviously non-Java and non-C++ program files were being classified as such. Certainly I don't expect file to be perfect, but I was surprised by the choices being made. I would have guessed it would just give up and say "text file" rather than make very incorrect inferences.
I just ran a test and in every case of incorrect identification, there was no shebang line.
For every file that had:
#!/usr/bin/env python
file correctly identified it.
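To repeat that check on your own directory, something like the following sketch could list the modules that lack a shebang line. The function name files_without_shebang is made up for illustration, and it assumes the modules all end in .py and sit directly in the given directory.

```python
from pathlib import Path


def files_without_shebang(directory):
    """Return the names of .py files whose first two bytes are not '#!'."""
    missing = []
    for path in sorted(Path(directory).glob("*.py")):
        # Only the first two bytes are needed to spot a shebang.
        with path.open("rb") as handle:
            first_two = handle.read(2)
        if first_two != b"#!":
            missing.append(path.name)
    return missing
```

Calling files_without_shebang(".") in the module directory should give the candidates for misidentification, if the observation above holds for your files too.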
Looking at the magic file, another thing that triggers recognition as a Python file is a triple quote on the first line.
$ echo '"""' | file -
/dev/stdin: python script text executable
$ echo '#!/usr/bin/python' | file -
/dev/stdin: python script text executable
$ echo '#!/usr/bin/env python' | file -
/dev/stdin: a python script text executable
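The two triggers shown above can be imitated roughly in Python. This is only a sketch of the observed behaviour, not the actual rules in the magic file; the helper name looks_like_python is hypothetical.

```python
def looks_like_python(first_line):
    """Rough imitation of the two first-line triggers observed above."""
    line = first_line.rstrip("\n")
    # Trigger 1: a shebang line that names a python interpreter.
    if line.startswith("#!") and "python" in line:
        return True
    # Trigger 2: a triple quote opening the file (a module docstring).
    return line.startswith('"""')
```

A module that starts straight away with an import statement fails both checks, which fits the pattern of the misidentified files having no shebang line.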
From the file man page:
File tests each argument in an attempt to classify it. There are three sets of tests, performed in this order: filesystem tests, magic number tests, and language tests. The first test that succeeds causes the file type to be printed.
My guess is that some of your files happen to match the tests for other languages, so file misidentifies them.
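As an illustration of how a first-match-wins keyword test could go wrong, here is a minimal sketch. The keyword table is an assumption made up for this example and is not copied from file's source; the point is only that words like import or class appear in more than one language.

```python
# Hypothetical keyword table: (token, label) pairs checked in order.
# These entries are illustrative assumptions, not file's real rules.
HYPOTHETICAL_KEYWORDS = [
    ("import", "Java program text"),
    ("class", "C++ program text"),
]


def guess_language(text, default="English text"):
    """Return the label of the first table keyword found among the tokens."""
    tokens = text.split()
    for keyword, label in HYPOTHETICAL_KEYWORDS:
        if keyword in tokens:
            # First match wins, even if the file is really Python.
            return label
    return default
```

Under these assumed rules, a Python module that begins with "import os" would be labelled as Java, and one that begins with "class Foo(object):" would be labelled as C++.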
Also, file is generally intended for binary files, as the BUGS section of the man page indicates:
file uses several algorithms that favor speed over accuracy, thus it can be misled about the contents of text files.
The support for text files (primarily for programming languages) is simplistic, inefficient and requires recompilation to update.