I am using Apache Tika to detect the mime type of an input stream, and I was wondering if there's a ready-made method to detect that the file is an executable. There's a big list of executable file mime types here:
http://www.file-extensions.org/filetype/extension/name/program-executable-files
and I was wondering about the best way to cover them all.
Apache Tika's mime types have a hierarchy, so you don't need to check for every possible executable type. All you need to do is check whether the detected type is, or has as a parent, one of the handful of executable umbrella types.
For Windows, the main one is application/x-msdownload. You might also want to check for application/x-ms-installer too.
For Unix, the main one is application/x-elf, but you potentially also want to check for the scripting formats such as application/x-sh, text/x-perl, text/x-python etc.
As for how to go from a mime type in Tika to its parent, you'll want the existing answer "Correct use of Apache Tika MediaType". Note that you need to recurse, in case there are multiple levels between the detected mime type and the base executable parent type.
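The recursive parent check can be sketched roughly as follows. This is only a minimal illustration, not Tika's own API: the class name ExecutableTypes, the method isExecutable and the parentOf lookup are all hypothetical, and the lookup is passed in as a plain function so the example runs without Tika on the classpath. In a real project the lookup would presumably be backed by Tika's MediaTypeRegistry (its getSupertype method, as far as I recall), and the set of umbrella types is just the ones named above, so adjust it to your needs.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.UnaryOperator;

final class ExecutableTypes {
    // Umbrella types taken from the answer text; extend as required.
    static final Set<String> UMBRELLAS = Set.of(
            "application/x-msdownload",
            "application/x-ms-installer",
            "application/x-elf",
            "application/x-sh",
            "text/x-perl",
            "text/x-python");

    // Walks from the detected type up through its parents until an umbrella
    // type is found or the chain ends. parentOf is a hypothetical lookup that
    // returns the parent type name, or null when there is none.
    static boolean isExecutable(String detectedType, UnaryOperator<String> parentOf) {
        Set<String> seen = new HashSet<>(); // guards against a cyclic hierarchy
        for (String t = detectedType; t != null && seen.add(t); t = parentOf.apply(t)) {
            if (UMBRELLAS.contains(t)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up hierarchy for illustration only; these are not real Tika entries.
        Map<String, String> parents = Map.of(
                "application/x-example-child", "application/x-example-mid",
                "application/x-example-mid", "application/x-msdownload",
                "image/png", "application/octet-stream");
        System.out.println(isExecutable("application/x-example-child", parents::get));
        System.out.println(isExecutable("image/png", parents::get));
    }
}
```

The loop is the iterative form of the recursion mentioned above: it keeps climbing as long as there is a parent, so a type that sits two or more levels below an umbrella type is still caught.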
For Microsoft Windows, the mime type is application/x-msdownload. Look at this list: http://www.freeformatter.com/mime-types-list.html