
Determining binary/text file type in Java?

Namely, how would you tell an archive (jar/rar/etc.) file from a textual (xml/txt, encoding-independent) one?

yanchenko asked Mar 07 '09 00:03



1 Answer

There's no guaranteed way, but here are a few heuristics:

  1. Look for a header (magic number) at the start of the file. Unfortunately, headers are format-specific, so while you might be able to find out that it's a RAR file, a missing header won't give you the more generic answer of whether the file is text or binary.

  2. Count printable vs. non-printable bytes. Text files consist mostly of printable characters plus whitespace, while binary files, especially compressed ones like rar and zip, tend to have all byte values more evenly represented.

  3. Look for a regularly repeating pattern of newlines. Text files usually have a line break every few dozen to few hundred bytes; binary data rarely does.
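Heuristics 1 and 2 could be combined roughly as in the sketch below. It is only an illustration, not a definitive detector: the class name `BinaryTextSniffer`, the 8 KB sample size, and the 5% threshold are all assumptions made for this example, and it presumes an ASCII-compatible encoding such as UTF-8 or Latin-1 (UTF-16 text contains many NUL bytes and would be misreported as binary).

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: guesses whether a file is binary from its first bytes.
final class BinaryTextSniffer {
    // Assumed sample size; only the start of the file is inspected.
    private static final int SAMPLE_SIZE = 8192;
    // Assumed threshold: uniformly distributed bytes contain roughly 11%
    // control characters, while ordinary text contains close to none.
    private static final double MAX_CONTROL_RATIO = 0.05;

    // A few well-known magic numbers (heuristic 1); far from exhaustive.
    private static final byte[][] MAGIC = {
        {'P', 'K', 3, 4},               // ZIP and JAR
        {'R', 'a', 'r', '!', 0x1A, 7},  // RAR
        {0x1F, (byte) 0x8B},            // gzip
    };

    // Reads up to SAMPLE_SIZE bytes and classifies them (needs Java 9+).
    public static boolean isProbablyBinary(Path file) throws IOException {
        byte[] buf = new byte[SAMPLE_SIZE];
        try (InputStream in = Files.newInputStream(file)) {
            int n = in.readNBytes(buf, 0, buf.length);
            return looksBinary(buf, n);
        }
    }

    public static boolean looksBinary(byte[] data) {
        return looksBinary(data, data.length);
    }

    public static boolean looksBinary(byte[] data, int length) {
        if (length == 0) {
            return false; // treat an empty file as text
        }
        for (byte[] magic : MAGIC) {
            if (startsWith(data, length, magic)) {
                return true;
            }
        }
        // Heuristic 2: count control bytes other than common whitespace.
        int control = 0;
        for (int i = 0; i < length; i++) {
            int b = data[i] & 0xFF;
            boolean whitespace = b == '\t' || b == '\n' || b == '\r' || b == '\f';
            if ((b < 0x20 && !whitespace) || b == 0x7F) {
                control++;
            }
        }
        return (double) control / length > MAX_CONTROL_RATIO;
    }

    private static boolean startsWith(byte[] data, int length, byte[] prefix) {
        if (length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A call such as `BinaryTextSniffer.isProbablyBinary(Paths.get("some-file"))` (the path is a placeholder) would then return `true` for a typical jar or rar archive and `false` for a typical XML or plain-text file. Bytes of 0x80 and above are deliberately not counted as suspicious, so that UTF-8 and Latin-1 text passes; that choice is why the threshold has to stay well below the share of control bytes in random data.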

Aric TenEyck answered Sep 17 '22 13:09