How does Linux recognize a file as a certain file type, and how to programmatically change it?

I am creating a program in Java that reads a file's input stream, encrypts it by altering the byte values based on the password, and writes a new encrypted file.

For example:
I created a test file that contained the words:
This is a test to see if the encrypter project works.
When I read the bytes in Java, I get:
[84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 116, 101, 115, 116, 32, 116, 111, 32, 115, 101, 101, 32, 105, 102, 32, 116, 104, 101, 32, 101, 110, 99, 114, 121, 112, 116, 101, 114, 32, 112, 114, 111, 106, 101, 99, 116, 32, 119, 111, 114, 107, 115, 46, 10]
I then take the value of each byte, subtract the Unicode value of a password character, and take the absolute value of the result. Then I write that to a file.
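One way to read the step described above, as a minimal sketch only: the actual program is not shown, so the class name `SubtractSketch`, the method `transform`, and the choice to cycle through the password's characters are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical names; the original program's code is not shown in the post.
class SubtractSketch {
    // Assumption: the password is non-empty and its characters are
    // reused cyclically, one character per input byte.
    static byte[] transform(byte[] data, String password) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            int key = password.charAt(i % password.length());
            // Subtract the character's Unicode value, then take the absolute value.
            out[i] = (byte) Math.abs(data[i] - key);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] plain = "This".getBytes(StandardCharsets.US_ASCII); // [84, 104, 105, 115]
        System.out.println(Arrays.toString(transform(plain, "pw"))); // [28, 15, 7, 4]
    }
}
```

Note that with the absolute value, two different inputs can collapse to the same output (with a key value of 112, both 100 and 124 give 12), so a step like this cannot always be undone.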

I was playing around with different algorithms to encrypt it, and started testing them on a test text file. I am using Linux, so files don't need extensions (e.g. .txt, .pdf, etc.). After encrypting it a few times, I noticed that the computer no longer recognized it as a text file, but as an image file instead! (Meaning that when you click on it, by default, it tries to open the file in an image editor.)

So here are my questions:

1. What causes the computer to recognize a file as a certain file type?

  • I am guessing that it has something to do with certain bytes that it looks at somewhere in the file, but beyond that, I'm lost.

2. Where in the file is this information stored?

  • I was hoping to keep the file as the same file type even after encryption. So if, for example, the file-type information were in the first 10 bytes, I would leave those first 10 bytes alone and encrypt everything after them.

3. Is file-type information standard?

  • Do these bytes have a meaning that is standard across all platforms? (I.e. a PDF file is a PDF file no matter what computer you use it on. Is that because of the .pdf extension, or because of bytes somewhere in the file?)

4. Assuming the file-type is recognized because of bytes in the file, how can I change the file-type?

  • Where can I find a listing of what bytes mean what in a file?
Ephraim asked Apr 12 '12 20:04


People also ask

How does Linux recognize file type?

To find out a file's type, use the file command. The -s option also reads block or character special files. The -F option uses the given string as the separator between the file name and the result instead of ":". The --extension option prints a slash-separated list of valid extensions for the file type found.

How do I change the file type in Linux?

To change a file's extension (or its name) from the terminal, use the mv command. mv stands for "move" and is the standard command on Linux for moving and renaming files.

Which command can be used to determine file type by its content in Linux?

The file command uses the /etc/magic file to identify files that have a magic number; that is, any file containing a numeric or string constant that indicates the type. For example, file myfile displays the file type of myfile (such as directory, data, ASCII text, C program source, or archive).

How is file type determined?

file determines the format of each file by inspecting the attributes and (for a regular file) reading the contents of the file. If the file is an executable, its addressing mode is determined for output.


1 Answer

On traditional UNIX systems, files are identified solely by looking for particular patterns of bytes appearing in the file.

The file command uses a magic configuration file (often /etc/magic, or /usr/share/file/magic) which contains the rules defining those byte patterns.

That's it - there's no special extra meta-data - it's all done by analysis of the content.
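As a rough illustration of what content-based detection means, here is a minimal Java sketch. The class `MagicSniffer` and its methods are made up for this example. It checks only four well-known signatures plus a crude text heuristic, whereas the real magic database contains far more rules, so treat it as a sketch of the idea rather than how file itself is implemented.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper: a tiny subset of the kind of rules a magic file holds.
class MagicSniffer {
    private static final byte[] PNG = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
    private static final byte[] PDF = "%PDF-".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] GIF = "GIF8".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] JPEG = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF};

    static boolean startsWith(byte[] data, byte[] magic) {
        if (data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) return false;
        }
        return true;
    }

    // Simplified assumption: "text" means only printable ASCII plus common whitespace.
    static boolean looksLikeText(byte[] data) {
        for (byte b : data) {
            boolean printable = b >= 0x20 && b < 0x7F;
            boolean whitespace = b == '\n' || b == '\r' || b == '\t';
            if (!printable && !whitespace) return false;
        }
        return data.length > 0;
    }

    // Signatures are checked first, because "%PDF-" and "GIF8" are also printable text.
    static String detect(byte[] head) {
        if (startsWith(head, PNG)) return "PNG image";
        if (startsWith(head, PDF)) return "PDF document";
        if (startsWith(head, GIF)) return "GIF image";
        if (startsWith(head, JPEG)) return "JPEG image";
        if (looksLikeText(head)) return "plain text";
        return "unknown data";
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java MagicSniffer <file>");
            return;
        }
        byte[] head = new byte[16];
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            int n = in.readNBytes(head, 0, head.length); // requires Java 9 or later
            System.out.println(detect(Arrays.copyOf(head, n)));
        }
    }
}
```

Plain text has no signature of its own in this sketch. It is recognized only because every byte is printable, which is why altering the bytes of a text file can make it stop looking like text.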

Alnitak answered Sep 18 '22 14:09