Why do seemingly empty files and strings produce md5sums?

Consider the following:

% md5sum /dev/null
d41d8cd98f00b204e9800998ecf8427e  /dev/null
% touch empty; md5sum empty
d41d8cd98f00b204e9800998ecf8427e  empty
% echo '' | md5sum
68b329da9893e34099c7d8ad5cb9c940  -
% perl -e 'print chr(0)' | md5sum
93b885adfe0da089cdf634904fd59f71  -
% md5sum ''
md5sum: : No such file or directory

First of all, I'm surprised by the output of all these commands. If anything, I would expect the sum to be the same for all of them.

asked Jun 06 '12 07:06 by Daniel



3 Answers

The md5sum of "nothing" (a zero-length stream of characters) is d41d8cd98f00b204e9800998ecf8427e, which you're seeing in your first two examples.

The third and fourth examples each process a single byte. In the "echo" case, it's a newline, i.e.

$ echo -ne '\n' | md5sum
68b329da9893e34099c7d8ad5cb9c940  -

In the perl example, it's a single byte with value 0x00, i.e.

$ echo -ne '\x00' | md5sum
93b885adfe0da089cdf634904fd59f71  -

You can reproduce the empty checksum using "echo" as follows:

$ echo -n '' | md5sum
d41d8cd98f00b204e9800998ecf8427e  -

...and using Perl as follows:

$ perl -e 'print ""' | md5sum
d41d8cd98f00b204e9800998ecf8427e  -

In every case, checksumming the same data gives the same output, while different data should produce a wildly different checksum, even if only a single byte differs. That's the whole point of a checksum.
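One way to check what each command actually sends to md5sum is to count the bytes it writes. The following is only a sketch: count_bytes is a hypothetical helper name, not a standard tool, and printf '\000' stands in for the Perl one-liner so that Perl is not required.

```shell
# count_bytes is a hypothetical helper: run a command and print how many
# bytes it wrote to standard output (tr strips the padding some wc builds add).
count_bytes() { "$@" | wc -c | tr -d ' '; }

count_bytes cat /dev/null     # 0: truly empty input
count_bytes echo ''           # 1: echo still appends a newline
count_bytes printf '\000'     # 1: a single zero byte
count_bytes printf ''         # 0: empty again
```

Inputs of zero bytes all hash to d41d8cd98f00b204e9800998ecf8427e; the two one-byte inputs contain different bytes, so their sums differ from that value and from each other.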

answered Oct 22 '22 03:10 by Graeme


Why do seemingly empty files and strings produce md5sums?

Because the "sum" in md5sum is somewhat misleading. It is not like, for example, a CRC32 checksum, which is zero for an empty file.

MD5 is a message digest algorithm. You can picture it as a box that produces a fixed-length, random-looking value (the hash) from its internal state. Feeding in data changes that internal state.

The box starts in a predefined internal state, and producing the digest always involves a final padding step, so it yields a random-looking hash even when no data has been fed in. For MD5, that value is d41d8cd98f00b204e9800998ecf8427e.
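The same holds for other digest algorithms: each has its own fixed value for empty input. A small sketch, assuming the GNU coreutils tools md5sum, sha1sum and sha256sum are installed; empty_digest is a hypothetical helper name used only for illustration.

```shell
# empty_digest is a hypothetical helper: feed zero bytes to the given
# digest tool and keep only the hex digest (the first field of its output).
empty_digest() { printf '' | "$1" | cut -d' ' -f1; }

empty_digest md5sum      # d41d8cd98f00b204e9800998ecf8427e
empty_digest sha1sum     # da39a3ee5e6b4b0d3255bfef95601890afd80709
empty_digest sha256sum   # e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855

# The digest length is fixed by the algorithm, not by the input:
# an MD5 digest is always 32 hex digits, however long the input is.
printf 'a much longer input than nothing at all' | md5sum | cut -d' ' -f1 | tr -d '\n' | wc -c
```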

answered Oct 22 '22 03:10 by mykhal


No need for surprise. The first two commands feed truly empty input to md5sum. The echo command produces a newline (echo -n '' should produce empty output; I don't have a Linux machine here to check). The Perl command produces a single zero byte (not to be confused with C, where a zero byte marks the end of a string). The last command looks for a file whose name is the empty string.
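The difference between a file name argument and data on standard input can be sketched as follows. This assumes GNU md5sum; hash_string is a hypothetical helper, and it uses printf rather than echo -n because the handling of -n varies between echo implementations.

```shell
# md5sum treats its arguments as file names, never as data to hash,
# so an empty-string argument fails unless such a file could exist.
if md5sum '' >/dev/null 2>&1; then
    echo "md5sum found a file with an empty name"
else
    echo "md5sum failed: there is no file named ''"
fi

# hash_string is a hypothetical helper: to hash a string, send it on
# standard input. printf '%s' writes the string with no trailing newline.
hash_string() { printf '%s' "$1" | md5sum | cut -d' ' -f1; }

hash_string ''    # d41d8cd98f00b204e9800998ecf8427e
```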

answered Oct 22 '22 03:10 by Gene