I have a backup system that creates directories named after Unix timestamps and then makes incremental backups using hard links (--link-dest in rsync), so the first backup is typically very large and later backups are a fraction of its size.
This is the du output for my current backups:
root@athos:/media/awesomeness_drive# du -sh lantea_home/*
31G   lantea_home/1384197192
17M   lantea_home/1384205953
17M   lantea_home/1384205979
17M   lantea_home/1384206056
17M   lantea_home/1384206195
17M   lantea_home/1384207349
3.1G  lantea_home/1384207678
14M   lantea_home/1384208111
14M   lantea_home/1384208128
16M   lantea_home/1384232401
15G   lantea_home/1384275601
43M   lantea_home/1384318801
Everything seems correct. However, take for example the last directory, lantea_home/1384318801:
root@athos:/media/awesomeness_drive# du -sh lantea_home/1384318801/
28G   lantea_home/1384318801/
I consistently get this behavior. Why does the second du command report the directory as 28G?
Note - the output remains the same with the -P and -L flags.
From man du: Files having multiple hard links are counted (and displayed) a single time per du execution. Directories having multiple hard links (typically Time Machine backups) are counted a single time per du execution.
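The maximum number of hard links to a single file is limited by the size of the reference counter. On Unix-like systems the counter is usually machine-word-sized, giving 4,294,967,295 links (on 32-bit machines) or 18,446,744,073,709,551,615 links (on 64-bit machines), though many filesystems impose a much lower limit through their on-disk format (ext4, for example, allows 65,000 links per file).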
Creating a hard link for a target file will increment the link count for that file's inode. For these reasons, hard links are also known as physical links. Notice that both the original file and the hard link are 13 bytes in size, have the same inode number, have the same permissions, and have a link count of 2.
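The properties described above can be checked with a small experiment. This is only a sketch: the scratch directory and the file names original.txt and hardlink.txt are made up for illustration, and the fields are parsed from ls output rather than from any particular stat implementation.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)
printf 'hello, world\n' > "$workdir/original.txt"    # 13 bytes of content
ln "$workdir/original.txt" "$workdir/hardlink.txt"   # second name for the same inode

# The inode number is the first field of `ls -i`.
inode_a=$(ls -i "$workdir/original.txt" | awk '{print $1}')
inode_b=$(ls -i "$workdir/hardlink.txt" | awk '{print $1}')
# The link count is the second field of `ls -l`.
links=$(ls -l "$workdir/original.txt" | awk '{print $2}')
# Size in bytes, read through the second name.
size_b=$(wc -c < "$workdir/hardlink.txt" | tr -d ' ')

echo "inodes: $inode_a $inode_b  links: $links  size: $size_b"
ls -li "$workdir"
rm -rf "$workdir"
```

Both names should report the same inode number, and the listing should show the same permissions and size for each.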
Displays file sizes in 1024-byte (1 KB) units. Reports files that cannot be opened and directories that cannot be read; this is the default. Does not display file size totals for subdirectories. Displays the total amount of space used by all path names examined.
Hardlinks are real references to the same file (represented by its inode). There is no difference between the "original" file and a hard link pointing to it: both names have the same status, and both are simply references to the same file. Removing one of them leaves the other intact. Only removing the last hardlink actually removes the file and frees the disk space.
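A quick way to convince yourself of this is to delete one of the two names and look at what is left. The sketch below uses invented names (first_name, second_name) in a temporary directory; nothing here comes from the backup setup in the question.

```shell
#!/bin/sh
# Hypothetical scratch directory and file names, for illustration only.
workdir=$(mktemp -d)
printf 'backup payload\n' > "$workdir/first_name"
ln "$workdir/first_name" "$workdir/second_name"

# Link count (second field of `ls -l`) while both names exist.
links_before=$(ls -l "$workdir/second_name" | awk '{print $2}')

# Remove the name the file was created under.
rm "$workdir/first_name"

# The data is still reachable through the remaining name.
links_after=$(ls -l "$workdir/second_name" | awk '{print $2}')
content=$(cat "$workdir/second_name")

echo "links before: $links_before, after: $links_after, content: $content"
rm -rf "$workdir"   # removing the last name is what frees the space
```

This is also why deleting an old timestamped backup directory does not damage newer backups that share files with it.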
So if you ask du what it sees in one directory only, it does not care that there are hardlinks elsewhere pointing to the same contents. It simply counts all the files' sizes and sums them up. Only hardlinks within the considered directory are not counted more than once. du is that clever (not all programs necessarily need to be).
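So in effect, directory A might have a du size of 28G, directory B might have a size of 29G, but together they still only occupy 30G, and if you ask du for the size of A and B in one run, you will get that number. That is why the single du -sh lantea_home/* run shows 43M for the last backup (only the data not already counted under an earlier directory), while running du on that directory alone shows the full 28G.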
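The same effect can be reproduced on a small scale. The following is a sketch under assumptions: directories A and B and the file data.bin are hypothetical stand-ins for two timestamped backups, and it relies on a du that supports -c (GNU and BSD du both do). Exact sizes depend on the filesystem, so none are shown.

```shell
#!/bin/sh
# Hypothetical stand-ins for two timestamped backup directories.
workdir=$(mktemp -d)
mkdir "$workdir/A" "$workdir/B"

# 1 MiB of random data in A, hard-linked (not copied) into B,
# roughly what rsync --link-dest does for an unchanged file.
head -c 1048576 /dev/urandom > "$workdir/A/data.bin"
ln "$workdir/A/data.bin" "$workdir/B/data.bin"
sync   # make sure block usage is settled before measuring

# Each directory measured in its own du run: both include the file.
alone_a=$(du -sk "$workdir/A" | awk '{print $1}')
alone_b=$(du -sk "$workdir/B" | awk '{print $1}')

# Both directories in one du run: the shared file is charged to A only.
together_b=$(du -sk "$workdir/A" "$workdir/B" | awk 'NR==2 {print $1}')
# Grand total for the same run (last line of the -c output).
total=$(du -sck "$workdir/A" "$workdir/B" | awk 'END {print $1}')

echo "A alone: ${alone_a}K  B alone: ${alone_b}K"
echo "B after A in the same run: ${together_b}K  combined total: ${total}K"
rm -rf "$workdir"
```

B should look large when measured alone and tiny when measured after A in the same run, and the combined total should be well below the sum of the two standalone figures.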