file size changes when using cp

I ran into the following problem. When I use the cp command in Linux to copy file A to file B, the sizes of the two files differ, e.g. A is 6.4M while B is 7.0M. This happens even when I copy within the same directory (to rule out different block sizes on different drives).

What is going on? How can I avoid it? Does the copy change the file's contents?

In addition, there is some other strange behavior. If I copy file A to file B and check the size of B immediately, it shows 0 or e.g. 6.2M; after a while the size of B settles at 7.0M and stays constant. Is it possible that I am seeing intermediate results of the copy? If so, why is the copy so slow?

user938720 asked Feb 16 '26 16:02


2 Answers

I'm assuming you are using du and cp from GNU coreutils.

When cp copies a file, it tries to preserve its "sparseness" using heuristics.

From the cp man page: "By default, sparse SOURCE files are detected by a crude heuristic and the corresponding DEST file is made sparse as well."

So if the heuristic fails, cp will create a plain file, without holes. In that case, the disk usage for the copy will be larger than the disk usage for the source – but the apparent file size should be identical, and the contents should be identical (try cmp).

Use stat to see both the apparent size and the disk usage for files (plus lots more info).

$ dd if=/dev/zero of=./sparse bs=1 count=1 seek=10240000
1+0 records in
1+0 records out
1 byte (1 B) copied, 1.4101e-05 s, 70.9 kB/s
$ cp sparse copy1
$ cp --sparse=never sparse copy2
$ ll
-rw-r--r-- 1 me users 10240001 Apr 28 17:59 copy1
-rw-r--r-- 1 me users 10240001 Apr 28 18:00 copy2
-rw-r--r-- 1 me users 10240001 Apr 28 17:59 sparse
$ du sparse copy*
4   sparse
4   copy1
10004   copy2
$ stat sparse copy*
  File: `sparse'
  Size: 10240001    Blocks: 8          IO Block: 4096   regular file
...
  File: `copy1'
  Size: 10240001    Blocks: 8          IO Block: 4096   regular file
...
  File: `copy2'
  Size: 10240001    Blocks: 20008      IO Block: 4096   regular file
$ cmp sparse copy1 && echo identical
identical
$ cmp sparse copy2 && echo identical
identical
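
If the heuristic misses and the copy ends up fully allocated, GNU cp can be asked to recreate the holes with --sparse=always. The following is only a minimal sketch: it assumes GNU coreutils and a filesystem that supports sparse files, and the file names are made up for illustration.

```shell
# Sketch only: assumes GNU coreutils and a filesystem that supports sparse files.
# The file names (sparse, full, resparse) are made up for this example.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A file that is one large hole followed by a single byte.
dd if=/dev/zero of=sparse bs=1 count=1 seek=10240000 2>/dev/null

# Copy without holes, then copy again asking cp to recreate them.
cp --sparse=never sparse full
cp --sparse=always full resparse

# %s is the apparent size in bytes, %b the number of allocated blocks.
stat -c '%n: size=%s blocks=%b' sparse full resparse

# The data is the same regardless of how it is laid out on disk.
cmp sparse full && cmp full resparse && echo identical
```

All three files should report the same apparent size; only the block counts are expected to differ. On a filesystem without hole support, or with transparent compression, the block counts may look alike.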

Mat answered Feb 19 '26 04:02


There were a number of bugs with FIEMAP:

http://lwn.net/Articles/429349/

http://lkml.indiana.edu/hypermail/linux/kernel/0906.1/00436.html

http://www.spinics.net/lists/linux-ext4/msg24337.html

So I suspect a buggy cp from coreutils that tries to use FIEMAP and a buggy filesystem in the kernel that doesn't handle FIEMAP correctly. Upgrade your kernel and coreutils package.
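
To see which versions are currently in use before upgrading, something like the following should do. It is a small sketch that assumes GNU coreutils (other cp implementations may not accept --version).

```shell
# Sketch only: assumes GNU coreutils; other cp implementations may lack --version.
uname -r                    # release of the running kernel
cp --version | head -n 1    # first line names the coreutils version
stat -f -c %T .             # filesystem type of the current directory
```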

Z.T. answered Feb 19 '26 05:02