Most efficient way to copy a file in Linux

I am working on an OS-independent file manager, and I am looking for the most efficient way to copy a file on Linux. Windows has a built-in function, CopyFileEx(), but as far as I can tell, there is no such standard function on Linux, so I guess I will have to implement my own. The obvious way is fopen/fread/fwrite, but is there a better (faster) way of doing it? I must also be able to pause every once in a while so that I can update the "copied so far" count for the file progress menu.

404

asked Sep 18 '11 18:09

Radu

2 Answers

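Unfortunately, on kernels before 2.6.33 you cannot use sendfile() here, because it requires the destination to be a socket. (The name sendfile() comes from send() + "file".) Since Linux 2.6.33 the destination may be any file, but sendfile() is not part of POSIX.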

For zero-copy, you can use splice() as suggested by @Dave. (Except it will not be zero-copy; it will be "one copy" from the source file's page cache to the destination file's page cache.)

However... (a) splice() is Linux-specific; and (b) you can almost certainly do just as well using portable interfaces, provided you use them correctly.

In short, use open() + read() + write() with a small temporary buffer. I suggest 8K. So your code would look something like this:

    int in_fd = open("source", O_RDONLY);
    assert(in_fd >= 0);
    /* O_CREAT | O_TRUNC so the copy works whether or not "dest" already exists */
    int out_fd = open("dest", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    assert(out_fd >= 0);
    char buf[8192];

    while (1) {
        ssize_t read_result = read(in_fd, &buf[0], sizeof(buf));
        if (!read_result) break;    /* 0 means end of file */
        assert(read_result > 0);
        ssize_t write_result = write(out_fd, &buf[0], read_result);
        assert(write_result == read_result);
    }

With this loop, you will be copying 8K from the in_fd page cache into the CPU L1 cache, then writing it from the L1 cache into the out_fd page cache. Then you will overwrite that part of the L1 cache with the next 8K chunk from the file, and so on. The net result is that the data in buf will never actually be stored in main memory at all (except maybe once at the end); from the system RAM's point of view, this is just as good as using "zero-copy" splice(). Plus it is perfectly portable to any POSIX system.

Note that the small buffer is key here. Typical modern CPUs have 32K or so for the L1 data cache, so if you make the buffer too big, this approach will be slower. Possibly much, much slower. So keep the buffer in the "few kilobytes" range.
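The loop above uses assert() for brevity and does not report progress, which the question asks for. The following is a minimal sketch of the same read()/write() approach that retries on EINTR, handles short writes, and calls back after every chunk. The names copy_file, progress_fn, and print_progress are hypothetical, chosen only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical callback type: receives the running byte count after each
   chunk. Returning non-zero cancels the copy. */
typedef int (*progress_fn)(off_t copied_so_far, void *user_data);

/* Example callback (an assumption for illustration): prints the count. */
int print_progress(off_t copied_so_far, void *user_data)
{
    (void)user_data;
    fprintf(stderr, "\r%lld bytes copied", (long long)copied_so_far);
    return 0;
}

/* Copies src_path to dst_path with an 8K buffer.
   Returns 0 on success, -1 on error or cancellation. */
int copy_file(const char *src_path, const char *dst_path,
              progress_fn progress, void *user_data)
{
    char buf[8192];
    off_t total = 0;
    int rc = -1;

    assert(src_path != NULL && dst_path != NULL);

    int in_fd = open(src_path, O_RDONLY);
    if (in_fd < 0)
        return -1;
    int out_fd = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out_fd < 0) {
        close(in_fd);
        return -1;
    }

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0) {               /* end of file */
            rc = 0;
            break;
        }
        if (n < 0) {
            if (errno == EINTR)     /* interrupted by a signal: retry */
                continue;
            break;
        }

        /* write() may accept fewer bytes than requested, so loop. */
        ssize_t off = 0;
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));
            if (w < 0) {
                if (errno == EINTR)
                    continue;
                goto done;
            }
            off += w;
        }

        total += n;
        if (progress != NULL && progress(total, user_data) != 0)
            goto done;              /* caller asked to cancel */
    }

done:
    close(in_fd);
    if (close(out_fd) != 0)         /* close can report delayed write errors */
        rc = -1;
    return rc;
}
```

A call such as copy_file("source", "dest", print_progress, NULL) would update the count every 8K; a real file manager would probably throttle how often it redraws.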

Of course, unless your disk subsystem is very, very fast, memory bandwidth is probably not your limiting factor. So I would recommend posix_fadvise to let the kernel know what you are up to:

posix_fadvise(in_fd, 0, 0, POSIX_FADV_SEQUENTIAL);

This will give a hint to the Linux kernel that its read-ahead machinery should be very aggressive.
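As a small sketch of where that call fits, the helper below opens the source and issues the hint immediately. The name open_for_sequential_read is hypothetical. Note that posix_fadvise() returns an error number directly rather than setting errno, and since the advice is only a hint, the sketch treats a failure as non-fatal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: opens path read-only and tells the kernel that the
   file will be read sequentially. Returns the descriptor, or -1 if open()
   fails. */
int open_for_sequential_read(const char *path)
{
    assert(path != NULL);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Offset 0 with length 0 means "the whole file". */
    int err = posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    if (err != 0)
        fprintf(stderr, "posix_fadvise: %s\n", strerror(err));

    return fd;  /* still usable even if the hint was rejected */
}
```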

I would also suggest using posix_fallocate to preallocate the storage for the destination file. This will tell you ahead of time whether you will run out of disk. And for a modern kernel with a modern file system (like XFS), it will help to reduce fragmentation in the destination file.
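A minimal sketch of that preallocation step, assuming both descriptors are already open: it takes the source size from fstat() and reserves the same amount for the destination. The name preallocate_like is hypothetical. Like posix_fadvise(), posix_fallocate() returns an error number directly; ENOSPC here means the copy would not fit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: reserves space in out_fd for a copy of in_fd.
   Returns 0 on success or an error number (for example ENOSPC). */
int preallocate_like(int in_fd, int out_fd)
{
    struct stat st;

    if (fstat(in_fd, &st) != 0)
        return errno;

    /* posix_fallocate() rejects a zero length with EINVAL, and an empty
       source needs no space anyway. */
    if (st.st_size == 0)
        return 0;

    /* Also extends out_fd to st.st_size bytes if it is currently shorter. */
    return posix_fallocate(out_fd, 0, st.st_size);
}
```

Calling this once before the copy loop lets the file manager report "not enough space" before any data has been written.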

The approach I would recommend least is mmap. It is usually the slowest of all, thanks to TLB thrashing. (Very recent kernels with "transparent hugepages" might mitigate this; I have not tried recently. But it certainly used to be very bad. So I would only bother testing mmap if you have lots of time to benchmark and a very recent kernel.)

[Update]

There is some question in the comments about whether splice from one file to another is zero-copy. The Linux kernel developers call this "page stealing". Both the man page for splice and the comments in the kernel source say that the SPLICE_F_MOVE flag should provide this functionality.

Unfortunately, the support for SPLICE_F_MOVE was yanked in 2.6.21 (back in 2007) and never replaced. (The comments in the kernel sources never got updated.) If you search the kernel sources, you will find SPLICE_F_MOVE is not actually referenced anywhere. The last message I can find (from 2008) says it is "waiting for a replacement".

The bottom line is that splice from one file to another calls memcpy to move the data; it is not zero-copy. This is not much better than you can do in userspace using read/write with small buffers, so you might as well stick to the standard, portable interfaces.

If "page stealing" is ever added back into the Linux kernel, then the benefits of splice would be much greater. (And even today, when the destination is a socket, you get true zero-copy, making splice more attractive.) But for the purpose of this question, splice does not buy you very much.

137

answered Oct 11 '22 19:10

Nemo

If you know your users will be running Linux 2.6.17 or later, splice() is the way to do zero-copy on Linux:

    // Using some default parameters for clarity below. Don't do this in production.
    #define splice(a, b, c) splice(a, 0, b, 0, c, 0)
    int p[2];
    pipe(p);
    int out = open(OUTFILE, O_WRONLY);
    int in = open(INFILE, O_RDONLY);
    while (splice(p[0], out, splice(in, p[1], 4096)) > 0);
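For comparison, here is a sketch of the same pipe-based idea with error checking spelled out. The function name splice_copy and the 64K chunk size are assumptions for illustration (64K matches the usual default pipe capacity). It assumes out_fd was not opened with O_APPEND, which splice() rejects.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: copies everything from in_fd to out_fd through a
   pipe using splice(). Returns the number of bytes copied, or -1 on error. */
long long splice_copy(int in_fd, int out_fd)
{
    int p[2];
    long long total = 0;

    if (pipe(p) != 0)
        return -1;

    for (;;) {
        /* Step 1: move up to 64K from the source file into the pipe. */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, 65536, 0);
        if (n == 0)
            break;                  /* end of file */
        if (n < 0) {
            total = -1;
            break;
        }

        /* Step 2: drain the pipe into the destination; this may take
           more than one call. */
        ssize_t left = n;
        while (left > 0) {
            ssize_t w = splice(p[0], NULL, out_fd, NULL, (size_t)left, 0);
            if (w <= 0) {
                total = -1;
                goto done;
            }
            left -= w;
        }
        total += n;                 /* a progress update could go here */
    }

done:
    close(p[0]);
    close(p[1]);
    return total;
}
```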

answered Oct 11 '22 21:10

Dave
