
.net File.Copy very slow when copying many small files (not over network)

I'm making a simple folder sync backup tool for myself and hit a roadblock with File.Copy. In tests copying a folder of ~44,000 small files (Windows mail folders) to another drive in my system, File.Copy was over 3x slower than running xcopy from a command line on the same files and folders: my C# version takes over 16 minutes, whereas xcopy takes only 5. I've searched for help on this topic, but all I find is people complaining about slow copying of large files over a network. This is neither a large-file problem nor a network-copying problem.

I found an interesting article about a better File.Copy replacement, but the code as posted has some errors that cause problems with the stack, and I am nowhere near knowledgeable enough to fix them.

Are there any common or easy ways to replace File.Copy with something more speedy?

Guavaman asked Jul 06 '12 22:07



2 Answers

There are two algorithms for faster file copy:

If the source and destination are on different disks:

  • One thread reading files continuously and storing in a buffer.
  • Another thread writing files continuously from that buffer.
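A minimal sketch of the two-thread idea above, assuming a modern .NET runtime (C# 8 or later, for `using var` and `Path.GetRelativePath`). The class name `PipelinedCopy`, the method `CopyTree`, and the `maxQueuedFiles` parameter are illustrative names, not part of any library. It buffers whole files in memory, so it only suits small files like the ones in the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class PipelinedCopy
{
    // Hypothetical helper: copies every file under sourceDir to destDir,
    // reading on one task and writing on another. Returns the file count.
    public static int CopyTree(string sourceDir, string destDir, int maxQueuedFiles = 64)
    {
        // Bounded queue: the reader blocks once it is maxQueuedFiles ahead of the writer.
        using var queue = new BlockingCollection<(string RelativePath, byte[] Data)>(maxQueuedFiles);
        using var cts = new CancellationTokenSource();

        Task reader = Task.Run(() =>
        {
            try
            {
                foreach (string file in Directory.EnumerateFiles(sourceDir, "*", SearchOption.AllDirectories))
                {
                    byte[] data = File.ReadAllBytes(file);
                    // The token lets a failed writer unblock a reader stuck on a full queue.
                    queue.Add((Path.GetRelativePath(sourceDir, file), data), cts.Token);
                }
            }
            finally
            {
                // Always signal completion so the writer cannot wait forever.
                queue.CompleteAdding();
            }
        });

        int copied = 0;
        Task writer = Task.Run(() =>
        {
            try
            {
                foreach (var (relative, data) in queue.GetConsumingEnumerable())
                {
                    string target = Path.Combine(destDir, relative);
                    Directory.CreateDirectory(Path.GetDirectoryName(target) ?? destDir);
                    File.WriteAllBytes(target, data);
                    copied++;
                }
            }
            catch
            {
                cts.Cancel();
                throw;
            }
        });

        Task.WaitAll(reader, writer);
        return copied;
    }
}
```

Whether this beats File.Copy depends on the drives involved, so it is worth timing against the real folder before adopting it.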

If the source and destination are on the same disk:

  • Read a fixed chunk of bytes, say 8K at a time, no matter how many files that is.
  • Write that fixed chunk to destination, either in one file or in multiple files.

This way you should see a significant performance improvement.
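One possible reading of the same-disk approach, as a sketch only: read files until a byte budget is reached, then write that whole batch out, so the disk alternates between a run of reads and a run of writes. This simplifies the description above by batching whole files rather than splitting at an exact chunk boundary. `BatchedCopy`, `CopyTree`, and `batchBytes` are hypothetical names, and the 8 KB default simply mirrors the figure in the answer; a larger budget may well work better.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class BatchedCopy
{
    // Hypothetical helper: reads files until about batchBytes are buffered,
    // then writes the buffered files before reading any more.
    public static int CopyTree(string sourceDir, string destDir, long batchBytes = 8 * 1024)
    {
        var batch = new List<(string RelativePath, byte[] Data)>();
        long buffered = 0;
        int copied = 0;

        foreach (string file in Directory.EnumerateFiles(sourceDir, "*", SearchOption.AllDirectories))
        {
            byte[] data = File.ReadAllBytes(file);
            batch.Add((Path.GetRelativePath(sourceDir, file), data));
            buffered += data.Length;

            if (buffered >= batchBytes)
            {
                copied += Flush(batch, destDir);
                buffered = 0;
            }
        }

        // Write whatever is left in the final, partially filled batch.
        copied += Flush(batch, destDir);
        return copied;
    }

    private static int Flush(List<(string RelativePath, byte[] Data)> batch, string destDir)
    {
        foreach (var (relative, data) in batch)
        {
            string target = Path.Combine(destDir, relative);
            Directory.CreateDirectory(Path.GetDirectoryName(target) ?? destDir);
            File.WriteAllBytes(target, data);
        }

        int written = batch.Count;
        batch.Clear();
        return written;
    }
}
```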

Alternatively, just invoke xcopy from your .NET code; why bother doing it with File.Copy? You can capture xcopy's output using Process.StandardOutput and display it so the user can see what's going on.
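A rough sketch of the xcopy route, assuming Windows with xcopy on the PATH. `XcopyRunner`, `BuildStartInfo`, `Run`, and the `onLine` callback are made-up names for illustration; the switches are standard xcopy options, but which ones you want depends on your backup needs.

```csharp
using System;
using System.Diagnostics;

public static class XcopyRunner
{
    // Builds the start info separately so it can be inspected without running anything.
    // Assumes neither path ends in a backslash, which would escape the closing quote.
    public static ProcessStartInfo BuildStartInfo(string sourceDir, string destDir)
    {
        return new ProcessStartInfo
        {
            FileName = "xcopy",
            // /E: copy subdirectories, including empty ones
            // /I: treat the destination as a directory if it does not exist
            // /Y: overwrite existing files without prompting
            Arguments = $"\"{sourceDir}\" \"{destDir}\" /E /I /Y",
            UseShellExecute = false,          // required to redirect output
            RedirectStandardOutput = true,
            CreateNoWindow = true,
        };
    }

    // Runs xcopy, hands each output line to onLine, and returns the exit code.
    public static int Run(string sourceDir, string destDir, Action<string> onLine)
    {
        using var process = Process.Start(BuildStartInfo(sourceDir, destDir));
        if (process == null)
            throw new InvalidOperationException("xcopy could not be started.");

        // ReadLine returns null at end of stream, which ends the loop.
        while (process.StandardOutput.ReadLine() is string line)
            onLine(line);

        process.WaitForExit();
        return process.ExitCode;
    }
}
```

A call might look like `XcopyRunner.Run(source, dest, Console.WriteLine)`; in a GUI, `onLine` would instead forward each line to the UI thread.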

oazabir answered Oct 22 '22 04:10


One thing to consider is whether your copy has a user interface that updates during the copy. If so, make sure the copy runs on a separate thread; otherwise your UI will freeze during the copy, and the copy itself will be slowed down by blocking calls to update the UI.

I have written a similar program, and in my experience my code ran faster than a Windows Explorer copy (I'm not sure how it compares with xcopy from the command prompt).

Also, if you have a UI, don't update it on every file; instead, update every X megabytes or every Y files (whichever comes first). This keeps the number of updates down to something the UI can actually handle. I used every 0.5 MB or 10 files; those values may not be optimal, but they noticeably increased my copy speed and UI responsiveness.
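The "every X megabytes or every Y files" rule could be sketched as a small counter like the one below. `ProgressThrottle` and `FileCopied` are hypothetical names; the defaults just reuse the 0.5 MB and 10 file figures mentioned above.

```csharp
using System;

public sealed class ProgressThrottle
{
    private readonly long _byteThreshold;
    private readonly int _fileThreshold;
    private long _bytesSinceReport;
    private int _filesSinceReport;

    public ProgressThrottle(long byteThreshold = 512 * 1024, int fileThreshold = 10)
    {
        _byteThreshold = byteThreshold;
        _fileThreshold = fileThreshold;
    }

    // Call once per copied file. Returns true when either threshold has been
    // reached since the last report, which is the moment to refresh the UI.
    public bool FileCopied(long fileBytes)
    {
        _bytesSinceReport += fileBytes;
        _filesSinceReport++;

        if (_bytesSinceReport < _byteThreshold && _filesSinceReport < _fileThreshold)
            return false;

        _bytesSinceReport = 0;
        _filesSinceReport = 0;
        return true;
    }
}
```

In the copy loop on the background thread, this would be used along the lines of `if (throttle.FileCopied(length)) progress.Report(filesDone);`, where `progress` is an `IProgress<int>` created on the UI thread.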

Another way to speed things up is to use the Enumerate functions instead of the Get functions (e.g. EnumerateFiles instead of GetFiles). These functions start returning results as soon as they are available instead of waiting until the whole list has been built. They return an IEnumerable<string>, so you can just call foreach on the result: foreach (string file in System.IO.Directory.EnumerateFiles(path)). For my program this also made a noticeable difference in speed, and it should be even more helpful in cases like yours, where you are dealing with directories containing many files.
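As a sketch of the Enumerate approach, the loop below starts copying as soon as the first path comes back instead of waiting for the full listing. `StreamingCopy` and `CopyAll` are illustrative names, and `Path.GetRelativePath` assumes .NET Core 2.0 or later.

```csharp
using System;
using System.IO;

public static class StreamingCopy
{
    // Hypothetical helper: copies every file under sourceDir to destDir,
    // processing each path as the enumeration yields it.
    public static int CopyAll(string sourceDir, string destDir)
    {
        int copied = 0;

        // EnumerateFiles is lazy; GetFiles would build the whole array first.
        foreach (string file in Directory.EnumerateFiles(sourceDir, "*", SearchOption.AllDirectories))
        {
            string target = Path.Combine(destDir, Path.GetRelativePath(sourceDir, file));
            Directory.CreateDirectory(Path.GetDirectoryName(target) ?? destDir);
            File.Copy(file, target, overwrite: true);
            copied++;
        }

        return copied;
    }
}
```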

mikeagun answered Oct 22 '22 02:10