C# I/O async (copyAsync): how to avoid file fragmentation?

Within a tool that copies big files between disks, I replaced the System.IO.FileInfo.CopyTo method with System.IO.Stream.CopyToAsync. This allows a faster copy and better control during the copy, e.g. I can stop the copy. But it creates even more fragmentation of the copied files, which is especially annoying when I copy files of many hundreds of megabytes.

How can I avoid disk fragmentation during copy?

With the xcopy command, the /j switch copies files without buffering, and TechNet recommends it for very large files. It does indeed seem to avoid file fragmentation (while a simple file copy within the Windows 10 Explorer DOES fragment my file!).

A copy without buffering seems to be the opposite of this async copy. Or is there any way to do an async copy without buffering?
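For context, .NET only exposes write-through, not the fully unbuffered mode that xcopy /j uses. A sketch of the closest managed options, assuming the same destinationFullPath and bufferSize as in my code below; the raw-flag cast is an unofficial trick, and unbuffered I/O additionally requires sector-aligned transfers, which CopyToAsync does not guarantee:

        // FileOptions.WriteThrough maps to FILE_FLAG_WRITE_THROUGH. There is no
        // named FileOptions member for FILE_FLAG_NO_BUFFERING (0x20000000), the
        // flag behind xcopy /j, so code that wants it casts the raw value.
        const FileOptions FileFlagNoBuffering = (FileOptions)0x20000000; // unofficial, sketch only

        FileStream destinationStream = new FileStream(
            destinationFullPath,
            FileMode.Create,
            FileAccess.Write,
            FileShare.None,
            bufferSize,
            FileOptions.Asynchronous | FileOptions.WriteThrough
            /* | FileFlagNoBuffering -- only valid with sector-aligned buffers and sizes */);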

Here is my current code for the async copy. I keep the default buffer size of 81920 bytes, i.e. 10 * 1024 * sizeof(Int64).

I am working with NTFS file systems, hence 4096-byte clusters.

EDIT: I updated the code with SetLength as suggested, added FileOptions.Asynchronous when creating the destinationStream, and fixed setting the attributes AFTER setting the times (otherwise an exception is thrown for ReadOnly files).

        int bufferSize = 81920;
        bool operationCanceled = false; // set when the copy is canceled via the cancellation token
        try
        {
            using (FileStream sourceStream = source.OpenRead())
            {
                // Remove existing file first
                if (File.Exists(destinationFullPath))
                    File.Delete(destinationFullPath);

                using (FileStream destinationStream = File.Create(destinationFullPath, bufferSize, FileOptions.Asynchronous))
                {
                    try
                    {                             
                        destinationStream.SetLength(sourceStream.Length); // avoid file fragmentation!
                        await sourceStream.CopyToAsync(destinationStream, bufferSize, cancellationToken);
                    }
                    catch (OperationCanceledException)
                    {
                        operationCanceled = true;
                    }
                } // properly disposed after the catch
            }
        }
        catch (IOException e)
        {
            actionOnException(e, "error copying " + source.FullName);
        }

        if (operationCanceled)
        {
            // Remove the partially written file
            if (File.Exists(destinationFullPath))
                File.Delete(destinationFullPath);
        }
        else
        {
            // Copy meta data (attributes and time) from source once the copy is finished
            File.SetCreationTimeUtc(destinationFullPath, source.CreationTimeUtc);
            File.SetLastWriteTimeUtc(destinationFullPath, source.LastWriteTimeUtc);
            File.SetAttributes(destinationFullPath, source.Attributes); // after set time if ReadOnly!
        }

I also fear that calling File.SetAttributes and the time setters at the end of my code could increase file fragmentation.

Is there a proper way to create a 1:1 asynchronous file copy without any file fragmentation, i.e. asking the HDD to allocate only contiguous sectors for the file stream?

Other topics regarding file fragmentation, like How can I limit file fragmentation while working with .NET, suggest incrementing the file size in larger chunks, but that does not seem to be a direct answer to my question.

asked Oct 24 '25 by EricBDev

1 Answer

"but the SetLength method does the job"

It does not do the job. It only updates the file size in the directory entry; it does not allocate any clusters. The easiest way to see this for yourself is to do it on a very large file, say 100 gigabytes: the call completes instantly. The only way it can be instant is if the file system does not also allocate and write the clusters. Reading from the file is nevertheless possible; even though the file contains no actual data, the file system simply returns binary zeros.
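A sketch of that experiment (the path is illustrative; it assumes an NTFS volume with enough free space):

        // Observe the behavior described above: SetLength returns almost
        // instantly even for a huge file, yet reading it back works and yields
        // zeros, because no clusters have actually been written.
        using (var fs = new FileStream(@"C:\temp\huge.bin", FileMode.Create))
        {
            fs.SetLength(100L * 1024 * 1024 * 1024); // 100 GB, completes immediately
            fs.Seek(0, SeekOrigin.Begin);
            Console.WriteLine(fs.ReadByte());        // prints 0: the file system fabricates the data
        }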

This also misleads any utility that reports fragmentation: since the file has no clusters, there can be no fragmentation. So it only looks like you solved your problem.

The only thing you can do to force the clusters to be allocated is to actually write to the file. It is in fact possible to allocate 100 gigabytes worth of clusters with a single write: use Seek() to position to Length - 1, then write a single byte with Write(). This will take a while on a very large file; it is in effect no longer async.
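Applied to the question's streams, the trick looks roughly like this (a sketch; the single WriteByte blocks until all clusters are committed, so this step is synchronous and slow on huge files):

        long length = sourceStream.Length;
        destinationStream.SetLength(length);             // sets the size, but allocates nothing yet
        if (length > 0)
        {
            destinationStream.Seek(length - 1, SeekOrigin.Begin);
            destinationStream.WriteByte(0);              // forces allocation of every cluster up to EOF
            destinationStream.Seek(0, SeekOrigin.Begin); // rewind before the actual copy
        }
        await sourceStream.CopyToAsync(destinationStream, bufferSize, cancellationToken);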

The odds that it will reduce fragmentation are not great. It merely reduces somewhat the risk that your writes will be interleaved with writes from other processes. Only somewhat, because the actual writing is done lazily by the file system cache. The core issue is that the volume was already fragmented before you began writing; it will never be less fragmented after you're done.

The best thing to do is simply not fret about it. Defragging has been automatic on Windows since Vista. Maybe you want to tweak its scheduling, or ask more about it at superuser.com.

answered Oct 26 '25 by Hans Passant
