 

Why is the performance of C++ fseek/fread many times greater than C# FileStream's Seek/Read?

I'm doing a pretty simple test:

  1. Take a big file filled with random binary data, ~6 GB in size
  2. The algorithm runs a loop of "SeekCount" repetitions
  3. Each repetition does the following:
    • Calculates a random offset within the range of the file size
    • Seeks to that offset
    • Reads a small block of data

C#:

    public static void Test()
    {
        string fileName = @"c:\Test\big_data.dat";
        int NumberOfSeeks = 1000;
        int MaxNumberOfBytes = 1;
        long fileLength = new FileInfo(fileName).Length;
        FileStream stream = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.Read, 65536, FileOptions.RandomAccess);
        Console.WriteLine("Processing file \"{0}\"", fileName);
        Random random = new Random();
        DateTime start = DateTime.Now;
        byte[] byteArray = new byte[MaxNumberOfBytes];

        for (int index = 0; index < NumberOfSeeks; ++index)
        {
            long offset = (long)(random.NextDouble() * (fileLength - MaxNumberOfBytes - 2));
            stream.Seek(offset, SeekOrigin.Begin);
            stream.Read(byteArray, 0, MaxNumberOfBytes);
        }

        Console.WriteLine(
            "Total processing time time {0} ms, speed {1} seeks/sec\r\n",
            DateTime.Now.Subtract(start).TotalMilliseconds, NumberOfSeeks / (DateTime.Now.Subtract(start).TotalMilliseconds / 1000.0));

        stream.Close();
    }

Then doing the same test in C++:

    #include <cstdio>
    #include <cstdlib>
    #include <ctime>
    #include <windows.h>

    const int kTimes = 1000; // number of seeks; assumed to match NumberOfSeeks in the C# test

    void test()
    {
        FILE* file = fopen("c:\\Test\\big_data.dat", "rb");

        char buf = 0;
        __int64 fileSize = 6216672671; // hard-coded; ftell returns a 32-bit long and cannot report ~6 GB
        __int64 pos;

        DWORD dwStart = GetTickCount();
        for (int i = 0; i < kTimes; ++i)
        {
            pos = (rand() % 100) * 0.01 * fileSize;
            _fseeki64(file, pos, SEEK_SET);
            fread((void*)&buf, 1, 1, file);
        }
        DWORD dwEnd = GetTickCount() - dwStart;
        printf(" - Raw Reading: %d times reading took %d ticks, e.g. %d sec. Speed: %d items/sec\n",
               kTimes, dwEnd, dwEnd / CLOCKS_PER_SEC, kTimes / (dwEnd / CLOCKS_PER_SEC));
        fclose(file);
    }

Execution times:

  1. C#: 100-200 reads/sec
  2. C++: 250,000 reads/sec

Question: why is C++ thousands of times faster than C# on such a trivial operation as a file read?

Additional information:

  1. I played with the stream buffers and set them to the same size (4 KB); see the setvbuf sketch after this list for the C++ side
  2. The disk is defragmented (0% fragmentation)
  3. OS configuration: Windows 7, NTFS, a recent 500 GB HDD (WD, if I recall correctly), 8 GB RAM (though it is almost unused), 4-core CPU (utilization is almost zero)
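For reference, here is how a 4 KB stdio buffer can be set on the C++ side (a minimal sketch using setvbuf, not necessarily the exact code I used):

    // Sketch: fully buffer the FILE* with a 4 KB buffer, mirroring the 4 KB
    // setting mentioned in item 1. setvbuf must be called right after fopen,
    // before the first read on the stream.
    FILE* file = fopen("c:\\Test\\big_data.dat", "rb");
    setvbuf(file, NULL, _IOFBF, 4096);
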
asked Feb 20 '13 by David


1 Answer

There is an error in the C++ version of the test: the offset calculation `(rand() % 100) * 0.01 * fileSize` can produce at most 100 distinct positions, so the loop keeps re-reading the same few offsets (which are quickly served from the OS cache), and that made the C++ results look much better.
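
A quick standalone check (not part of the original test) makes the problem visible: no matter how many iterations run, the flawed formula never produces more than 100 distinct offsets.

    // Standalone sketch: count the distinct offsets the flawed formula produces.
    #include <cstdio>
    #include <cstdlib>
    #include <set>

    int main()
    {
        const __int64 fileSize = 6216672671; // same size as in the question
        std::set<__int64> offsets;
        for (int i = 0; i < 1000000; ++i)
            offsets.insert((__int64)((rand() % 100) * 0.01 * fileSize));
        printf("distinct offsets: %d\n", (int)offsets.size()); // prints 100 at most
        return 0;
    }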

The correct code for calculating the offset was suggested by @MooingDuck:

    rand() / double(RAND_MAX) * fileSize
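
For reference, a sketch of the loop with that fix applied, assuming the same `file`, `fileSize`, `buf`, and `kTimes` as in the question (the `- 1` is only there to keep the offset strictly inside the file):

    // Corrected loop body (sketch, not verbatim from the question).
    for (int i = 0; i < kTimes; ++i)
    {
        // rand() / double(RAND_MAX) is uniform in [0, 1], so the offset can land
        // anywhere in the file instead of on 100 fixed positions.
        __int64 pos = (__int64)(rand() / double(RAND_MAX) * (fileSize - 1));
        _fseeki64(file, pos, SEEK_SET);
        fread(&buf, 1, 1, file);
    }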

With that change, performance becomes comparable for C++ and C#: around 200 reads/sec.

Thanks everyone for contributing.

answered Sep 23 '22 by David