
Why does FileStream.Position increment in multiples of 1024?

Tags: c#, file, file-io

I have a text file that I want to read line by line, recording my position in the file as I go. The program may exit after reading any line, and when it restarts I need to resume reading at the next line.

Here is some sample code:

using (FileStream fileStream = new FileStream("Sample.txt", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    fileStream.Seek(GetLastPositionInFile(), SeekOrigin.Begin);
    using (StreamReader streamReader = new StreamReader(fileStream))
    {
        while (!streamReader.EndOfStream)
        {
            string line = streamReader.ReadLine();
            DoSomethingInteresting(line);
            SaveLastPositionInFile(fileStream.Position);

            if (CheckSomeCondition())
            {
                break;
            }
        }
    }
}

When I run this code, the value of fileStream.Position does not change after every line; it only advances after several lines have been read, and when it does change, it increases in multiples of 1024. I assume there is some buffering going on under the covers, but how can I record the exact position in the file?

Stefan Moser asked Sep 23 '10 16:09



1 Answer

It's not FileStream that's responsible - it's StreamReader. It's reading 1K at a time for efficiency.
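To see this buffering in isolation, here is a minimal sketch that records fileStream.Position after every ReadLine call. The class and method names (BufferingDemo, PositionsAfterEachLine) are hypothetical and only for illustration. The exact step size depends on the runtime and on the buffer size the StreamReader was created with, so no specific numbers are claimed here; the point is only that the first recorded position is typically well past the end of the first line.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class BufferingDemo
{
    // Hypothetical helper: returns the position of the underlying FileStream
    // as observed after each StreamReader.ReadLine call.
    public static long[] PositionsAfterEachLine(string path)
    {
        var positions = new List<long>();
        using (var fileStream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (var reader = new StreamReader(fileStream, Encoding.UTF8))
        {
            while (reader.ReadLine() != null)
            {
                // This is how far the reader has pulled data from the stream,
                // not where the line that was just returned ends.
                positions.Add(fileStream.Position);
            }
        }
        return positions.ToArray();
    }
}
```

Running this against a file of many short lines should show the same value repeated for a run of lines, followed by a jump each time the reader refills its buffer.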

Keeping track of the effective position of the stream as far as the StreamReader is concerned is tricky... particularly as ReadLine will discard the line ending, so you can't accurately reconstruct the original data (it could have ended with "\n" or "\r\n"). It would be nice if StreamReader exposed something to make this easier (I'm pretty sure it could do so without too much difficulty) but I don't think there's anything in the current API to help you :(
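One possible workaround, which is not something the StreamReader API provides, is to skip StreamReader entirely and read the line bytes straight from the stream, so that the stream's Position always sits exactly after the last line ending consumed. The sketch below assumes the file is UTF-8 (or ASCII) with "\n" or "\r\n" line endings and no byte order mark to skip; the class and method names (PositionTrackingReader, ReadLineTrackingPosition) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class PositionTrackingReader
{
    // Hypothetical helper: reads one line starting at the stream's current position.
    // Returns null at end of file. Afterwards, stream.Position points at the
    // first byte following the line ending ("\n" or "\r\n").
    // Assumes UTF-8 or ASCII text, where the byte 0x0A only ever means '\n'.
    public static string ReadLineTrackingPosition(Stream stream)
    {
        var bytes = new List<byte>();
        bool sawAnyByte = false;
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            sawAnyByte = true;
            if (b == '\n')
            {
                break; // the line ending has been consumed
            }
            bytes.Add((byte)b);
        }

        if (!sawAnyByte)
        {
            return null; // nothing left to read
        }

        // Drop the '\r' of a "\r\n" pair so the result matches what ReadLine would return.
        if (bytes.Count > 0 && bytes[bytes.Count - 1] == (byte)'\r')
        {
            bytes.RemoveAt(bytes.Count - 1);
        }
        return Encoding.UTF8.GetString(bytes.ToArray());
    }
}
```

In the loop from the question, this would mean calling ReadLineTrackingPosition(fileStream) in place of streamReader.ReadLine() and then saving fileStream.Position as before. FileStream keeps its own internal buffer, so ReadByte does not hit the disk on every call, and Position still reports the logical position.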

By the way, I would suggest that instead of using EndOfStream, you keep reading until ReadLine returns null. It just feels simpler to me:

string line;
while ((line = reader.ReadLine()) != null)
{
    // Process the line
}
Jon Skeet answered Sep 18 '22 08:09