
Read text file at specific line [duplicate]

Tags:

c#

I have a text file which has more than 3000 lines. I am finding the number of lines using

string[] lines = File.ReadAllLines(myPath);
var lineCount = lines.Length; 

Then I am generating a random number

Random rand = new Random();
var lineToRead = rand.Next(0, lineCount); // upper bound is exclusive: 0..lineCount - 1

Now I need to read the specific line that is generated by random number. I can do this using

string requiredLine = lines[lineToRead];

Because my file is big I don't think creating such a big array is efficient. Is there a more efficient or easier way to do this?

asdfkjasdfjk asked Apr 03 '13 11:04



2 Answers

Here is a solution that iterates over the file twice (once to count the lines, once to select one). The benefit is that you don't need to hold an array of 3000 strings in memory. The cost is that reading the file twice may be slower. Why only "may"? Because File.ReadAllLines builds a list of strings internally, and that list is resized many times while it fills with 3000 items. (Its initial capacity is 4. Whenever the inner array is full, a new array of double the size is allocated and all the string references are copied into it.)
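
The resizing described above can be observed directly. This is a minimal sketch, assuming a plain List<string> grows the same way as the list used inside File.ReadAllLines; the variable names are made up for illustration, and the snippet is written with top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;

// Record every distinct capacity the list passes through while 3000 items are added.
var capacities = new List<int>();
var items = new List<string>();
for (int i = 0; i < 3000; i++)
{
    items.Add("line " + i);
    if (capacities.Count == 0 || capacities[capacities.Count - 1] != items.Capacity)
    {
        capacities.Add(items.Capacity);
    }
}

// Each entry corresponds to one allocation of a new inner array.
Console.WriteLine(string.Join(", ", capacities));
```

On the .NET runtimes I know of, the printed capacities start at 4 and double up to 4096, so filling 3000 items costs roughly eleven allocations and copies.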

So, the solution uses the File.ReadLines method, which returns an IEnumerable<string> that yields the lines lazily, and skips the lines you don't need:

IEnumerable<string> lines = File.ReadLines(myPath);
var lineToRead = rand.Next(1, lines.Count() + 1); // 1-based; the upper bound is exclusive
var line = lines.Skip(lineToRead - 1).First();

BTW, internally File.ReadLines uses a StreamReader, which reads the file line by line.
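
For anyone who wants to try the two-pass approach in isolation, here is a self-contained sketch. ReadRandomLine is a hypothetical helper name, the three-line temp file exists only for the demo, and the snippet assumes top-level statements (C# 9 or later).

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical helper: picks one line uniformly at random by streaming the file twice.
static string ReadRandomLine(string path, Random rand)
{
    // First pass: count the lines without keeping them in memory.
    int lineCount = File.ReadLines(path).Count();
    if (lineCount == 0)
    {
        throw new InvalidOperationException("The file has no lines.");
    }

    // Random.Next(max) returns a value in 0..max - 1, so every line can be chosen.
    int index = rand.Next(lineCount);

    // Second pass: skip to the chosen line and stop reading there.
    return File.ReadLines(path).Skip(index).First();
}

// Demo on a throwaway file.
string path = Path.GetTempFileName();
File.WriteAllLines(path, new[] { "alpha", "beta", "gamma" });
string picked = ReadRandomLine(path, new Random());
Console.WriteLine(picked);
File.Delete(path);
```

Each call to File.ReadLines opens the file again, which is why the file is read twice here.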

Sergey Berezovskiy answered Nov 13 '22 16:11



What you can do is scan the file once to record the byte offset at which each line starts, and later jump back to any line by setting Stream.Position and reading from there. With this method you don't keep the lines themselves in memory, only one offset per line, and it is reasonably fast. I tested this on a file of 20K lines, 1 MB in size: it took 7 ms to index the file and 0.3 ms to get the line.

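    // Requires: using System; using System.Collections.Generic; using System.IO;

    // Parse the file: record the byte offset at which each line starts
    var indexes = new List<long>();
    using (var fs = File.OpenRead("text.txt"))
    {
        indexes.Add(fs.Position);
        int chr;
        while ((chr = fs.ReadByte()) != -1)
        {
            if (chr == '\n')
            {
                indexes.Add(fs.Position);
            }
        }

        // A trailing newline leaves an offset at end-of-file that starts no line; drop it
        if (indexes.Count > 1 && indexes[indexes.Count - 1] == fs.Length)
        {
            indexes.RemoveAt(indexes.Count - 1);
        }
    }

    int lineCount = indexes.Count;
    // The upper bound of Random.Next is exclusive, so this covers 0..lineCount - 1
    int randLineNum = new Random().Next(0, lineCount);
    string lineContent = "";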


    // Read the random line
    using (var fs = File.OpenRead("text.txt"))
    {
        fs.Position = indexes[randLineNum];
        using (var sr = new StreamReader(fs))
        {
            lineContent = sr.ReadLine();
        }
    }
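
If you need more than one line, the index can be built once and reused. The following is a sketch of that idea rather than a tested library: BuildIndex and ReadLineAt are hypothetical helper names, the four-line temp file is only for the demo, and the snippet assumes top-level statements (C# 9 or later) and a file in UTF-8 or another encoding where the byte 0x0A only ever means a line feed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: returns the byte offset at which each line starts.
static List<long> BuildIndex(string path)
{
    var offsets = new List<long>();
    using (var fs = File.OpenRead(path))
    {
        offsets.Add(0);
        int b;
        while ((b = fs.ReadByte()) != -1)
        {
            if (b == '\n')
            {
                offsets.Add(fs.Position);
            }
        }

        // A trailing newline produces an offset at end-of-file that starts no line.
        if (offsets.Count > 1 && offsets[offsets.Count - 1] == fs.Length)
        {
            offsets.RemoveAt(offsets.Count - 1);
        }
    }
    return offsets;
}

// Hypothetical helper: seeks to the start of the given zero-based line and reads it.
static string ReadLineAt(string path, List<long> offsets, int lineNumber)
{
    using (var fs = File.OpenRead(path))
    {
        fs.Position = offsets[lineNumber];
        using (var sr = new StreamReader(fs))
        {
            return sr.ReadLine() ?? "";
        }
    }
}

// Demo on a throwaway file: index once, then fetch lines by number.
string path = Path.GetTempFileName();
File.WriteAllLines(path, new[] { "zero", "one", "two", "three" });
List<long> offsets = BuildIndex(path);
string third = ReadLineAt(path, offsets, 2);
string first = ReadLineAt(path, offsets, 0);
Console.WriteLine(offsets.Count + " lines; line 2 is \"" + third + "\"");
File.Delete(path);
```

The index costs one long per line, so a 3000-line file needs about 24 KB of offsets regardless of how long the lines are.
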
Du D. answered Nov 13 '22 17:11
