
How to count lines fast?

I tried wc -l from UnxUtils, but it crashed on 1 GB files. Then I tried this C# code:

    long count = 0;
    using (StreamReader r = new StreamReader(f))
    {
        string line;
        while ((line = r.ReadLine()) != null)
        {
            count++;
        }
    }
    return count;

It reads a 500 MB file in 4 seconds.

    var size = 256;
    var bytes = new byte[size];
    var count = 0;
    byte query = Convert.ToByte('\n');
    using (var stream = File.OpenRead(file))
    {
        int many;
        do
        {
            many = stream.Read(bytes, 0, size);
            // Count only the bytes filled by this read; the rest of the buffer may hold stale data.
            count += bytes.Take(many).Count(a => a == query);
        } while (many > 0); // Read returns 0 only at end of stream
    }

This takes 10 seconds.

    var count = 0;
    int query = (int)Convert.ToByte('\n');
    using (var stream = File.OpenRead(file))
    {
        int current;
        do
        {
            current = stream.ReadByte();
            if (current == query)
            {
                count++;
            }
        } while (current != -1);
    }

This takes 7 seconds.

Is there anything faster that I haven't tried yet?

Jader Dias asked May 23 '11 18:05



1 Answer

File.ReadLines was introduced in .NET 4.0:

    var count = File.ReadLines(file).Count();

It runs in 4 seconds, the same time as the first code snippet.
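For comparison with the byte-level attempts in the question, here is a minimal sketch of the same idea with a much larger buffer and a plain loop instead of LINQ. The class name LineCounter, the method CountNewlines, and the 1 MB default buffer size are assumptions made for illustration, not something measured in this thread, so no timing is claimed for it.

```csharp
using System;
using System.IO;

static class LineCounter
{
    // Hypothetical helper: counts '\n' bytes in a file using one large reusable buffer.
    public static long CountNewlines(string path, int bufferSize = 1 << 20)
    {
        const byte newline = (byte)'\n';
        var buffer = new byte[bufferSize];
        long count = 0;

        // SequentialScan hints to the OS that the file is read front to back.
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, bufferSize, FileOptions.SequentialScan))
        {
            int read;
            // Read may return fewer bytes than requested; it returns 0 only at end of stream.
            while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                // Scan only the bytes filled by this read.
                for (int i = 0; i < read; i++)
                {
                    if (buffer[i] == newline)
                    {
                        count++;
                    }
                }
            }
        }
        return count;
    }
}
```

Note that this counts newline characters, not lines: for a file whose last line has no trailing newline, it returns one less than File.ReadLines(file).Count().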

Jader Dias answered Oct 04 '22 03:10