
Find number of lines in csv without reading it [duplicate]

Tags: python, csv

Is there a way to find the number of lines in a CSV file without actually loading the whole file into memory (in Python)?

I'd expect there to be some optimized function for this. All I can think of is reading the file line by line and counting the lines, but that seems to defeat the purpose, since I only need the number of lines, not the actual content.

sashkello asked Dec 12 '22 11:12


2 Answers

You don't need to load the whole file into memory, since a file object is iterable line by line:

with open(path) as fp:
    count = 0
    for _ in fp:
        count += 1

Or, slightly more idiomatic:

with open(path) as fp:
    count = 0  # stays 0 if the file is empty
    for count, _ in enumerate(fp, 1):
        pass

bereal answered Jan 21 '23 08:01


Yes, you need to read through the whole file before you can know how many lines are in it, although it does not all have to be in memory at once. Think of the file as one long string, Aaaaabbbbbbbcccccccc\ndddddd\neeeeee\n: to know how many 'lines' are in the string, you need to find how many \n characters are in it.
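
Counting the \n characters can be sketched as below. This is only an illustration, not a built-in: the name count_newlines and the 1 MiB default chunk size are assumptions of mine. It reads the file in binary mode in fixed-size chunks, so only one chunk is in memory at a time.

```python
def count_newlines(path, chunk_size=1 << 20):
    """Count newline bytes in a file, reading chunk_size bytes at a time.

    Hypothetical helper: the name and the 1 MiB default are illustrative.
    """
    count = 0
    with open(path, "rb") as fp:
        while True:
            chunk = fp.read(chunk_size)
            if not chunk:  # empty bytes object means end of file
                break
            count += chunk.count(b"\n")
    return count
```

Note that a final line without a trailing \n is not counted here, and a quoted CSV field that contains a newline is counted as an extra line, so this gives the number of physical lines rather than CSV records.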

If you want an approximate number, you can read a few lines (~20), see how many characters there are per line on average, and then divide the file's size (available from the file's metadata, e.g. via os.path.getsize) by that average to get an estimate.
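
A rough sketch of that estimate, assuming the first lines are representative of the rest of the file. The name estimate_line_count and its sample_lines parameter are hypothetical. The file is opened in binary mode so that line lengths are measured in bytes, the same unit as the file size.

```python
import os
from itertools import islice


def estimate_line_count(path, sample_lines=20):
    """Estimate the line count from the average length of the first lines.

    Hypothetical helper: assumes the sampled lines are typical of the file.
    """
    size = os.path.getsize(path)  # file size in bytes, no read needed
    with open(path, "rb") as fp:
        sample = list(islice(fp, sample_lines))
    if not sample:  # empty file
        return 0
    avg_len = sum(len(line) for line in sample) / len(sample)
    return round(size / avg_len)
```

The estimate can be well off if line lengths vary a lot, or if the sample is dominated by a header row that is much longer or shorter than the data rows.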

fabrizioM answered Jan 21 '23 08:01