Is there a way to find the number of lines in a CSV file without actually loading the whole file into memory (in Python)?
I'd expect there to be some specially optimized function for it. All I can think of right now is to read it line by line and count the lines, but that seems to defeat the purpose, since I only need the number of lines, not the actual content.
You don't need to load the whole file into memory, since file objects can be iterated over one line at a time:
    with open(path) as fp:
        count = 0
        for _ in fp:
            count += 1
Or, slightly more idiomatic:
    with open(path) as fp:
        count = 0  # stays 0 if the file is empty
        for count, _ in enumerate(fp, 1):
            pass
Yes, you need to read the whole file before you can know how many lines are in it, although you don't have to hold all of it in memory at once. Just think of the file as one long string, Aaaaabbbbbbbcccccccc\ndddddd\neeeeee\n: to know how many 'lines' are in the string, you need to find how many \n characters are in it.
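To illustrate the point about counting \n characters, here is a minimal sketch that reads the file in fixed-size binary chunks, so only one chunk is in memory at a time. The function name `count_lines` and the 1 MiB default chunk size are my own choices for illustration, not something from the answers above.

```python
def count_lines(path, chunk_size=1024 * 1024):
    # Count newline bytes chunk by chunk; only one chunk is held in memory.
    count = 0
    last = b""
    with open(path, "rb") as fp:
        while True:
            chunk = fp.read(chunk_size)
            if not chunk:
                break
            count += chunk.count(b"\n")
            last = chunk[-1:]  # remember the final byte seen so far
    # A last line without a trailing newline still counts as a line.
    if last and last != b"\n":
        count += 1
    return count
```

Keep in mind that this counts physical lines. If the CSV has quoted fields that contain newlines, the number of records will be lower than the number of lines.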
If an approximate number is enough, you can read a few lines (~20), work out the average number of bytes per line, and then divide the file's size (available from the filesystem metadata, e.g. via os.path.getsize) by that average to get an estimate.
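A rough sketch of that estimate might look like the following. The name `estimate_line_count` and the `sample_lines` parameter are hypothetical, and the result is only as good as the assumption that the first lines are representative of the rest of the file.

```python
import os
from itertools import islice


def estimate_line_count(path, sample_lines=20):
    total_bytes = os.path.getsize(path)
    # Read only the first few lines, in binary so lengths are in bytes.
    with open(path, "rb") as fp:
        sample = list(islice(fp, sample_lines))
    if not sample:
        return 0  # empty file
    if len(sample) < sample_lines:
        return len(sample)  # the whole file was read, so this is exact
    avg_bytes_per_line = sum(len(line) for line in sample) / len(sample)
    return round(total_bytes / avg_bytes_per_line)
```

This only reads the first `sample_lines` lines, so it is fast even for very large files, but the estimate can be far off if line lengths vary a lot (for example, a long header or free-text columns).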