 

How to read file N lines at a time?

I need to read a big file by reading at most N lines at a time, until EOF. What is the most effective way of doing it in Python? Something like:

with open(filename, 'r') as infile:
    while not EOF:
        lines = [get next N lines]
        process(lines)
asked Apr 29 '11 by madprogrammer

People also ask

How do I read multiple lines in a text file?

The built-in readline() method returns one line at a time. To read multiple lines, call readline() multiple times.
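For example, a minimal sketch (the file name example.txt is just a placeholder):

with open('example.txt') as f:       # hypothetical file name
    first = f.readline()             # first line, trailing newline included
    second = f.readline()            # second line; readline() returns '' at EOF
print(first, second)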

How do I read 10 lines from a file in Python?

The readlines() method reads all the lines of a file and stores them in a list. You can then use list indexing or slicing to extract a specific range of lines. This is the most straightforward way to read a specific set of lines from a file in Python.
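As a sketch, reading the first 10 lines (again with a placeholder file name):

with open('example.txt') as f:
    all_lines = f.readlines()        # list of every line in the file
first_ten = all_lines[:10]           # slice out lines 1 to 10
print(first_ten)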

How do I read one line at a time from a file?

The readline() method reads just one line at a time; the first call returns the first line of the file, the next call returns the second, and so on. To read all the lines of a file at once, use the readlines() method instead.

How do I read a file line by line?

In Java, you can read a file line by line with java.io.BufferedReader: its readLine() method returns the next line as a String and returns null when the end of the file is reached.



3 Answers

One solution would be a list comprehension and the slice operator:

with open(filename, 'r') as infile:
    lines = [line for line in infile][:N]

After this, lines is a list of the first N lines. However, this loads the complete file into memory. If you don't want that (i.e. if the file could be really large), there is another solution using islice() from the itertools module:

from itertools import islice
with open(filename, 'r') as infile:
    lines_gen = islice(infile, N)

lines_gen is an iterator that yields up to N lines of the file. It has to be consumed while the file is still open, so keep the loop inside the with block:

    for line in lines_gen:
        print(line)

Both solutions give you at most N lines (fewer if the file doesn't have that many).
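To process the whole file N lines at a time, as the question asks, the islice() call can simply be repeated until it comes back empty. A sketch, assuming filename, N and a process() function are defined elsewhere:

from itertools import islice

with open(filename, 'r') as infile:
    while True:
        lines = list(islice(infile, N))   # next chunk of up to N lines
        if not lines:                     # an empty list means EOF
            break
        process(lines)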

answered Oct 12 '22 by Martin Thurau


A file object is an iterator over lines in Python. To iterate over the file N lines at a time, you could use the grouper() function from the Itertools Recipes section of the documentation (also see What is the most "pythonic" way to iterate over a list in chunks?):

try:
    from itertools import izip_longest
except ImportError:  # Python 3
    from itertools import zip_longest as izip_longest

def grouper(iterable, n, fillvalue=None):
    # Collect data into fixed-length chunks or blocks (itertools recipe)
    args = [iter(iterable)] * n
    return izip_longest(*args, fillvalue=fillvalue)
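The trick is that args contains the same iterator object n times, so izip_longest pulls n consecutive items into each group. A quick sketch with a plain list instead of a file:

nums = list(range(7))
print(list(grouper(nums, 3)))
# [(0, 1, 2), (3, 4, 5), (6, None, None)]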

Example

with open(filename) as f:
    for lines in grouper(f, N, ''):
        assert len(lines) == N
        # process N lines here
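Because grouper() pads the last group with the fill value, the final chunk may end with '' entries; a sketch that strips the padding before handing the chunk to a process() function like the one in the question:

with open(filename) as f:
    for chunk in grouper(f, N, ''):
        lines = [line for line in chunk if line]   # drop the '' padding in the last chunk
        process(lines)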
answered Oct 12 '22 by jfs


This code works with any number of lines in the file and any N. If the file has 1100 lines and N = 200, process() is called five times with chunks of 200 lines and once more with the remaining 100 lines.

with open(filename, 'r') as infile:
    lines = []
    for line in infile:
        lines.append(line)
        if len(lines) >= N:
            process(lines)          # full chunk of N lines
            lines = []
    if len(lines) > 0:
        process(lines)              # leftover lines (fewer than N)
answered Oct 12 '22 by Anatolij