Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I read large text files in Python, line by line, without loading it into memory?

Tags:

python

People also ask

How does Python read files without memory loading?

The input() method of fileinput module can be used to read files. The advantage of using this method over readlines() is fileinput. input() does not load the entire file into memory. Hence, there is no chance of memory leakage.

How do I read a text file line by line in Python?

Method 1: Read a File Line by Line using readlines() readlines() is used to read all the lines at a single go and then return them as each line a string element in a list. This function can be used for small files, as it reads the whole file content to the memory, then split it into separate lines.

How do you process a large text file in Python?

Reading Large Text Files in Python We can use the file object as an iterator. The iterator will return each line one by one, which can be processed. This will not read the whole file into memory and it's suitable to read large files in Python.


I provided this answer because Keith's, while succinct, doesn't close the file explicitly

with open("log.txt") as infile:
    for line in infile:
        do_something_with(line)

All you need to do is use the file object as an iterator.

for line in open("log.txt"):
    do_something_with(line)

Even better is using context manager in recent Python versions.

with open("log.txt") as fileobject:
    for line in fileobject:
        do_something_with(line)

This will automatically close the file as well.


An old school approach:

fh = open(file_name, 'rt')
line = fh.readline()
while line:
    # do stuff with line
    line = fh.readline()
fh.close()

You are better off using an iterator instead. Relevant: http://docs.python.org/library/fileinput.html

From the docs:

import fileinput
for line in fileinput.input("filename"):
    process(line)

This will avoid copying the whole file into memory at once.