
Lazy Method for Reading Big File in Python?

I have a very big file (4GB) and when I try to read it my computer hangs. So I want to read it piece by piece, and after processing each piece, store the processed piece in another file and read the next piece.

Is there any method to yield these pieces?

I would love to have a lazy method.

Pratik Deoghare asked Feb 06 '09

People also ask

How do I open a big file as a binary file in Python?

You can open the file with the open() function, passing 'b' in the mode string to open it in binary mode and read raw bytes. open('filename', 'rb') opens the file for reading in binary mode.
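A minimal sketch of that (the filename is just an illustration):

with open('really_big_file.dat', 'rb') as f:   # 'rb' = read in binary mode
    first_kb = f.read(1024)                    # returns a bytes object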

How does Python handle large files?

We can use the file object as an iterator. The iterator returns each line one by one, which can then be processed. This does not read the whole file into memory, so it is suitable for reading large text files in Python.
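For example (a rough sketch; the filename and process_data are placeholders):

with open('really_big_file.dat') as f:
    for line in f:            # the file object yields one line at a time
        process_data(line)    # replace with your own per-line processing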

How do I read a chunk file in Python?

To read a large file in chunks, we can call the read() function in a while loop, reading one chunk of data from the file at a time.
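A brief sketch of that pattern (the chunk size and filename are arbitrary; the := operator needs Python 3.8+):

with open('really_big_file.dat', 'rb') as f:
    while chunk := f.read(65536):   # read 64 KB at a time until EOF
        process_data(chunk)         # replace with your own chunk processing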


1 Answer

To write a lazy function, just use yield:

def read_in_chunks(file_object, chunk_size=1024):
    """Lazy function (generator) to read a file piece by piece.
    Default chunk size: 1k."""
    while True:
        data = file_object.read(chunk_size)
        if not data:
            break
        yield data


with open('really_big_file.dat') as f:
    for piece in read_in_chunks(f):
        process_data(piece)

Another option would be to use iter and a helper function:

f = open('really_big_file.dat')

def read1k():
    return f.read(1024)

for piece in iter(read1k, ''):
    process_data(piece)
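Note that in Python 3, if the file is opened in binary mode the sentinel must be b'' rather than ''. A functools.partial variant (a sketch under those assumptions, with the same placeholder process_data) avoids the helper function entirely:

import functools

with open('really_big_file.dat', 'rb') as f:
    # iter(callable, sentinel) keeps calling f.read(1024) until it returns b''
    for piece in iter(functools.partial(f.read, 1024), b''):
        process_data(piece)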

If the file is line-based, the file object is already a lazy generator of lines:

for line in open('really_big_file.dat'):
    process_data(line)
nosklo answered Sep 22 '22