I have a large text file (~7 GB) and I am looking for the fastest way to read it. I have been reading about several approaches, such as reading it chunk by chunk, in order to speed up the process.
For example, effbot suggests:
# File: readline-example-3.py
file = open("sample.txt")
while 1:
    lines = file.readlines(100000)
    if not lines:
        break
    for line in lines:
        pass  # do something
in order to process 96,900 lines of text per second. Other authors suggest using islice():
from itertools import islice

with open(...) as f:
    while True:
        next_n_lines = list(islice(f, n))
        if not next_n_lines:
            break
        # process next_n_lines
list(islice(f, n)) will return a list of the next n lines of the file f. Using this inside a loop will give you the file in chunks of n lines.
Method 1: The first approach uses an iterator to iterate over the file. This technique relies on Python's fileinput module, whose input() function can be used to read files lazily, as in the sketch below.
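A minimal sketch of the fileinput approach; the filename "sample.txt" and the line counter are only illustrative.

import fileinput

line_count = 0
for line in fileinput.input(files=("sample.txt",)):
    # lines are yielded one at a time, so the whole file is never in memory
    line_count += 1  # replace with real per-line processing

print(line_count)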
To read a text file in Python, you follow these steps: First, open the text file for reading by using the open() function. Second, read text from it using the file object's read(), readline(), or readlines() method. Third, close the file using its close() method.
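A minimal sketch of those three steps; "sample.txt" is a placeholder name.

f = open("sample.txt")      # 1. open the file for reading
first_line = f.readline()   # 2. read from it (read(), readline(), or readlines())
f.close()                   # 3. close it when finished

print(first_line)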
Reading Large Text Files in Python
We can use the file object as an iterator. The iterator will return each line one by one, which can then be processed. This does not read the whole file into memory, so it is suitable for reading large files in Python.
with open(<FILE>) as FileObj:
    for lines in FileObj:
        print(lines)  # or do some other thing with the line...
This will read one line at a time into memory, and close the file when done.
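Applied to a large file, the same pattern lets you aggregate as you go so memory use stays flat regardless of file size; "big_file.txt" and the character tally below are illustrative only.

total_chars = 0
with open("big_file.txt") as file_obj:
    for line in file_obj:
        total_chars += len(line)  # replace with real per-line processing

print(total_chars)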