I have a large text file (~7 GB) and I am looking for the fastest way to read it. I have been reading about several approaches, such as reading it chunk by chunk, in order to speed up the process.
For example, effbot suggests:
# File: readline-example-3.py
file = open("sample.txt")
while 1:
    lines = file.readlines(100000)
    if not lines:
        break
    for line in lines:
        pass  # do something
in order to process 96,900 lines of text per second. Other authors suggest using islice():
from itertools import islice

with open(...) as f:
    while True:
        next_n_lines = list(islice(f, n))
        if not next_n_lines:
            break
        # process next_n_lines
list(islice(f, n)) will return a list of the next n lines of the file f. Using this inside a loop will give you the file in chunks of n lines.
Method 1: The first approach uses an iterator to iterate over the file. This technique relies on Python's fileinput module, whose input() function can be used to read files lazily, as in the sketch below.
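A minimal sketch of the fileinput approach; the filename "sample.txt" and the line counter are only illustrative.

import fileinput

line_count = 0
for line in fileinput.input(files=("sample.txt",)):
    # lines are yielded one at a time, so the whole file is never in memory
    line_count += 1  # replace with real per-line processing

print(line_count)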
To read a text file in Python, you follow these steps: First, open the text file for reading by using the open() function. Second, read text from it using the file object's read(), readline(), or readlines() method. Third, close the file using its close() method.
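A minimal sketch of those three steps; "sample.txt" is a placeholder name.

f = open("sample.txt")      # 1. open the file for reading
first_line = f.readline()   # 2. read from it (read(), readline(), or readlines())
f.close()                   # 3. close it when finished

print(first_line)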
Reading Large Text Files in Python
We can use the file object as an iterator. The iterator will return each line one by one, which can then be processed. This does not read the whole file into memory, so it is suitable for reading large files in Python.
with open(<FILE>) as FileObj:
    for lines in FileObj:
        print(lines)  # or do some other thing with the line...
This will read one line at a time into memory, and close the file when done.
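Applied to a large file, the same pattern lets you aggregate as you go so memory use stays flat regardless of file size; "big_file.txt" and the character tally below are illustrative only.

total_chars = 0
with open("big_file.txt") as file_obj:
    for line in file_obj:
        total_chars += len(line)  # replace with real per-line processing

print(total_chars)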