
Memory error due to the huge input file size

Tags: python, file

When I use the following code to read a file:

lines=file("data.txt").read().split("\n")

I get the following error:

MemoryError

The file size is:

ls -l
-rw-r--r-- 1 charlie charlie 1258467201 Sep 26 12:57 data.txt
Charlie Epps asked Mar 07 '10 12:03


2 Answers

Obviously the file is too large to be read into memory all at once.

Why not just use:

with open("data.txt") as myfile:
    for line in myfile:
        do_something(line.rstrip("\n"))

or, if you're on a Python version older than 2.6:

myfile = open("data.txt")
for line in myfile:
    do_something(line.rstrip("\n"))
myfile.close()  # no with-block here, so close the file explicitly

In both cases, the file object acts as an iterator that yields one line at a time, so you can loop over it much like a list of strings without ever holding the whole file in memory.
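
As a rough illustration of where the list comparison stops: a file object cannot be indexed or sliced like a list, but itertools.islice can take the first few lines from it. This is only a sketch in Python 3 syntax; first_lines and the small sample file are made up for the example and are not part of the original question.

```python
import itertools
import os
import tempfile


def first_lines(path, n):
    """Return the first n lines of a file, without trailing newlines."""
    with open(path) as myfile:
        # islice stops pulling lines from the file after n items
        return [line.rstrip("\n") for line in itertools.islice(myfile, n)]


# Small stand-in for data.txt so the sketch can run anywhere
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("alpha\nbeta\ngamma\ndelta\n")

print(first_lines(tmp.name, 2))  # ['alpha', 'beta']
os.remove(tmp.name)
```

Only the requested lines end up in the returned list, so this stays cheap even on a file the size of data.txt.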

EDIT: Since your way of reading the entire file into one large string and then splitting it on newlines will remove the newlines in the process, I have added a .rstrip("\n") to my examples in order to better simulate the result.
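
To make the comparison concrete, here is a small sketch (Python 3 syntax, with a throwaway sample file standing in for data.txt) that builds the lines both ways. One difference worth knowing: when the file ends with a newline, split("\n") leaves an empty string at the end of the list, while line-by-line iteration does not.

```python
import os
import tempfile

# Throwaway sample file; the real data.txt is far too large for this demo
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\ntwo\nthree\n")

# Approach from the question: whole file in memory, then split
with open(tmp.name) as f:
    all_at_once = f.read().split("\n")

# Line-by-line approach; collected into a list here only for comparison,
# on a large file you would process each line instead of storing it
with open(tmp.name) as f:
    streamed = [line.rstrip("\n") for line in f]

print(all_at_once)  # ['one', 'two', 'three', '']
print(streamed)     # ['one', 'two', 'three']
os.remove(tmp.name)
```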

Tim Pietzcker answered Oct 16 '22 12:10


Use this code to read the file line by line:

for line in open('data.txt'):
    pass  # work with line here
SilentGhost answered Oct 16 '22 12:10