I have a 5 GB text file and I am trying to read it line by line. Each line of my file is in the format: Reviewerid<\t>pid<\t>date<\t>title<\t>body<\n>. This is my code:
o = open('mproducts.txt', 'w')
with open('reviewsNew.txt', 'rb') as f1:
    for line in f1:
        line = line.strip()
        line2 = line.split('\t')
        o.write(str(line))
        o.write("\n")
But I get a MemoryError when I try to run it. I have 8 GB of RAM and 1 TB of disk space, so why am I getting this error? I tried reading the file in blocks, but I still get the same error:
MemoryError
Update:
Installing 64-bit Python solves the issue. The OP was using 32-bit Python, which is why they were running into the memory limitation.
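If you are not sure which build you are running, here is a quick way to check (an illustrative snippet, not part of the original answer; a 32-bit CPython process is limited to roughly 2 GB of addressable memory regardless of installed RAM):

```python
import struct
import sys

# Pointer size in bytes * 8 gives the interpreter's bitness.
bits = struct.calcsize("P") * 8
print(f"{bits}-bit Python, sys.maxsize = {sys.maxsize}")
```

On a 64-bit build this prints 64 and a very large `sys.maxsize`; on a 32-bit build it prints 32.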
Reading through the comments, I think this can help you.
Summary: get N lines at a time, process them, and then write them out.
Sample code:
from itertools import islice

# You can change num_of_lines
def get_lines(file_handle, num_of_lines=10):
    while True:
        next_n_lines = list(islice(file_handle, num_of_lines))
        if not next_n_lines:
            break
        yield next_n_lines

o = open('mproducts.txt', 'w')
with open('reviewsNew.txt', 'r') as f1:
    for data_lines in get_lines(f1):
        for line in data_lines:
            line = line.strip()
            line2 = line.split('\t')
            o.write(str(line))
            o.write("\n")
o.close()
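To see the chunking behaviour on its own, you can run the generator against a small in-memory file; here `io.StringIO` stands in for the real review file (this is just a sanity-check sketch, not part of the original answer):

```python
import io
from itertools import islice

def get_lines(file_handle, num_of_lines=10):
    # Yield the next num_of_lines lines as a list until the file is exhausted.
    while True:
        next_n_lines = list(islice(file_handle, num_of_lines))
        if not next_n_lines:
            break
        yield next_n_lines

# Three tab-separated lines, read two at a time.
sample = io.StringIO("a\tb\nc\td\ne\tf\n")
chunks = list(get_lines(sample, num_of_lines=2))
print(chunks)
# [['a\tb\n', 'c\td\n'], ['e\tf\n']]
```

Note that only `num_of_lines` lines are held in memory at once, which is why this pattern works for files larger than RAM.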