I'm on Windows and using Python 3. Since file readers consume a file line by line by default, I have difficulty dealing with my 100 GB text file, which consists of a single line.
I'm aware of solutions such as this for introducing a custom record separator or for replacing a frequent character with \n; but I wonder whether there is any way to consume and process my file using only Python?
I have only 8 GB of RAM. My file contains sales records (including item, price, buyer, ...). My processing mostly consists of editing price numbers. Records are separated from each other by the | character.
#!/usr/bin/python3
import os
# Open the file for reading and writing (returns a low-level file descriptor)
fd = os.open("foo.txt", os.O_RDWR)
# Read at most 12 bytes from the start of the file
ret = os.read(fd, 12)
print(ret.decode())
# Close the file descriptor
os.close(fd)
print("Closed the file successfully!!")
or
with open(filename, 'rb') as f:
    while True:
        # Read at most max_size bytes; an empty result means end of file
        buf = f.read(max_size)
        if not buf:
            break
        process(buf)
or
from functools import partial
with open('somefile', 'rb') as openfileobject:
    # iter() calls read(1024) repeatedly until it returns b'' at end of file
    for chunk in iter(partial(openfileobject.read, 1024), b''):
        do_something(chunk)
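Note that fixed-size reads like the ones above return arbitrary byte ranges, so a record can be cut in half at a chunk boundary. A minimal sketch of one way to handle that, assuming the file is read in binary mode and records are separated by a single | byte: carry the unfinished tail of each chunk over to the next read. The name iter_records, its parameters, and the sample data are hypothetical, chosen only for illustration.

```python
import io


def iter_records(stream, sep=b"|", chunk_size=1024 * 1024):
    """Yield records from a binary stream one at a time, split on sep.

    Only one chunk plus one unfinished record is held in memory at a time.
    """
    tail = b""  # incomplete record carried over from the previous chunk
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        parts = (tail + chunk).split(sep)
        tail = parts.pop()  # the last piece may continue in the next chunk
        yield from parts
    if tail:
        yield tail  # final record when the data does not end with sep


# Tiny in-memory stand-in for the real file (made-up sample data);
# chunk_size=8 forces records to straddle chunk boundaries.
demo = io.BytesIO(b"pen,1.50,ann|ink,3.00,bob|pad,2.25,cy")
records = list(iter_records(demo, chunk_size=8))
```

With the real file, the stream would come from open(path, 'rb'), and each edited record could be written to a second file opened with 'wb', followed by the separator, so that neither the input nor the output is ever fully in memory.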