I have multiple files and I want to read them simultaneously, extract a number from each row, and compute the averages. For a small number of files I did this using izip from the itertools module. Here is my code:
from itertools import izip
import math
g=open("MSDpara_ave_nvt.dat",'w')
with open("sample1/err_msdCECfortran_nvt.dat",'r') as f1, \
     open("sample2/err_msdCECfortran_nvt.dat",'r') as f2, \
     open("sample3/err_msdCECfortran_nvt.dat",'r') as f3, \
     open("err_msdCECfortran_nvt.dat",'r') as f4:
    for x,y,z,bg in izip(f1,f2,f3,f4):
        args1=x.split()
        i1 = float(args1[0])
        msd1 = float(args1[1])
        args2=y.split()
        i2 = float(args2[0])
        msd2 = float(args2[1])
        args3=z.split()
        i3 = float(args3[0])
        msd3 = float(args3[1])
        args4=bg.split()
        i4 = float(args4[0])
        msd4 = float(args4[1])
        msdave = (msd1 + msd2 + msd3 + msd4)/4.0
        print>>g, "%e %e" %(i1, msdave)
f1.close()
f2.close()
f3.close()
f4.close()
g.close()
This code works OK. But if I want to handle 100 files simultaneously, the code becomes very lengthy if I do it this way. Is there a simpler way of doing this? It seems that the fileinput module can also handle multiple files, but I don't know if it can read them simultaneously.
Thanks.
The with open pattern is good, but in this case it gets in your way. You can open a list of files, then use that list inside izip:
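from itertools import izip
filenames = ["sample1/err_msdCECfortran_nvt.dat",...]
files = [open(i, "r") for i in filenames]
for rows in izip(*files):
    # rows is now a tuple containing one row from each file
    pass
for f in files:
    f.close()  # no with block here, so close the files explicitly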
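As a rough illustration of what could go inside that loop, here is a minimal sketch. The helper names build_filenames and average_rows are hypothetical (they are not from the question), and the sampleN/ directory layout is only an assumption extrapolated from the three sample paths shown above. It also assumes every row has at least two whitespace-separated numeric columns.

```python
def build_filenames(n_samples, basename="err_msdCECfortran_nvt.dat"):
    """Return sample1/<basename> ... sample<n>/<basename> (assumed layout)."""
    return ["sample%d/%s" % (i, basename) for i in range(1, n_samples + 1)]


def average_rows(rows):
    """Given one line from each file, return (index, mean of second column)."""
    fields = [row.split() for row in rows]
    index = float(fields[0][0])  # first column, taken from the first file
    values = [float(f[1]) for f in fields]
    return index, sum(values) / len(values)
```

With these, the loop body could be something like index, msdave = average_rows(rows), which works for any number of files because nothing is hard-coded per file.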
In Python 3.3+ you can also use ExitStack in a with block:
from contextlib import ExitStack

filenames = ["sample1/err_msdCECfortran_nvt.dat",...]
with ExitStack() as stack:
    files = [stack.enter_context(open(i, "r")) for i in filenames]
    for rows in zip(*files):
        # rows is now a tuple containing one row from each file
        pass
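Putting the ExitStack approach together with the averaging from the question might look like the following sketch for Python 3.3+. The function name average_files and its parameters are hypothetical, and it assumes each row holds at least two whitespace-separated numbers, with the first column taken from the first file as in the original code.

```python
from contextlib import ExitStack


def average_files(filenames, out_path):
    """Write one line per row: first column of the first file, mean of column 2."""
    with ExitStack() as stack:
        # Every file registered on the stack is closed when the block exits,
        # even if an exception is raised part-way through.
        files = [stack.enter_context(open(name, "r")) for name in filenames]
        out = stack.enter_context(open(out_path, "w"))
        for rows in zip(*files):
            fields = [row.split() for row in rows]
            values = [float(f[1]) for f in fields]
            mean = sum(values) / len(values)
            out.write("%e %e\n" % (float(fields[0][0]), mean))
```

Note that zip stops at the shortest file, so if the inputs may have different lengths, the extra rows in longer files are silently ignored.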
In Python < 3.3, to get all the advantages of a with block (e.g. timely closing no matter how you exit the block), you would need to create your own context manager:
class FileListReader(object):
    def __init__(self, filenames):
        self.files = [open(i, "r") for i in filenames]

    def __enter__(self):
        for i in self.files:
            i.__enter__()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        for i in self.files:
            i.__exit__(exc_type, exc_value, traceback)
Then you could do:
filenames = ["sample1/err_msdCECfortran_nvt.dat",...]
with FileListReader(filenames) as f:
    for rows in izip(*f.files):
        #...
In this case the last might be considered over-engineering, though.