Is there a way to cache python file handles? I have a function which takes a netCDF file path as input, opens it, extracts some data from the netCDF file and closes it. It gets called a lot of times, and the overhead of opening the file each time is high.
How can I make it faster by maybe caching the file handle? Perhaps there is a python library to do this
Yes, you can use following python libraries:
Let's follow the example. You have two files:
# save.py - it puts deserialized file handler object to memcached
import dill
import memcache
mc = memcache.Client(['127.0.0.1:11211'], debug=0)
file_handler = open('data.txt', 'r')
mc.set("file_handler", dill.dumps(file_handler))
print 'saved!'
and
# read_from_file.py - it gets deserialized file handler object from memcached,
# then serializes it and read lines from it
import dill
import memcache
mc = memcache.Client(['127.0.0.1:11211'], debug=0)
file_handler = dill.loads(mc.get("file_handler"))
print file_handler.readlines()
Now if you run:
python save.py
python read_from_file.py
you can get what you want.
Why it works?
Because you didn't close the file (file_handler.close()
), so object still exist in memory (has not been garbage collected, because of weakref) and you can use it. Even in different process.
Solution
import dill
import memcache
mc = memcache.Client(['127.0.0.1:11211'], debug=0)
serialized = mc.get("file_handler")
if serialized:
file_handler = dill.loads(serialized)
else:
file_handler = open('data.txt', 'r')
mc.set("file_handler", dill.dumps(file_handler))
print file_handler.readlines()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With