I have some way of building a data structure (out of some file contents, say):
def loadfile(FILE):
    return # some data structure created from the contents of FILE
So I can do things like
puppies = loadfile("puppies.csv") # wait for loadfile to work
kitties = loadfile("kitties.csv") # wait some more
print len(puppies)
print puppies[32]
In the above example, I wasted a bunch of time actually reading kitties.csv and creating a data structure that I never used. I'd like to avoid that waste without constantly checking "if not kitties" whenever I want to do something. I'd like to be able to do
puppies = lazyload("puppies.csv") # instant
kitties = lazyload("kitties.csv") # instant
print len(puppies) # wait for loadfile
print puppies[32]
So if I don't ever try to do anything with kitties, loadfile("kitties.csv") never gets called.
Is there some standard way to do this?
After playing around with it for a bit, I produced the following solution, which appears to work correctly and is quite brief. Are there some alternatives? Are there drawbacks to using this approach that I should keep in mind?
class lazyload:
    def __init__(self,FILE):
        self.FILE = FILE
        self.F = None
    def __getattr__(self,name):
        if not self.F:
            print "loading %s" % self.FILE
            self.F = loadfile(self.FILE)
        return object.__getattribute__(self.F, name)
What might be even better is if something like this worked:
class lazyload:
    def __init__(self,FILE):
        self.FILE = FILE
    def __getattr__(self,name):
        self = loadfile(self.FILE) # this never gets called again
                                   # since self is no longer a
                                   # lazyload instance
        return object.__getattribute__(self, name)
But this doesn't work because self is local. It actually ends up calling loadfile every time you do anything.
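One drawback of the __getattr__ proxy above is worth spelling out. On new-style classes (and on every class in Python 3), implicit special-method lookups such as len(x) or x[32] go to the class and bypass __getattr__, and the "if not self.F" test reloads the file every time the loaded structure happens to be empty. Here is a minimal Python 3 sketch that avoids both, assuming the loader is passed in as a callable. LazyProxy, _UNLOADED and fake_loadfile are hypothetical names used only for illustration:

```python
_UNLOADED = object()  # sentinel, so an empty or falsy result still counts as loaded


class LazyProxy:
    """Defer calling loader(path) until the data is first needed."""

    def __init__(self, loader, path):
        self._loader = loader
        self._path = path
        self._data = _UNLOADED

    def _load(self):
        if self._data is _UNLOADED:
            self._data = self._loader(self._path)
        return self._data

    # Special methods have to be defined explicitly: len(x), x[i] and iter(x)
    # look them up on the class and never reach __getattr__.
    def __len__(self):
        return len(self._load())

    def __getitem__(self, key):
        return self._load()[key]

    def __iter__(self):
        return iter(self._load())

    def __getattr__(self, name):
        # Only reached for attributes not found normally. Underscore names are
        # refused so a half-built instance (e.g. during copy) cannot recurse.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._load(), name)


calls = []  # records which files were actually loaded


def fake_loadfile(path):
    # Stand-in for the real loadfile(); returns a list instead of reading a file.
    calls.append(path)
    return list(range(100))


puppies = LazyProxy(fake_loadfile, "puppies.csv")  # nothing loaded yet
kitties = LazyProxy(fake_loadfile, "kitties.csv")  # nothing loaded yet
print(len(puppies))  # 100, triggers the load of puppies.csv only
print(puppies[32])   # 32
print(calls)         # ['puppies.csv']
```

Only the operations forwarded explicitly (len, indexing, iteration) plus ordinary attribute access are covered here; anything else, such as "in" on a non-iterable or arithmetic operators, would need its own forwarding method.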
The csv module in the Python standard library does not load the data until you start iterating over it, so it is in fact lazy.
Edit: If you need to read through the whole file to build the data structure, a complex lazy-load object that proxies everything is overkill. Just do this:
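To illustrate the laziness: csv.reader pulls lines from its input only as rows are requested. Here is a small Python 3 sketch, assuming an in-memory generator stands in for the file. The lines() helper and the dog names are made up for the demonstration:

```python
import csv
import itertools

consumed = []  # every line the reader has actually pulled from the source


def lines():
    # Hypothetical stand-in for a large file: yields one CSV line at a time.
    for i in range(1000):
        line = "dog%d,%d\n" % (i, i)
        consumed.append(line)
        yield line


reader = csv.reader(lines())  # creating the reader consumes nothing
print(len(consumed))          # 0

first_three = list(itertools.islice(reader, 3))
print(first_three)            # [['dog0', '0'], ['dog1', '1'], ['dog2', '2']]
print(len(consumed))          # 3, the other 997 lines were never produced
```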
class Lazywrapper(object):
    def __init__(self, filename):
        self.filename = filename
        self._data = None
    def get_data(self):
        if self._data is None:
            self._build_data()
        return self._data
    def _build_data(self):
        # Now open and iterate over the file to build a datastructure, and
        # put that datastructure as self._data
        self._data = loadfile(self.filename) # e.g. reuse loadfile from the question
With the above class you can do this:
puppies = Lazywrapper("puppies.csv") # Instant
kitties = Lazywrapper("kitties.csv") # Instant
print len(puppies.get_data()) # Wait
print puppies.get_data()[32] # instant
Also
allkitties = kitties.get_data() # wait
print len(allkitties)
print allkitties[32]
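In Python 3.8 and later, functools.cached_property gives the same load-once behaviour as the Lazywrapper above without the explicit None check. This is a different spelling of the same idea, not what the class above does internally. A minimal sketch, where LazyCsv and its loads counter are hypothetical names and a throwaway temporary file stands in for puppies.csv:

```python
import csv
import functools
import os
import tempfile


class LazyCsv:
    def __init__(self, filename):
        self.filename = filename
        self.loads = 0  # counts real loads, only to demonstrate the caching

    @functools.cached_property
    def data(self):
        # Runs on first access only; the result is then stored on the instance.
        self.loads += 1
        with open(self.filename, newline="") as f:
            return list(csv.reader(f))


# Demo with a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "puppies.csv")
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows([["Rex", "3"], ["Froufrou", "5"]])

    puppies = LazyCsv(path)   # instant, file not opened
    print(puppies.loads)      # 0
    print(len(puppies.data))  # 2, first access reads the file
    print(puppies.data[1])    # ['Froufrou', '5'], served from the cache
    print(puppies.loads)      # 1
```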
If you have a lot of data and you don't really need to load all of it, you could also implement something like a class that reads the file until it finds the doggie called "Froufrou" and then stops. At that point, though, it's likely better to stick the data in a database once and for all and access it from there.
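The "read until you find Froufrou" idea can be sketched in Python 3 as a function rather than a class. This is only an illustration: find_row and its parameters are hypothetical, the name is assumed to be in the first column, and an in-memory StringIO stands in for the real file:

```python
import csv
import io


def find_row(lines, name, column=0):
    """Return the first row whose `column` equals `name`, or None."""
    for row in csv.reader(lines):
        if len(row) > column and row[column] == name:
            return row  # stop here; the remaining lines are never read
    return None


source = io.StringIO("Rex,3\nFroufrou,5\nBella,2\n")
found = find_row(source, "Froufrou")
print(found)               # ['Froufrou', '5']

rest = source.readline()   # the line after the match is still unread
print(repr(rest))          # 'Bella,2\n'
```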