
Python - extending properties like you'd extend a function

Question

How can you extend a python property?

A subclass can extend a superclass's function by calling it in the overriding version and then operating on the result. Here's an example of what I mean when I say "extending a function":

# Extending a function (a tongue-in-cheek example)

class NormalMath(object):
    def __init__(self, number):
        self.number = number

    def add_pi(self):
        n = self.number
        return n + 3.1415


class NewMath(NormalMath):
    def add_pi(self):
        # NewMath doesn't know how NormalMath added pi (and shouldn't need to).
        # It just uses the result.
        n = NormalMath.add_pi(self)  

        # In NewMath, fractions are considered too hard for our users.
        # We therefore silently convert them to integers.
        return int(n)

Is there an analogous operation to extending functions, but for functions that use the property decorator?

I want to do some additional calculations immediately after getting an expensive-to-compute attribute. I need to keep the attribute's access lazy. I don't want the user to have to invoke a special routine to make the calculations. Basically, I don't want the user to ever know the calculations were made in the first place. However, the attribute must remain a property, since I've got legacy code I need to support.

Maybe this is a job for decorators? If I'm not mistaken, a decorator is a function that wraps another function, and I'm looking to wrap a property with some more calculations and then present it as a property again, which seems like a similar idea... but I can't quite figure it out.

My Specific Problem

I've got a base class LogFile with an expensive-to-construct attribute .dataframe. I've implemented it as a property (with the property decorator), so it won't actually parse the log file until I ask for the dataframe. So far, it works great. I can construct a bunch (100+) of LogFile objects, and use cheaper methods to filter and select only the important ones to parse. And whenever I'm using the same LogFile over and over, I only have to parse it the first time I access the dataframe.

Now I need to write a LogFile subclass, SensorLog, that adds some extra columns to the base class's dataframe attribute, but I can't quite figure out the syntax to call the super class's dataframe construction routines (without knowing anything about their internal workings), then operate on the resulting dataframe, and then cache/return it.

# Base Class - rules for parsing/interacting with data.
class LogFile(object):
    def __init__(self, file_name):
        # file name to find the log file
        self.file_name = file_name
        # non-public variable to cache results of parse()
        self._dataframe = None

    def parse(self):
        with open(self.file_name) as infile:
            ...
            ...
            # Complex rules to interpret the file 
            ...
            ...
        self._dataframe = pandas.DataFrame(stuff)

    @property
    def dataframe(self):
        """
        Returns the dataframe; parses file if necessary. This works great!

        """
        if self._dataframe is None:
            self.parse()
        return self._dataframe

    @dataframe.setter
    def dataframe(self,value):
        self._dataframe = value


# Sub class - adds more information to data, but doesn't parse
# must preserve established .dataframe interface
class SensorLog(LogFile):
    def __init__(self, file_name):
        # Call the super's constructor
        LogFile.__init__(self, file_name)

        # SensorLog doesn't actually know about (and doesn't rely on) the ._dataframe cache, so it overrides it just in case.
        self._dataframe = None

    # THIS IS THE PART I CAN'T FIGURE OUT
    # Here's my best guess, but it doesn't quite work:
    @property
    def dataframe(self):
        # use parent class's getter, invoking the hidden parse function and any other operations LogFile might do.
        self._dataframe = LogFile.dataframe.getter()    

        # Add additional calculated columns
        self._dataframe['extra_stuff'] = 'hello world!'
        return self._dataframe


    @dataframe.setter
    def dataframe(self, value):
        self._dataframe = value

Now, when these classes are used in an interactive session, the user should be able to interact with either in the same way.

>>> log = LogFile('data.csv')
>>> print log.dataframe
#### DataFrame with 10 columns goes here ####
>>> sensor = SensorLog('data.csv')
>>> print sensor.dataframe
#### DataFrame with 11 columns goes here ####

I have lots of existing code that takes a LogFile instance which provides a .dataframe attribute and does something interesting (mostly plotting). I would LOVE to have SensorLog instances present the same interface so they can use the same code. Is it possible to extend the super-class's dataframe getter to take advantage of existing routines? How? Or am I better off doing this a different way?

Thanks for reading that huge wall of text. You are an internet super hero, dear reader. Got any ideas?

Matt Merrifield asked Feb 18 '14 04:02



3 Answers

You should be calling the superclass properties, not bypassing them via self._dataframe. Here's a generic example:

class A(object):

    def __init__(self):
        self.__prop = None

    @property
    def prop(self):
        return self.__prop

    @prop.setter
    def prop(self, value):
        self.__prop = value

class B(A):

    def __init__(self):
        super(B, self).__init__()

    @property
    def prop(self):
        value = A.prop.fget(self)
        value['extra'] = 'stuff'
        return value

    @prop.setter
    def prop(self, value):
        A.prop.fset(self, value)

And using it:

b = B()
b.prop = dict((('a', 1), ('b', 2)))
print(b.prop)

Outputs:

{'a': 1, 'b': 2, 'extra': 'stuff'}
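Applied to the classes from the question, the same pattern might look roughly like the sketch below. To keep it self-contained, a plain dict stands in for the pandas DataFrame and parse() returns made-up data. The names parse_count and _extended are hypothetical additions for illustration only: the first shows that parsing still happens lazily and once, the second keeps the extra column from being recomputed on every access.

```python
class LogFile(object):
    def __init__(self, file_name):
        self.file_name = file_name
        self._dataframe = None
        self.parse_count = 0  # hypothetical counter, only to demonstrate laziness

    def parse(self):
        # Stand-in for the expensive parse; a dict replaces the DataFrame.
        self.parse_count += 1
        self._dataframe = {'a': [1, 2], 'b': [3, 4]}

    @property
    def dataframe(self):
        if self._dataframe is None:
            self.parse()
        return self._dataframe

    @dataframe.setter
    def dataframe(self, value):
        self._dataframe = value


class SensorLog(LogFile):
    def __init__(self, file_name):
        super(SensorLog, self).__init__(file_name)
        self._extended = False  # hypothetical flag: extra column added yet?

    @property
    def dataframe(self):
        # Call the parent's getter without knowing how it works internally.
        frame = LogFile.dataframe.fget(self)
        if not self._extended:
            frame['extra_stuff'] = ['hello world!'] * len(frame['a'])
            self._extended = True
        return frame

    @dataframe.setter
    def dataframe(self, value):
        LogFile.dataframe.fset(self, value)
        self._extended = False  # a newly assigned frame needs extending again


sensor = SensorLog('data.csv')
first = sensor.dataframe   # triggers the parse and adds the extra column
second = sensor.dataframe  # served from the cache
```

Nothing is parsed when the SensorLog is constructed; the work happens on the first access to .dataframe, and later accesses return the same cached object.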

I would generally recommend placing side-effects in setters instead of getters, like this:

class A(object):

    def __init__(self):
        self.__prop = None

    @property
    def prop(self):
        return self.__prop

    @prop.setter
    def prop(self, value):
        self.__prop = value

class B(A):

    def __init__(self):
        super(B, self).__init__()

    @property
    def prop(self):
        return A.prop.fget(self)

    @prop.setter
    def prop(self, value):
        value['extra'] = 'stuff'
        A.prop.fset(self, value)

Costly operations inside a getter, such as your parse method, are also generally best avoided.

Dane White answered Oct 21 '22 18:10



If I understand correctly, what you want to do is call the parent's method from the child instance. The usual way to do that is by using the super built-in.

I've taken your tongue-in-cheek example and modified it to use super in order to show you:

class NormalMath(object):
    def __init__(self, number):
        self.number = number

    def add_pi(self):
        n = self.number
        return n + 3.1415


class NewMath(NormalMath):
    def add_pi(self):
        # this will call NormalMath's add_pi
        normal_maths_pi_plus_num = super(NewMath, self).add_pi()
        return int(normal_maths_pi_plus_num)

In your Log example, instead of calling:

self._dataframe = LogFile.dataframe.getter() 

you should call:

self._dataframe = super(SensorLog, self).dataframe

You can read more about super in the Python documentation.

Edit: Even though the example I gave you deals with methods, doing the same with properties shouldn't be a problem.
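As a small sketch of how this plays out with properties (the classes A and B and the doubling step are made up for illustration): reading a property through super() does invoke the parent's getter, but assigning through super() raises AttributeError, so the overriding setter has to reach the parent's setter another way, for example via fset.

```python
class A(object):
    def __init__(self):
        self._prop = None

    @property
    def prop(self):
        return self._prop

    @prop.setter
    def prop(self, value):
        self._prop = value


class B(A):
    @property
    def prop(self):
        # Reading through super() runs A's getter on this instance.
        value = super(B, self).prop
        return None if value is None else value * 2

    @prop.setter
    def prop(self, value):
        # Assignment through super() is not supported, so call A's setter directly.
        A.prop.fset(self, value)


b = B()
b.prop = 21
result = b.prop  # A's getter returns 21, B doubles it

try:
    super(B, b).prop = 5
    super_assignment_failed = False
except AttributeError:
    super_assignment_failed = True
```

So for the getter alone, super(SensorLog, self).dataframe is enough; only the setter needs the more explicit form.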

kirbuchi answered Oct 21 '22 19:10



You have some possibilities to consider:

1/ Inherit from LogFile and override parse in your derived sensor class. It should be possible to modify the methods that work on the dataframe so they work regardless of how many columns it has; since you are using pandas, a lot of that is done for you.

2/ Make sensor an instance of LogFile, then provide its own parse method.

3/ Generalise parse, and possibly some of your other methods, to use a list of data descriptors and possibly a dictionary of methods/rules, set either in your class initialiser or by a method.

4/ Look at either making more use of the methods already in pandas or, possibly, extending pandas to provide the missing methods, if you and others think they would be accepted into pandas as useful extensions.

Personally I think that you would find the benefits of options 3 or 4 to be the most powerful.
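For comparison, option 1 might be sketched as follows. A dict of columns stands in for the pandas DataFrame, and the parsed data is invented for the example. The subclass overrides only parse, so the inherited dataframe property keeps handling laziness and caching. The trade-off is that SensorLog now relies on the base class storing its result in _dataframe, which the question hoped to avoid.

```python
class LogFile(object):
    def __init__(self, file_name):
        self.file_name = file_name
        self._dataframe = None

    def parse(self):
        # Stand-in for the real file parsing; a dict of columns replaces the DataFrame.
        self._dataframe = {'time': [0, 1, 2], 'value': [10, 20, 30]}

    @property
    def dataframe(self):
        if self._dataframe is None:
            self.parse()
        return self._dataframe


class SensorLog(LogFile):
    def parse(self):
        # Let the base class do its parsing, then extend the result once.
        super(SensorLog, self).parse()
        rows = len(self._dataframe['time'])
        self._dataframe['extra_stuff'] = ['hello world!'] * rows


sensor = SensorLog('data.csv')
columns = sorted(sensor.dataframe)  # first access triggers SensorLog.parse
```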

Steve Barnes answered Oct 21 '22 19:10
