
Is it possible to construct a Pandas Series which auto-interpolates?

Tags: python, pandas

Is it possible to produce a series which interpolates its value for any given index? I have a predefined interpolation scheme I wish to prescribe, and I'd rather the caller didn't apply the interpolation themselves, to avoid any possibility of error.

class InterpolatedSeries(pd.Series):
    pass # magic?

s = pd.Series([1, 3], index=[1, 3])
i = InterpolatedSeries(s, forward='nearest', backward='nearest', middle='linear')

The caller would receive i as a result and could then request any value, and I'd be confident the value they got conformed to the prescribed interpolation scheme. The interpolation would certainly not be pre-computable (because we don't know ahead of time which points they'll request) or cacheable (because we don't know how many points they'll ask for), but importantly there would be no complications for the caller.

Is this possible?

>>> i[[0, 0.11234, 1, 2, 2.367, 3, 4]]
... pd.Series([1, 1, 1, 2, 2.367, 3, 3], index=[0, 0.11234, 1, 2, 2.367, 3, 4])
poulter7 asked Dec 01 '16 20:12



1 Answer

Override __getitem__, which Python calls for obj[key] lookups. It is one of Python's special ("magic") methods: http://www.diveintopython3.net/special-method-names.html

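import pandas as pd

class InterpolatedSeries(pd.Series):
    # Register the custom attributes so pandas treats them as metadata,
    # not as index labels or data
    _metadata = ['forward', 'backward', 'middle']

    def __init__(self, values, forward='nearest', backward='nearest', middle='linear'):
        super().__init__(values)
        self.forward = forward
        self.backward = backward
        self.middle = middle

    def __getitem__(self, key):
        # get the stored values (raises KeyError for labels that are not in the index,
        # so those must be handled by the interpolation rather than by this lookup)
        values = super().__getitem__(key)
        # Do interpolation
        return values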

or, to apply the interpolation when values are assigned instead, override __setitem__:

class InterpolatedSeries(pd.Series):
    def __init__(self, values, forward='nearest', backward='nearest', middle='linear'):
        super().__init__(values)
        self.forward = forward
        self.backward = backward
        self.middle = middle

    def __setitem__(self, key, value):
        # Do interpolation
        super().__setitem__(key, value)
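
For the __getitem__ variant, the "# Do interpolation" step could be filled in roughly as follows. This is only a sketch under some assumptions that are not in the original answer: it hard-codes the scheme from the question (nearest value before and after the data, linear in between) using numpy.interp, it drops the forward/backward/middle arguments, it assumes a sorted numeric index, and it handles only scalar and list-like keys (not slices).

```python
import numpy as np
import pandas as pd


class InterpolatedSeries(pd.Series):
    """Series whose [] lookup interpolates (sketch; assumes a sorted numeric index)."""

    def __getitem__(self, key):
        # Known points: index labels as x, stored values as y.
        xp = self.index.to_numpy(dtype=float)
        fp = self.to_numpy(dtype=float)
        # np.interp is linear between known points and clamps to the first/last
        # value outside them, which acts as "nearest" before and after the data.
        x = np.atleast_1d(np.asarray(key, dtype=float))
        y = np.interp(x, xp, fp)
        if np.ndim(key) == 0:
            return float(y[0])  # scalar label -> scalar value
        return pd.Series(y, index=x)  # list of labels -> plain Series


i = InterpolatedSeries([1, 3], index=[1, 3])
result = i[[0, 0.11234, 1, 2, 2.367, 3, 4]]
print(result)
```

Because the lookup never touches the stored labels directly, requesting a label that is not in the index does not raise KeyError. Other pandas accessors such as .loc and .iloc are left unchanged by this override.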

Another alternative would be to create your own class that wraps an underlying data structure. This class would inherit from object rather than from pd.Series.

class InterpolatedSeries(object):
    def __init__(self, values, forward='nearest', backward='nearest', middle='linear'):
        self.data = values
        self.forward = forward
        self.backward = backward
        self.middle = middle

    def __getitem__(self, key):
        values = self.data.__getitem__(key)
        # Do interpolation
        return values

    def __getattr__(self, key):
        """Return the attribute from the stored pandas Series if it was not found on this object. This allows methods such as to_csv to work."""
        # __getattr__ is only called after normal attribute lookup has failed
        if key == 'data':
            raise AttributeError(key) # avoid infinite recursion if data is not set yet
        return getattr(self.data, key) # Call the stored pandas series method if not found.

    def __dir__(self):
        """Return the list of attributes. (Most code autocomplete features use this, so this will find your pandas series methods for autocomplete in IDEs). """
        values = dir(self.data)
        return values + super().__dir__()

The above is probably not the best approach, but it does add some flexibility by making it easier to access the pandas Series methods in the background.
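
To make the wrapper idea concrete, here is a minimal runnable sketch. It rests on assumptions that go beyond the answer: the interpolation is fixed to the question's scheme via numpy.interp (linear inside the data, nearest value outside), the scheme keyword arguments are omitted, the wrapped Series is assumed to have a sorted numeric index, and only scalar and list-like keys are handled.

```python
import numpy as np
import pandas as pd


class InterpolatedSeries:
    """Wraps a Series (composition) and interpolates on [] lookup."""

    def __init__(self, series):
        self.data = series

    def __getitem__(self, key):
        x = np.atleast_1d(np.asarray(key, dtype=float))
        # Linear between stored points, clamped to the end values outside them.
        y = np.interp(x,
                      self.data.index.to_numpy(dtype=float),
                      self.data.to_numpy(dtype=float))
        return float(y[0]) if np.ndim(key) == 0 else pd.Series(y, index=x)

    def __getattr__(self, name):
        # Only called when normal lookup fails; forward to the wrapped Series.
        if name == "data":
            raise AttributeError(name)  # guard against recursion before __init__
        return getattr(self.data, name)


view = InterpolatedSeries(pd.Series([1, 3], index=[1, 3]))
print(view[2.5])    # interpolated lookup
print(view.sum())   # delegated to the wrapped Series
```

Since the wrapper is not a pd.Series, isinstance checks against pd.Series will fail, but pandas internals can never bypass or trip over the custom lookup.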

justengel answered Nov 15 '22 05:11
