pandas: Composition for chained methods like .resample(), .rolling() etc

Tags: python, pandas, chained, object-composition

I would like to construct an extension of pandas.DataFrame, let's call it SPDF, that can do things above and beyond what a plain DataFrame can:

import pandas as pd
import numpy as np


def to_spdf(func):
    """Transform generic output of `func` to SPDF.

    Returns
    -------
    wrapper : callable
    """
    def wrapper(*args, **kwargs):
        res = func(*args, **kwargs)
        return SPDF(res)

    return wrapper


class SPDF:
    """Special-purpose dataframe.

    Parameters
    ----------
    df : pandas.DataFrame

    """

    def __init__(self, df):
        self.df = df

    def __repr__(self):
        return repr(self.df)

    def __getattr__(self, item):
        res = getattr(self.df, item)

        if callable(res):
            res = to_spdf(res)

        return res


if __name__ == "__main__":

    # construct a generic SPDF
    df = pd.DataFrame(np.eye(4))
    an_spdf = SPDF(df)

    # call .diff() to obtain another SPDF
    print(an_spdf.diff())

Right now, methods of DataFrame that return another DataFrame, such as .diff() in the MWE above, give me back another SPDF, which is great. However, I would also like chained calls such as .resample('M').last() or .rolling(2).mean() to produce an SPDF at the very end. I have failed so far because .rolling() and the like are callable, so my wrapper to_spdf tries to construct an SPDF from their intermediate output (a Rolling or Resampler object) without 'waiting' for .mean() or whatever the last part of the expression is. Any ideas how to tackle this problem?

Thanks.


asked Jul 11 '18 07:07

Igor Pozdeev

1 Answer

You should subclass DataFrame properly. For methods that build a new frame from an existing one ("copy-constructor" methods) to return your type, the pandas documentation says you must override the _constructor property (and, if you carry custom attributes, declare them in _metadata).

You could do something like the following:

from pandas import DataFrame


class SPDF(DataFrame):

    # pandas calls this to build the result of methods such as diff()
    @property
    def _constructor(self):
        return SPDF

If you also need to preserve custom attributes across copy-constructor methods such as diff (custom methods need nothing extra, since they live on the class), list the attribute names in _metadata:

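from pandas import DataFrame


class SPDF(DataFrame):
    # attribute names that pandas should carry over to derived frames
    _metadata = ['prop']
    prop = 1  # class-level default

    @property
    def _constructor(self):
        return SPDF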
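As a rough illustration of what _metadata does, the sketch below sets an instance-level value and checks that it survives a copy. The value 42 is made up for the example, and copy() is used because it is a method that reliably passes attributes along; whether every other method does so can depend on the pandas version.

```python
import numpy as np
import pandas as pd


class SPDF(pd.DataFrame):
    # names listed here are copied onto results that pandas finalizes
    _metadata = ["prop"]
    prop = 1  # class-level default

    @property
    def _constructor(self):
        return SPDF


df = SPDF(np.eye(4))
df.prop = 42        # illustrative instance-level value
copied = df.copy()  # copy() propagates the attributes named in _metadata

print(type(copied).__name__, copied.prop)
```

Without 'prop' in _metadata, the copy would fall back to the class-level default instead of the instance's value.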

With SPDF defined as a subclass, the result types are as desired:

import numpy as np

df = SPDF(np.eye(4))
print(type(df))   # <class '__main__.SPDF'>
new = df.diff()
print(type(new))  # <class '__main__.SPDF'>
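If you would rather keep the composition approach from the question, one possible way to handle chained calls is to wrap the intermediate object in a small proxy and convert to SPDF only once a DataFrame actually comes back. This is a minimal sketch and not part of pandas: the names DEFERRED, Deferred and wrap are hypothetical, and the set of deferred method names is an assumption that is not exhaustive.

```python
import numpy as np
import pandas as pd

# Methods that return an intermediate object (Rolling, Resampler, ...)
# instead of a DataFrame. Assumed list, extend as needed.
DEFERRED = {"rolling", "resample", "expanding", "ewm", "groupby"}


def wrap(res):
    """Wrap DataFrame results in SPDF; pass anything else through."""
    return SPDF(res) if isinstance(res, pd.DataFrame) else res


class Deferred:
    """Hypothetical proxy around an intermediate such as Rolling."""

    def __init__(self, obj):
        self._obj = obj

    def __getattr__(self, item):
        if item == "_obj":  # avoid recursion if _obj is not set yet
            raise AttributeError(item)
        res = getattr(self._obj, item)
        if callable(res):
            # wait for the final call, then wrap its result
            return lambda *args, **kwargs: wrap(res(*args, **kwargs))
        return res


class SPDF:
    """Special-purpose dataframe built by composition."""

    def __init__(self, df):
        self.df = df

    def __repr__(self):
        return repr(self.df)

    def __getattr__(self, item):
        if item == "df":  # avoid recursion if df is not set yet
            raise AttributeError(item)
        res = getattr(self.df, item)
        if not callable(res):
            return res
        if item in DEFERRED:
            # keep the intermediate object behind a proxy
            return lambda *args, **kwargs: Deferred(res(*args, **kwargs))
        return lambda *args, **kwargs: wrap(res(*args, **kwargs))


idx = pd.date_range("2018-01-01", periods=4, freq="D")
sp = SPDF(pd.DataFrame(np.eye(4), index=idx))

print(type(sp.rolling(2).mean()).__name__)
print(type(sp.resample("2D").last()).__name__)
```

The trade-off is that the proxy has to know which methods defer their result, and callable accessors such as .loc still need separate handling, which is part of why subclassing is usually the simpler route.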


answered Oct 26 '22 23:10

modesitt
