Creating a class based on Pandas.DataFrame using the pandas.read_csv() function to initialize

My goal is to create an object that behaves the same as a Pandas DataFrame, but with a few extra methods of my own on top of it. As far as I understand, one approach would be to extend the class, which I first tried to do as follows:

class CustomDF(pd.DataFrame):
    def __init__(self, filename):
        self = pd.read_csv(filename)

But I get errors when trying to view this object, saying: 'CustomDF' object has no attribute '_data'.

My second attempt was to skip inheritance and instead store the DataFrame in an attribute of the object, with the methods working on that attribute, like this:

class CustomDF():

    def __init__(self, filename):
        self.df = pd.read_csv(filename)

    def custom_method_1(self,a,b,...):
        ...

    def custom_method_2(self,a,b,...):
        ...

This is fine, except that for all custom methods, I need to access the self.df attribute first to do anything on it, but I would prefer that my custom dataframe were just self.

Is there a way that this can be done? Or is this approach not ideal anyway?

teepee asked Nov 29 '18 17:11




1 Answer

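In your first example, __init__ overrides DataFrame.__init__ without ever calling it, and the line self = pd.read_csv(filename) only rebinds the local name self; it does not change the instance being constructed. The object's internal state is therefore never set up, which is why you get the error about the missing '_data' attribute.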
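
As a minimal illustration in plain Python (the class names Rebinds and Assigns are made up for this sketch and have nothing to do with pandas), assigning to self inside __init__ leaves the new instance untouched, whereas setting an attribute on self changes it:

```python
class Rebinds:
    def __init__(self, value):
        # This only points the local name `self` at a new dict;
        # the instance being constructed is not modified.
        self = {"value": value}


class Assigns:
    def __init__(self, value):
        # This stores the value on the instance itself.
        self.value = value


r = Rebinds(1)
a = Assigns(1)

print(hasattr(r, "value"))  # False
print(a.value)              # 1
```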

Use super() to call the parent initializer first, then add your custom code:

class CustomDF(pd.DataFrame):
    def __init__(self, *args, **kw):
        super(CustomDF, self).__init__(*args, **kw)
        # Your code here

    def custom_method_1(self,a,b,...):
        ...
Kamil Niski answered Sep 24 '22 03:09
