 

Convert a Pandas DataFrame into a list of objects

I want to convert a Pandas DataFrame into a list of objects.

This is my class:

class Reading:

    def __init__(self):
        self.HourOfDay: int = 0
        self.Percentage: float = 0

I read up on .to_dict, so I tried

df.to_dict(into=Reading)

but it raised

TypeError: unsupported type

I don't want a list of tuples, or a list of dicts, but a list of Readings. Every question I've found so far seems to be about these two scenarios. But I want my own typed objects.

Thanks

zola25 asked Nov 07 '18 15:11



3 Answers

Option 1: make Reading inherit from collections.abc.MutableMapping (the old collections.MutableMapping alias was removed in Python 3.10) and implement the abstract methods that base class requires. That is a lot of work for this use case.
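To give a sense of how much work Option 1 involves, here is a minimal sketch. It assumes that pandas builds each record by calling the `into` class with an iterable of (column, value) pairs, which is an implementation detail rather than something the quoted docs promise, and the `_fields` tuple is a hypothetical helper, not part of the question's class.

```python
from collections.abc import MutableMapping

import pandas as pd


class Reading(MutableMapping):
    """Hypothetical mapping-style Reading that to_dict(into=...) can build."""

    _fields = ("HourOfDay", "Percentage")  # illustrative helper, not from the question

    def __init__(self, pairs=()):
        self.HourOfDay = 0
        self.Percentage = 0.0
        # Assumption: pandas passes an iterable of (column, value) pairs here.
        for key, value in dict(pairs).items():
            self[key] = value

    def __getitem__(self, key):
        if key not in self._fields:
            raise KeyError(key)
        return getattr(self, key)

    def __setitem__(self, key, value):
        if key not in self._fields:
            raise KeyError(key)
        setattr(self, key, value)

    def __delitem__(self, key):
        raise TypeError("Reading fields cannot be deleted")

    def __iter__(self):
        return iter(self._fields)

    def __len__(self):
        return len(self._fields)


df = pd.DataFrame({"HourOfDay": [5, 10], "Percentage": [0.25, 0.40]})

# With a Mapping subclass, to_dict can return Reading instances directly.
readings = df.to_dict(orient="records", into=Reading)
```

Five abstract methods are needed just to satisfy the base class, which is why calling the constructor yourself is usually simpler.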

Option 2: Call Reading() in a list comprehension:

>>> import pandas as pd
>>> 
>>> df = pd.DataFrame({
...     'HourOfDay': [5, 10],
...     'Percentage': [0.25, 0.40]
... })
>>> 
>>> class Reading(object):
...     def __init__(self, HourOfDay: int = 0, Percentage: float = 0):
...         self.HourOfDay = int(HourOfDay)
...         self.Percentage = Percentage
...     def __repr__(self):
...         return f'<{self.__class__.__name__}> (hour {self.HourOfDay}, pct. {self.Percentage})'
... 
>>> 
>>> readings = [Reading(**kwargs) for kwargs in df.to_dict(orient='records')]
>>> 
>>> 
>>> readings
[<Reading> (hour 5, pct. 0.25), <Reading> (hour 10, pct. 0.4)]

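The reason into=Reading fails is that Reading is not a mapping type. From the docs for the into parameter of DataFrame.to_dict:

into: The collections.abc.Mapping subclass used for all Mappings in the return value. Can be the actual class or an empty instance of the mapping type you want. If you want a collections.defaultdict, you must pass it initialized.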
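As a rough illustration of the quoted rule, the sketch below passes a mapping class (OrderedDict) and then a plain class to `into`. The sample data is the same made-up frame used above, and the exact wording of the error message may vary between pandas versions.

```python
from collections import OrderedDict

import pandas as pd

df = pd.DataFrame({"HourOfDay": [5, 10], "Percentage": [0.25, 0.40]})

# A mapping class is accepted: each row comes back as an OrderedDict.
records = df.to_dict(orient="records", into=OrderedDict)


class Reading:  # plain class, not a mapping
    def __init__(self):
        self.HourOfDay = 0
        self.Percentage = 0.0


# A non-mapping class is rejected, which is the error from the question.
try:
    df.to_dict(into=Reading)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```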

Brad Solomon answered Oct 23 '22 04:10


Given a DataFrame with the two columns HourOfDay and Percentage, and a parameterized constructor on your class, you can build the list of objects like this:

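class Reading:

    def __init__(self, h, p):
        self.HourOfDay = h
        self.Percentage = p

# iterrows() yields each row as a Series, which upcasts the int column to float,
# so cast the hour back to int
listOfReading = [Reading(int(row.HourOfDay), row.Percentage) for index, row in df.iterrows()]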
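A possible variant of the same idea uses itertuples() instead of iterrows(). This is a sketch with the made-up two-row frame from the first answer, not something this answer's author proposed. It relies on itertuples() yielding one namedtuple per row, so each column keeps its own type.

```python
import pandas as pd


class Reading:
    def __init__(self, h, p):
        self.HourOfDay = h
        self.Percentage = p


df = pd.DataFrame({"HourOfDay": [5, 10], "Percentage": [0.25, 0.40]})

# itertuples() yields one namedtuple per row and keeps each column's own type,
# whereas iterrows() builds a float Series for this int/float frame.
readings = [
    Reading(row.HourOfDay, row.Percentage)
    for row in df.itertuples(index=False)
]
```

Attribute access on the namedtuple works here because both column names are valid Python identifiers.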

NargesooTv answered Oct 23 '22 03:10


It would probably be better to initialise the class with arguments, as follows:

class Reading:
    def __init__(self, h, p):
        self.HourOfDay = h
        self.Percentage = p

Then, to create a list of readings, you could use this function, which takes the DataFrame as an argument:

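import pandas as pd

def reading_list(df: pd.DataFrame) -> list:
    # df.values upcasts a mixed int/float frame to float, so cast the hour back to int
    return list(map(lambda x: Reading(h=int(x[0]), p=x[1]), df.values.tolist()))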

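Building the list from df.values.tolist() avoids creating a Series for every row, as iterrows() does, so execution stays fast even with a large dataset.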
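One way to check the speed claim on your own data is a small timeit comparison. The sketch below uses hypothetical random data (2,000 made-up rows), and the helper names via_iterrows and via_values are illustrative. Timings depend on the machine and the pandas version, so no figures are quoted here.

```python
import timeit

import numpy as np
import pandas as pd


class Reading:
    def __init__(self, h, p):
        self.HourOfDay = h
        self.Percentage = p


# Hypothetical test data: 2,000 rows of made-up readings.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "HourOfDay": rng.integers(0, 24, size=2_000),
    "Percentage": rng.random(2_000),
})


def via_iterrows(frame):
    # One Series is constructed per row.
    return [Reading(int(row.HourOfDay), row.Percentage) for _, row in frame.iterrows()]


def via_values(frame):
    # The whole frame is converted to nested Python lists in one step.
    return [Reading(int(h), p) for h, p in frame.values.tolist()]


# Time a few repetitions of each approach.
t_iterrows = timeit.timeit(lambda: via_iterrows(df), number=3)
t_values = timeit.timeit(lambda: via_values(df), number=3)
print(f"iterrows: {t_iterrows:.4f}s, values.tolist: {t_values:.4f}s")
```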

Victor Guillaud answered Oct 23 '22 02:10