I want to convert a Pandas DataFrame into a list of objects.
This is my class:
class Reading:
    def __init__(self):
        self.HourOfDay: int = 0
        self.Percentage: float = 0
I read up on .to_dict, so I tried
df.to_dict(into=Reading)
but it returned
TypeError: unsupported type
I don't want a list of tuples, or a list of dicts, but a list of Readings. Every question I've found so far seems to be about these two scenarios. But I want my own typed objects.
Thanks
# Converting a DataFrame into a list: List = dataFrame.values.tolist()
In pandas, tolist() converts the underlying values into plain Python lists. Called on dataFrame.values, it returns a list of rows, where each row is itself a list of that row's values (not Reading objects).
To convert a single column, select it first: df['Courses'] returns the column as a Series, and df['Courses'].values.tolist() (or simply df['Courses'].tolist()) returns that column's values as a list.
A pandas Series can be converted to a list with tolist() or by type casting with list(). This is useful when you want to operate on a plain list instead of a pandas object: store the column values in a list and perform the required operations on it.
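As a minimal sketch of the tolist() calls described above: the sample data here (a Courses column and a Fee column) is made up for illustration and is not from the question.

```python
import pandas as pd

# Hypothetical sample data; the column names and values are illustrative only.
df = pd.DataFrame({
    "Courses": ["Spark", "Pandas"],
    "Fee": [20000, 25000],
})

# Whole frame: one inner list per row.
rows = df.values.tolist()

# Single column: a flat list of that column's values.
courses = df["Courses"].tolist()

# Type casting the Series with list() yields the same values.
courses_cast = list(df["Courses"])
```

Note that all three results are plain lists of values, so on their own they do not produce typed objects such as Reading.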
Option 1: make Reading inherit from collections.abc.MutableMapping (the old collections.MutableMapping alias was removed in Python 3.10) and implement the necessary methods of that base class. Seems like a lot of work.
Option 2: Call Reading() in a list comprehension:
>>> import pandas as pd
>>>
>>> df = pd.DataFrame({
...     'HourOfDay': [5, 10],
...     'Percentage': [0.25, 0.40]
... })
>>>
>>> class Reading(object):
...     def __init__(self, HourOfDay: int = 0, Percentage: float = 0):
...         self.HourOfDay = int(HourOfDay)
...         self.Percentage = Percentage
...     def __repr__(self):
...         return f'{self.__class__.__name__}> (hour {self.HourOfDay}, pct. {self.Percentage})'
...
>>>
>>> readings = [Reading(**kwargs) for kwargs in df.to_dict(orient='records')]
>>>
>>>
>>> readings
[Reading> (hour 5, pct. 0.25), Reading> (hour 10, pct. 0.4)]
From the docs:
into: The collections.Mapping subclass used for all Mappings in the return value. Can be the actual class or an empty instance of the mapping type you want. If you want a collections.defaultdict, you must pass it initialized.
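To illustrate what the docs describe, here is a small sketch of how into behaves with two standard-library mapping types. The DataFrame reuses the question's column names with made-up values. The parameter only changes which mapping type holds the result, which is why passing a plain class such as Reading is rejected.

```python
from collections import OrderedDict, defaultdict

import pandas as pd

# Sample data using the question's column names; the values are illustrative.
df = pd.DataFrame({"HourOfDay": [5, 10], "Percentage": [0.25, 0.40]})

# A Mapping subclass passed as a class: the outer and inner mappings use it.
ordered = df.to_dict(into=OrderedDict)

# A defaultdict must be passed as an initialized instance, not as the class.
records = df.to_dict(orient="records", into=defaultdict(list))
```

Either way the result is still made of mappings keyed by column name, so building Reading instances needs a separate step.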
Given a DataFrame with two columns, HourOfDay and Percentage, and a class whose constructor takes parameters, you can build a list of objects like this:
class Reading:
    def __init__(self, h, p):
        self.HourOfDay = h
        self.Percentage = p

listOfReading = [Reading(row.HourOfDay, row.Percentage) for index, row in df.iterrows()]
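One caveat with iterrows() is that it builds a Series for each row, so a row mixing an integer column and a float column is upcast to float and HourOfDay arrives as 5.0 rather than 5. A possible alternative, sketched here under the assumption that the columns are named as in the question, is itertuples(), which keeps each column's own type. The names via_iterrows and via_itertuples are only illustrative.

```python
import pandas as pd


class Reading:
    def __init__(self, h, p):
        self.HourOfDay = h
        self.Percentage = p


# Sample data using the question's column names; the values are illustrative.
df = pd.DataFrame({"HourOfDay": [5, 10], "Percentage": [0.25, 0.40]})

# iterrows(): each row is a Series, so the int column is upcast to float.
via_iterrows = [Reading(row.HourOfDay, row.Percentage) for _, row in df.iterrows()]

# itertuples(): each row is a namedtuple built column by column,
# so HourOfDay stays an integer.
via_itertuples = [
    Reading(row.HourOfDay, row.Percentage) for row in df.itertuples(index=False)
]
```

itertuples() is also generally faster than iterrows(), since it avoids constructing a Series per row.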
It would probably be better to initialise the class with arguments, as follows:
class Reading:
    def __init__(self, h, p):
        self.HourOfDay = h
        self.Percentage = p
Then, to create a list of readings, you could use this function, which takes the DataFrame as an argument:
import pandas as pd

def reading_list(df: pd.DataFrame) -> list:
    # x[0] is HourOfDay and x[1] is Percentage; this relies on the column order.
    return list(map(lambda x: Reading(h=x[0], p=x[1]), df.values.tolist()))
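Because df.values.tolist() converts the whole frame in one step instead of building a Series for every row, execution is fast, even with a large dataset.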
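Indexing rows by position (x[0], x[1]) silently breaks if the columns are reordered. As a hedged variant, the sketch below selects the columns by name and zips them together; reading_list_by_name is a hypothetical helper name, and the column names are assumed to match the question.

```python
import pandas as pd


class Reading:
    def __init__(self, h, p):
        self.HourOfDay = h
        self.Percentage = p


def reading_list_by_name(df: pd.DataFrame) -> list:
    # Select columns by name so the result does not depend on column order.
    return [Reading(h=h, p=p) for h, p in zip(df["HourOfDay"], df["Percentage"])]


# Columns deliberately given in the "wrong" order to show it does not matter.
df = pd.DataFrame({"Percentage": [0.25, 0.40], "HourOfDay": [5, 10]})
readings = reading_list_by_name(df)
```

Zipping the columns also leaves each column's values in their own type, so HourOfDay is not converted to float along the way.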