Can I "store" instances of a class in a pandas Series/DataFrame or a NumPy ndarray, just like I do in a list? Or do these libraries only support built-in types (numerics, strings)?
For example, I have a Point class with x, y coordinates, and I want to store Points in a Plane that would return the Point with the given coordinates.
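    # my class
    class MyPoint:
        def __init__(self, x, y):
            # store in private attributes so the read-only properties
            # below do not recurse into themselves
            self._x = x
            self._y = y

        @property
        def x(self):
            return self._x

        @property
        def y(self):
            return self._y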
Here I create instances:
    first_point = MyPoint(1, 1)
    second_point = MyPoint(2, 2)
I can store instances in some list
    my_list = []
    my_list.append(first_point)
    my_list.append(second_point)
The problem with a list is that its indexes do not correspond to the x, y properties.
Dictionary/DataFrame approach:
    Plane = {
        "x": [first_point.x, second_point.x],
        "y": [first_point.y, second_point.y],
        "some_reference/id_to_point_instance": ???,  # what should go here?
    }
    Plane_pd = pd.DataFrame(Plane)
I've read posts saying that using the "id" of an instance as the third column value in a DataFrame could cause problems with the garbage collector.
A pandas.DataFrame will gladly store Python objects.
Some test code to demonstrate...
    import pandas as pd

    class MyPoint:
        def __init__(self, x, y):
            self._x = x
            self._y = y

        @property
        def x(self):
            return self._x

        @property
        def y(self):
            return self._y

    my_list = [MyPoint(1, 1), MyPoint(2, 2)]
    print(my_list)

    # one row per point: its coordinates plus the instance itself
    plane_pd = pd.DataFrame([[p.x, p.y, p] for p in my_list], columns=list('XYO'))
    print(plane_pd.dtypes)
    print(plane_pd)
    [<__main__.MyPoint object at 0x033D2AF0>, <__main__.MyPoint object at 0x033D2B10>]
    X     int64
    Y     int64
    O    object
    dtype: object
       X  Y                                        O
    0  1  1  <__main__.MyPoint object at 0x033D2AF0>
    1  2  2  <__main__.MyPoint object at 0x033D2B10>
Note that the two objects in the list are the same two objects in the DataFrame (the memory addresses match). Also note that the dtype of the O column is object.
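To get back the point at given coordinates, which is what the question asks of Plane, something along these lines might work. This is only a sketch: the names `plane_by_xy` and `arr` are made up for illustration, and it assumes each (x, y) pair occurs at most once. It shows a boolean-mask lookup, a lookup through an (X, Y) index, and a NumPy array with dtype=object holding the same instances.

```python
import numpy as np
import pandas as pd


class MyPoint:
    def __init__(self, x, y):
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y


my_list = [MyPoint(1, 1), MyPoint(2, 2)]
plane_pd = pd.DataFrame([[p.x, p.y, p] for p in my_list], columns=list('XYO'))

# Option 1: boolean mask on the coordinate columns, then take the first match.
mask = (plane_pd['X'] == 2) & (plane_pd['Y'] == 2)
found_by_mask = plane_pd.loc[mask, 'O'].iloc[0]

# Option 2 (assumes unique coordinates): index the frame by (X, Y)
# so the coordinates themselves become the lookup key.
plane_by_xy = plane_pd.set_index(['X', 'Y'])
found_by_index = plane_by_xy.loc[(2, 2), 'O']

# Both lookups hand back the original instance, not a copy.
print(found_by_mask is my_list[1])
print(found_by_index is my_list[1])

# A NumPy array with dtype=object stores references to the instances as well.
arr = np.array(my_list, dtype=object)
print(arr.dtype, arr[0] is my_list[0])
```

Since the column holds references to the objects themselves, there is no need to store id() values. Keep in mind that object columns do not get the vectorized speed of numeric columns, so it is usually best to do the numeric work on X and Y and use O only to fetch the instance.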