
Python DataFrame or list for storing objects

Can I "store" instances of a class in a pandas/NumPy Series, DataFrame, or ndarray just as I do in a list? Or do these libraries support only built-in types (numerics, strings)?

For example, I have a Point class with x, y coordinates, and I want to store Points in a Plane that would return the Point with the given coordinates.

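    # my class
    class MyPoint:
        def __init__(self, x, y):
            # Store the values under private names; assigning to self.x
            # would collide with the read-only properties below.
            self._x = x
            self._y = y

        @property
        def x(self):
            return self._x

        @property
        def y(self):
            return self._y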

Here I create instances:

    first_point = MyPoint(1, 1)
    second_point = MyPoint(2, 2)

I can store the instances in a list:

    my_list = []
    my_list.append(first_point)
    my_list.append(second_point)

The problem with a list is that its indexes do not correspond to the x, y properties.

Dictionary/DataFrame approach:

    import pandas as pd

    Plane = {
        "x": [first_point.x, second_point.x],
        "y": [first_point.y, second_point.y],
        # "some_reference/id_to_point_instance": ???
    }
    Plane_pd = pd.DataFrame(Plane)

I've read posts saying that using the "id" of an instance as the third column value in a DataFrame could cause problems with the garbage collector.

Demaunt asked May 27 '17 16:05


1 Answer

A pandas.DataFrame will gladly store python objects.

Some test code to demonstrate...

Test Code:

    import pandas as pd

    class MyPoint:
        def __init__(self, x, y):
            self._x = x
            self._y = y

        @property
        def x(self):
            return self._x

        @property
        def y(self):
            return self._y

    my_list = [MyPoint(1, 1), MyPoint(2, 2)]
    print(my_list)

    # One row per point: its coordinates plus the instance itself.
    plane_pd = pd.DataFrame([[p.x, p.y, p] for p in my_list],
                            columns=list('XYO'))
    print(plane_pd.dtypes)
    print(plane_pd)

Results:

    [<__main__.MyPoint object at 0x033D2AF0>, <__main__.MyPoint object at 0x033D2B10>]
    X     int64
    Y     int64
    O    object
    dtype: object
       X  Y                                        O
    0  1  1  <__main__.MyPoint object at 0x033D2AF0>
    1  2  2  <__main__.MyPoint object at 0x033D2B10>

Notes:

Note that the two objects in the list are the same two objects in the DataFrame. Also note that the dtype of the O column is object.
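To make the "same objects" point concrete, here is a minimal sketch that checks identity with `is` and then fetches a point by its coordinates. Indexing the frame by `X` and `Y` with `set_index` is only one possible way to get the coordinate lookup the question asks for, and the names `same_object`, `by_coords`, `found`, and `points_arr` are illustrative, not part of the test code above. The last line assumes that a NumPy array created with `dtype=object` is an acceptable answer to the ndarray part of the question.

```python
import numpy as np
import pandas as pd


class MyPoint:
    def __init__(self, x, y):
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y


my_list = [MyPoint(1, 1), MyPoint(2, 2)]
plane_pd = pd.DataFrame([[p.x, p.y, p] for p in my_list], columns=list('XYO'))

# The cell holds a reference to the original instance, not a copy.
same_object = plane_pd.loc[0, 'O'] is my_list[0]

# Index the object column by the coordinates (illustrative choice),
# so that a point can be fetched by its (x, y) pair.
by_coords = plane_pd.set_index(['X', 'Y'])['O']
found = by_coords.loc[(2, 2)]

# A NumPy array with dtype=object stores references in the same way.
points_arr = np.array(my_list, dtype=object)

print(same_object, found is my_list[1], points_arr[0] is my_list[0])
```

Because the column only stores references, the instances stay alive for as long as the DataFrame does, so there is no need to keep a separate `id()` column.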

Stephen Rauch answered Sep 30 '22 05:09