Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

python, storing dictionaries inside a dataframe

I've built a pandas dataframe which is storing a simple dictionary in each cell. For example:

{'Sales':0,'Revenue':0}

I can retrieve a specific value from the dataframe via:

df[columnA][index100]['Revenue']

But now I'd like to plot a graph of all the Revenue values from the dictionaries in columnA - what is the best way of achieving this?

Would life be easier in the long run if I dropped the dictionaries and instead used two identically sized dataframes? (Am very new to pandas so not sure of best practice).

like image 440
Sylvansight Avatar asked May 12 '13 18:05

Sylvansight


People also ask

Can you store a dictionary in a Pandas DataFrame?

We can convert a dictionary to a pandas dataframe by using the pd. DataFrame. from_dict() class-method.

Can you put a dictionary in a DataFrame?

A pandas DataFrame can be converted into a Python dictionary using the DataFrame instance method to_dict(). The output can be specified of various orientations using the parameter orient. In dictionary orientation, for each column of the DataFrame the column value is listed against the row label in a dictionary.

Can you append a dictionary to a DataFrame Python?

You can align, filter, and group columns, pinpoint specific values, perform statistical analysis by calling a single function, and even add more data to a dataframe by appending dictionaries!

Can you store a dictionary in a file Python?

The most basic way to save dictionaries in Python would be to store them as strings in text files. This method would include the following steps: Opening a file in write/append text mode. Converting the dictionary into a string.


2 Answers

A simple way to get all the Revenue values from a column A is df[columnA].map(lambda v: v['Revenue']).

Depending on what you're doing, life may indeed be easier if you tweak your structure a bit. For instance, you could use a hierarchical index with "Sales" and "Revenue" as the keys in one level.

like image 120
BrenBarn Avatar answered Oct 20 '22 05:10

BrenBarn


For the majority of use cases you it's not a good idea to be storing dictionaries in DataFrame.
Another datastructure worth mentioning is a Panel.

Suppose you have something a DataFrame of dictionaries (with fairly consistent keys):

In [11]: df = pd.DataFrame([[{'a': 1, 'b': 2}, {'a': 3, 'b': 4}], [{'a': 5, 'b': 6}, {'a': 7, 'b': 8}]], columns=list('AB'))

In [12]: df
Out[12]:
                  A                 B
0  {'a': 1, 'b': 2}  {'a': 3, 'b': 4}
1  {'a': 5, 'b': 6}  {'a': 7, 'b': 8}

You can create a Panel (note there are more direct/preferable ways to construct this!):

In [13]: wp = pd.Panel({'A': df['A'].apply(pd.Series), 'B': df['B'].apply(pd.Series)})

In [14]: wp
Out[14]:
<class 'pandas.core.panel.Panel'>
Dimensions: 2 (items) x 2 (major_axis) x 2 (minor_axis)
Items axis: A to B
Major_axis axis: 0 to 1
Minor_axis axis: a to b

Sections of which can be accessed efficiently as DataFrames in a variety of ways, for example:

In [15]: wp.A
Out[15]:
   a  b
0  1  2
1  5  6

In [16]: wp.minor_xs('a')
Out[16]:
   A  B
0  1  3
1  5  7

In [17]: wp.major_xs(0)
Out[17]:
   A  B
a  1  3
b  2  4

So you can do all the pandas DataFrame whizziness:

In [18]: wp.A.plot()  # easy!
Out[18]: <matplotlib.axes.AxesSubplot at 0x1048342d0>

There are also ("experimental") higher dimensional Panels.

like image 23
Andy Hayden Avatar answered Oct 20 '22 04:10

Andy Hayden