
Create MultiIndex pandas DataFrame from dictionary with tuple keys

I'd like to efficiently create a pandas DataFrame from a Python collections.Counter dictionary, but there's an additional requirement.

The Counter dictionary looks like this:

(a, b) : 5
(c, d) : 7
(a, d) : 2

The dictionary keys are tuples: the first element should become the row label and the second element the column label of the DataFrame.

The resulting DataFrame should look like this:

   b  d
a  5  2
c  0  7

For larger data I don't want to build the DataFrame incrementally with assignments like df[a][b] = 5, as that is incredibly inefficient: I'm led to believe every such extension creates a copy of the DataFrame.

Perhaps the right answer is to go via a numpy array?

Intuitive Text Mining asked Jan 18 '19 17:01




1 Answer

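Use a Series with unstack. The tuple keys become a two-level MultiIndex, and unstack pivots the second level into columns, with fill_value=0 filling the missing combinations.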

pd.Series(d).unstack(fill_value=0)
Out[708]: 
   b  d
a  5  2
c  0  7

Input data

d = {('a', 'b'): 5,
     ('c', 'd'): 7,
     ('a', 'd'): 2}
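
For completeness, here is a minimal, self-contained sketch of the same idea starting from a collections.Counter, as in the question. The names counts and table are illustrative only, and converting the Counter to a plain dict before passing it to pd.Series is a precaution rather than something the answer above requires.

```python
from collections import Counter

import pandas as pd

# Counter keyed by (row, column) tuples, mirroring the question's data.
counts = Counter({('a', 'b'): 5, ('c', 'd'): 7, ('a', 'd'): 2})

# Tuple keys become a two-level MultiIndex on the Series.
series = pd.Series(dict(counts))

# Pivot the second index level into columns; absent pairs become 0.
table = series.unstack(fill_value=0)

print(table)
```

Because the table is built in a single call, no DataFrame is grown cell by cell.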
BENY answered Oct 05 '22 17:10
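
The question also wonders about going via a numpy array. The following is one possible sketch of that route, not taken from the answer: it preallocates a zero-filled array, fills it from the dictionary in one pass, and wraps it in a DataFrame at the end. The helper names row_pos, col_pos and arr are hypothetical, and sorting the labels is an assumption made so that the result matches the layout shown in the question.

```python
import numpy as np
import pandas as pd

d = {('a', 'b'): 5, ('c', 'd'): 7, ('a', 'd'): 2}

# Collect the distinct row and column labels (sorted for a stable layout).
row_labels = sorted({r for r, _ in d})
col_labels = sorted({c for _, c in d})

# Map each label to its position in the array.
row_pos = {label: i for i, label in enumerate(row_labels)}
col_pos = {label: j for j, label in enumerate(col_labels)}

# Preallocate once, then fill in place; missing pairs stay 0.
arr = np.zeros((len(row_labels), len(col_labels)), dtype=np.int64)
for (r, c), count in d.items():
    arr[row_pos[r], col_pos[c]] = count

table = pd.DataFrame(arr, index=row_labels, columns=col_labels)
print(table)
```

This also avoids growing a DataFrame incrementally, at the cost of more code than the unstack one-liner.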