Convert pandas group by object to multi-indexed Dataframe

If I have the following DataFrame

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'Name': ['Bob'] * 3 + ['Alice'] * 3,
...                    'Destination': ['Athens', 'Rome'] * 3,
...                    'Length': np.random.randint(1, 6, 6)})
>>> df
  Destination  Length   Name
0      Athens       3    Bob
1        Rome       5    Bob
2      Athens       2    Bob
3        Rome       1  Alice
4      Athens       3  Alice
5        Rome       5  Alice

I can group by name and destination...

>>> grouped = df.groupby(['Name', 'Destination'])
>>> for nm, gp in grouped:
...     print(nm)
...     print(gp)
('Alice', 'Athens')
  Destination  Length   Name
4      Athens       3  Alice
('Alice', 'Rome')
  Destination  Length   Name
3        Rome       1  Alice
5        Rome       5  Alice
('Bob', 'Athens')
  Destination  Length Name
0      Athens       3  Bob
2      Athens       2  Bob
('Bob', 'Rome')
  Destination  Length Name
1        Rome       5  Bob

but I would like a new multi-indexed dataframe out of it that looks something like

                Length
Alice   Athens       3
        Rome         1
        Rome         5
Bob     Athens       3
        Athens       2
        Rome         5

It seems there should be a way to do something like DataFrame(grouped) to get my multi-indexed DataFrame, but instead I get a PandasError ("DataFrame constructor not properly called!").

What is the easiest way to get this? Also, does anyone know whether there will ever be an option to pass a groupby object to the constructor, or am I just doing it wrong?

Thanks

asked Jan 13 '13 08:01 by beardc


1 Answer

Since you're not aggregating similarly indexed rows, try setting the index with a list of column names.

In [2]: df.set_index(['Name', 'Destination'])
Out[2]: 
                   Length
Name  Destination        
Bob   Athens            3
      Rome              5
      Athens            2
Alice Rome              1
      Athens            3
      Rome              5
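
If you also want the rows laid out by group, as in the desired output in the question, one possible follow-up is to sort the new index. This is only a minimal sketch: it assumes fixed `Length` values in place of the random ones so the result is reproducible, and the names `indexed` and `ordered` are just for illustration.

```python
import pandas as pd

# Fixed values stand in for np.random.randint so the example is reproducible.
df = pd.DataFrame({
    'Name': ['Bob'] * 3 + ['Alice'] * 3,
    'Destination': ['Athens', 'Rome'] * 3,
    'Length': [3, 5, 2, 1, 3, 5],
})

# Move both columns into the index; no aggregation happens, so all six rows stay.
indexed = df.set_index(['Name', 'Destination'])

# Sort the MultiIndex so rows sharing a (Name, Destination) pair sit together.
ordered = indexed.sort_index()
print(ordered)

# Selecting on the outer level returns the rows for one name.
print(ordered.loc['Alice'])
```

Sorting is optional. `set_index` on its own already gives the multi-indexed DataFrame, and `sort_index` only changes the row order.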
answered Oct 13 '22 05:10 by Garrett