
Sorting Pandas Dataframe by order of another index

Tags: python, pandas

Say I have two dataframes, df and df2, that share the same index. df is sorted in the order that I want df2 to be sorted.

    import pandas as pd

    df = pd.DataFrame(index=['Arizona', 'New Mexico', 'Colorado'],
                      columns=['A', 'B', 'C'],
                      data=[[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    print(df)
                A  B  C
    Arizona     1  2  3
    New Mexico  4  5  6
    Colorado    7  8  9

    df2 = pd.DataFrame(index=['Arizona', 'Colorado', 'New Mexico'],
                       columns=['D'],
                       data=['Orange', 'Blue', 'Green'])
    print(df2)
                     D
    Arizona     Orange
    Colorado      Blue
    New Mexico   Green

What is the best / most efficient way of sorting the second dataframe by the index of the first?

One option is to join them (which keeps the row order of df) and then keep only the column from df2:

    df.join(df2)[['D']]
                     D
    Arizona     Orange
    New Mexico   Green
    Colorado      Blue
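As a rough sketch of why the join approach gives the desired order: DataFrame.join defaults to a left join, which keeps the calling frame's rows in their existing order. The snippet below spells out how="left" explicitly and reuses the question's data; the variable name joined is just for illustration.

```python
import pandas as pd

df = pd.DataFrame(index=['Arizona', 'New Mexico', 'Colorado'],
                  columns=['A', 'B', 'C'],
                  data=[[1, 2, 3], [4, 5, 6], [7, 8, 9]])
df2 = pd.DataFrame(index=['Arizona', 'Colorado', 'New Mexico'],
                   columns=['D'],
                   data=['Orange', 'Blue', 'Green'])

# A left join keeps the rows of the calling frame (df) in their current order,
# so selecting column D afterwards yields df2's data in df's order.
joined = df.join(df2, how='left')[['D']]
print(joined)
```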

Is there a more elegant way of doing this?

Thanks!

asked Aug 05 '15 22:08 by AJG519




1 Answer

reindex would work. Be aware that it will create missing values (NaN) for index labels that are in df but not in df2.

    In [18]: df2.reindex(df.index)
    Out[18]:
                     D
    Arizona     Orange
    New Mexico   Green
    Colorado      Blue
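To illustrate the missing-value caveat, here is a minimal sketch that assumes df has one extra label, 'Utah', which is not in the original question and is added purely for demonstration. It also shows one possible way to keep only the labels present in both frames, using Index.isin so that df's order is preserved; the names reordered and strict are hypothetical.

```python
import pandas as pd

# 'Utah' is a hypothetical extra label that exists in df but not in df2.
df = pd.DataFrame(index=['Arizona', 'New Mexico', 'Colorado', 'Utah'],
                  columns=['A'],
                  data=[[1], [4], [7], [10]])
df2 = pd.DataFrame(index=['Arizona', 'Colorado', 'New Mexico'],
                   columns=['D'],
                   data=['Orange', 'Blue', 'Green'])

# reindex follows df's order and fills the unmatched label with NaN.
reordered = df2.reindex(df.index)
print(reordered)

# To avoid NaN rows, restrict to labels that also exist in df2.
# Boolean masking with isin keeps the order of df.index.
common = df.index[df.index.isin(df2.index)]
strict = df2.loc[common]
print(strict)
```

If every label in df is guaranteed to exist in df2, the two results are identical and plain reindex is enough.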
answered Sep 19 '22 11:09 by chrisb