Say I have two dataframes, df and df2, that share the same index. df is sorted in the order that I want df2 to be sorted.
    import pandas as pd

    df = pd.DataFrame(index=['Arizona', 'New Mexico', 'Colorado'],
                      columns=['A', 'B', 'C'],
                      data=[[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    print(df)
                A  B  C
    Arizona     1  2  3
    New Mexico  4  5  6
    Colorado    7  8  9

    df2 = pd.DataFrame(index=['Arizona', 'Colorado', 'New Mexico'],
                       columns=['D'],
                       data=['Orange', 'Blue', 'Green'])
    print(df2)
                     D
    Arizona     Orange
    Colorado      Blue
    New Mexico   Green
What is the best / most efficient way of sorting the second dataframe by the index of the first?
One option is just joining them, sorting, and then dropping the columns:
    df.join(df2)[['D']]
                     D
    Arizona     Orange
    New Mexico   Green
    Colorado      Blue
Is there a more elegant way of doing this?
Thanks!
To sort a pandas DataFrame by its index, you can use the DataFrame.sort_index() method. To choose between ascending and descending order, set the boolean keyword argument ascending to True or False respectively. When the index is sorted, the corresponding rows are rearranged with it.
Pandas Series: sort_index() function
The sort_index() function sorts a Series by its index labels. It returns a new Series sorted by label if the inplace argument is False; otherwise it updates the original Series and returns None.
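As a small illustration of the two descriptions above, here is a sketch that reuses df2 from the question. The Series s and its values are made up for the example. Note that sort_index() orders rows by the labels themselves, so on its own it does not reproduce the row order of another dataframe.

```python
import pandas as pd

df2 = pd.DataFrame(index=['Arizona', 'Colorado', 'New Mexico'],
                   columns=['D'],
                   data=['Orange', 'Blue', 'Green'])

# DataFrame.sort_index returns a new frame; ascending=False reverses the label order.
desc = df2.sort_index(ascending=False)

# Hypothetical Series, only here to show the inplace behaviour.
s = pd.Series([3, 1, 2], index=['c', 'a', 'b'])

# With inplace=True the Series is modified and the call returns None.
result = s.sort_index(inplace=True)
```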
reindex would work. Be aware that it will create missing values (NaN) for index values that are in df but not in df2.
    In [18]: df2.reindex(df.index)
    Out[18]:
                     D
    Arizona     Orange
    New Mexico   Green
    Colorado      Blue
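To make the missing-value caveat concrete, here is a minimal sketch. The extra 'Utah' row is an assumption added for illustration and is not part of the original question. It also shows one possible way to keep only the labels that both frames share while preserving the order of df.

```python
import pandas as pd

# 'Utah' is a hypothetical label that exists in df but not in df2.
df = pd.DataFrame(index=['Arizona', 'New Mexico', 'Colorado', 'Utah'],
                  columns=['A'],
                  data=[1, 4, 7, 10])
df2 = pd.DataFrame(index=['Arizona', 'Colorado', 'New Mexico'],
                   columns=['D'],
                   data=['Orange', 'Blue', 'Green'])

# reindex follows df's order and fills the unmatched 'Utah' row with NaN.
reordered = df2.reindex(df.index)

# Keep only labels present in both frames, still in df's order.
common = df.index[df.index.isin(df2.index)]
strict = df2.reindex(common)
```

If the NaN rows are unwanted, filtering the target index first, as in the last two lines, avoids creating them in the first place.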