How can I prevent stack from sorting indices?

Tags: python, pandas, dataframe

I have a test dataframe:

import pandas as pd

df1 = pd.DataFrame({
    "Group1": ["X", "Y", "Y", "X", "Y", "Z", "X", "Y"],
    "Group2": ["A", "C", "A", "B", "C", "C", "B", "A"],
    "Number1": [1, 3, 5, 1, 5, 2, 5, 3],
    "Number2": [6, 2, 6, 2, 7, 2, 6, 8],
})
# Mean of Number1 and Number2 per (Group1, Group2), plus "All" margins
df2 = df1.pivot_table(index="Group1", columns="Group2", margins=True)
print(df2)

Output:


           Number1                       Number2                         
Group2       A    B         C       All         A    B         C       All
Group1                                                                    
X          1.0  3.0       NaN  2.333333  6.000000  4.0       NaN  4.666667
Y          4.0  NaN  4.000000  4.000000  7.000000  NaN  4.500000  5.750000
Z          NaN  NaN  2.000000  2.000000       NaN  NaN  2.000000  2.000000
All        3.0  3.0  3.333333  3.125000  6.666667  4.0  3.666667  4.875000

When I call stack on this dataframe, I get this result:

df3 = df2.stack()
print(df3)

Output:

                Number1   Number2
Group1 Group2                    
X      A       1.000000  6.000000
       All     2.333333  4.666667
       B       3.000000  4.000000
Y      A       4.000000  7.000000
       All     4.000000  5.750000
       C       4.000000  4.500000
Z      All     2.000000  2.000000
       C       2.000000  2.000000
All    A       3.000000  6.666667
       All     3.125000  4.875000
       B       3.000000  4.000000
       C       3.333333  3.666667

How can I prevent stack from sorting the indices so that the order of Group2 remains as A, B, C, All?


asked Jul 19 '20 20:07

RayCurse

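1 Answer

IIUC, you can restore the original order with pd.Index.get_level_values and DataFrame.reindex: take the Group2 labels from the columns of df2 in their order of first appearance, then reindex the stacked result on that level.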

df2.stack().reindex(df2.columns.get_level_values(1).unique(), level='Group2')

                Number1   Number2
Group1 Group2                    
X      A       1.000000  6.000000
       B       3.000000  4.000000
       All     2.333333  4.666667
Y      A       4.000000  7.000000
       C       4.000000  4.500000
       All     4.000000  5.750000
Z      C       2.000000  2.000000
       All     2.000000  2.000000
All    A       3.000000  6.666667
       B       3.000000  4.000000
       C       3.333333  3.666667
       All     3.125000  4.875000
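
The reindex call above depends on the labels it is given being in the right order. As a minimal sketch using the question's data (the variable name group2_order is only illustrative), this shows the intermediate value being passed to reindex:

```python
import pandas as pd

# Same sample data as in the question.
df1 = pd.DataFrame({
    "Group1": ["X", "Y", "Y", "X", "Y", "Z", "X", "Y"],
    "Group2": ["A", "C", "A", "B", "C", "C", "B", "A"],
    "Number1": [1, 3, 5, 1, 5, 2, 5, 3],
    "Number2": [6, 2, 6, 2, 7, 2, 6, 8],
})
df2 = df1.pivot_table(index="Group1", columns="Group2", margins=True)

# Level 1 of the column MultiIndex holds the Group2 labels, repeated once
# per value column; unique() keeps them in order of first appearance.
group2_order = df2.columns.get_level_values(1).unique()
print(list(group2_order))  # ['A', 'B', 'C', 'All']
```

Because unique() does not sort, the labels come back in the column order of the pivot table, which is the order the question asks for.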

Either level='Group2' or level=1 works here, because Group2 is the second index level of the stacked result.
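
Putting the pieces together, the following self-contained sketch checks that both ways of naming the level give the same result. The names group2_order, by_name and by_position are illustrative, and the dropna(how="all") call is an added safeguard that is not part of the answer: it assumes that some pandas versions keep all-NaN rows when stacking, and it changes nothing when stack has already dropped them.

```python
import pandas as pd

# Same sample data as in the question.
df1 = pd.DataFrame({
    "Group1": ["X", "Y", "Y", "X", "Y", "Z", "X", "Y"],
    "Group2": ["A", "C", "A", "B", "C", "C", "B", "A"],
    "Number1": [1, 3, 5, 1, 5, 2, 5, 3],
    "Number2": [6, 2, 6, 2, 7, 2, 6, 8],
})
df2 = df1.pivot_table(index="Group1", columns="Group2", margins=True)

# Desired Group2 order, taken from the columns before stacking.
group2_order = df2.columns.get_level_values(1).unique()

# Added safeguard (assumption, see lead-in): drop rows that are entirely NaN.
stacked = df2.stack().dropna(how="all")

# Reorder only the Group2 level, addressed by name and by position.
by_name = stacked.reindex(group2_order, level="Group2")
by_position = stacked.reindex(group2_order, level=1)

assert by_name.equals(by_position)
print(by_name)
```

Using the level name tends to be easier to read, while the position still works if the level has no name.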

answered Oct 12 '22 03:10

ansev
