I know that there are ways to swap the column order in Python pandas. Let's say I have this example dataset:
import pandas as pd
employee = {'EmployeeID' : [0,1,2],
'FirstName' : ['a','b','c'],
'LastName' : ['a','b','c'],
'MiddleName' : ['a','b', None],
'Contact' : ['(M) 133-245-3123', '(F)[email protected]', '(F)312-533-2442 [email protected]']}
df = pd.DataFrame(employee)
One basic way to do this would be:
neworder = ['EmployeeID','FirstName','MiddleName','LastName','Contact']
df=df.reindex(columns=neworder)
However, as you can see, I only want to swap two columns. Writing out the full order was doable here only because there are just a few columns, but what if I had something like 100 columns? What would be an effective way to swap or reorder columns?
There might be 2 cases:
Reorder columns using pandas: another way to reorder columns is the reindex() method. Pass the desired column order through its columns= parameter.
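As a minimal sketch of the reindex() approach described above (the frame and column names here are made up for illustration), note one difference from plain label selection: reindex() does not complain about a name that is missing from the frame, it adds it as an all-NaN column instead.

```python
import pandas as pd

# Illustrative frame; the column names are not from the question.
df = pd.DataFrame({"b": [1, 2], "c": [3, 4], "a": [5, 6]})

# reindex returns a new DataFrame with the columns in the requested order.
reordered = df.reindex(columns=["a", "b", "c"])

# An unknown name is not an error: reindex adds it as an all-NaN column.
with_missing = df.reindex(columns=["a", "b", "c", "z"])

# Plain label selection is stricter and raises KeyError for unknown names.
try:
    df[["a", "b", "c", "z"]]
    raised = False
except KeyError:
    raised = True
```

If a typo in a column name should fail loudly rather than silently create an empty column, selecting with df[cols] may be the safer choice.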
You can use df.sample(frac=1, axis=1).sample(frac=1).reset_index(drop=True) to shuffle rows and columns randomly.
Use the T attribute or the transpose() method to swap (= transpose) the rows and columns of a pandas.DataFrame. Neither changes the original object; each returns a new object with the rows and columns swapped.
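A small sketch of the transpose behaviour just described, using a made-up frame. Keep in mind that this exchanges the two axes; it does not change the order of columns relative to each other.

```python
import pandas as pd

# Illustrative frame with 3 rows and 2 columns.
df = pd.DataFrame({"x": [1, 2, 3], "y": [4, 5, 6]})

# .T and .transpose() both return a new, transposed object.
transposed = df.T
also_transposed = df.transpose()

# The original frame keeps its shape; the old column labels become the new index.
original_shape = df.shape
new_index = list(transposed.index)
```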
Say your current column order is [b, c, d, a] and you want to reorder it to [a, b, c, d]. You could do it this way:
new_df = old_df[['a', 'b', 'c', 'd']]
Two column Swapping
cols = list(df.columns)
a, b = cols.index('LastName'), cols.index('MiddleName')
cols[b], cols[a] = cols[a], cols[b]
df = df[cols]
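For a frame with many columns, the two-column swap above could be wrapped in a small helper. This is only a sketch: swap_columns is a hypothetical name, not a pandas function, and it assumes the column names are unique.

```python
import pandas as pd

def swap_columns(df, first, second):
    """Return a new DataFrame with the positions of two named columns exchanged.

    Hypothetical helper; assumes column names are unique.
    """
    cols = list(df.columns)
    i, j = cols.index(first), cols.index(second)
    cols[i], cols[j] = cols[j], cols[i]
    return df[cols]

# Illustrative 100-column frame with names c0 ... c99.
wide = pd.DataFrame([list(range(100))], columns=[f"c{n}" for n in range(100)])
swapped = swap_columns(wide, "c3", "c97")
```

Only the two named columns need to be spelled out, so the call stays the same size no matter how wide the frame is.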
Reorder column Swapping (2 swaps)
cols = list(df.columns)
a, b, c, d = cols.index('LastName'), cols.index('MiddleName'), cols.index('Contact'), cols.index('EmployeeID')
cols[a], cols[b], cols[c], cols[d] = cols[b], cols[a], cols[d], cols[c]
df = df[cols]
Swapping Multiple
Now it comes down to how you can play with list slices -
cols = list(df.columns)
cols = cols[1::2] + cols[::2]
df = df[cols]
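To make the slicing idea concrete, here is a sketch on a made-up five-column frame. It shows what the odd/even slice above produces, plus one more slice pattern (moving a trailing block forward) that is added here purely as an illustration.

```python
import pandas as pd

# Illustrative frame; column names are placeholders.
df = pd.DataFrame([[0, 1, 2, 3, 4]], columns=["c0", "c1", "c2", "c3", "c4"])
cols = list(df.columns)

# Columns at odd positions first, then the ones at even positions.
interleaved = cols[1::2] + cols[::2]

# Move the last two columns so they sit right after the first one.
moved = cols[:1] + cols[-2:] + cols[1:-2]

df_interleaved = df[interleaved]
df_moved = df[moved]
```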
When faced with the same problem at a larger scale, I came across a very elegant solution at this link: http://www.datasciencemadesimple.com/re-arrange-or-re-order-the-column-of-dataframe-in-pandas-python-2/ under the heading "Rearrange the column of dataframe by column position in pandas python".
Basically if you have the column order as a list, you can read that in as the new column order.
##### Rearrange the column of dataframe by column position in pandas python
df2=df1[df1.columns[[3,2,1,0]]]
print(df2)
In my case, I had a pre-calculated column linkage that determined the new order I wanted. If this order was defined as an array in L, then:
a_L_order = a[a.columns[L]]
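A runnable sketch of this positional reordering, with a made-up frame and a made-up order array standing in for the pre-calculated L. It also shows iloc as an alternative that selects by position directly, which may matter if column names are not unique.

```python
import numpy as np
import pandas as pd

# Illustrative frame and order; the real L would come from your own calculation.
a = pd.DataFrame([[10, 20, 30, 40]], columns=["w", "x", "y", "z"])
L = np.array([2, 0, 3, 1])

# Look up the labels at the positions in L, then select by label.
a_L_order = a[a.columns[L]]

# Equivalent here: select the columns purely by position.
same_via_iloc = a.iloc[:, L]
```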
If you want to have a fixed list of columns at the beginning, you could do something like:
cols = ['EmployeeID','FirstName','MiddleName','LastName']
df = df[cols + [c for c in df.columns if c not in cols]]
This will put these 4 columns first and leave the rest in their original order, without duplicating any column.
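The same idea can be packaged as a function. This is a sketch only: move_to_front is a hypothetical name, and the frame below is a shortened stand-in for the employee data.

```python
import pandas as pd

def move_to_front(df, first_cols):
    """Return a new DataFrame with first_cols leading and the rest in original order.

    Hypothetical helper; every name in first_cols must exist in df.
    """
    rest = [c for c in df.columns if c not in first_cols]
    return df[list(first_cols) + rest]

# Illustrative frame with the columns deliberately out of order.
df = pd.DataFrame({
    "Contact": ["x"],
    "LastName": ["l"],
    "EmployeeID": [0],
    "FirstName": ["f"],
})
result = move_to_front(df, ["EmployeeID", "FirstName"])
```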