Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Set order of columns in pandas dataframe

Tags:

python

pandas

Just select the order yourself by typing in the column names. Note the double brackets:

frame = frame[['column I want first', 'column I want second'...etc.]]

You can use this:

columnsTitles = ['onething', 'secondthing', 'otherthing']

frame = frame.reindex(columns=columnsTitles)

Here is a solution I use very often. When you have a large data set with tons of columns, you definitely do not want to manually rearrange all the columns.

What you can and, most likely, want to do is to just order the first a few columns that you frequently use, and let all other columns just be themselves. This is a common approach in R. df %>%select(one, two, three, everything())

So you can first manually type the columns that you want to order and to be positioned before all the other columns in a list cols_to_order.

Then you construct a list for new columns by combining the rest of the columns:

new_columns = cols_to_order + (frame.columns.drop(cols_to_order).tolist())

After this, you can use the new_columns as other solutions suggested.

import pandas as pd
frame = pd.DataFrame({
    'one thing': [1, 2, 3, 4],
    'other thing': ['a', 'e', 'i', 'o'],
    'more things': ['a', 'e', 'i', 'o'],
    'second thing': [0.1, 0.2, 1, 2],
})

cols_to_order = ['one thing', 'second thing']
new_columns = cols_to_order + (frame.columns.drop(cols_to_order).tolist())
frame = frame[new_columns]

   one thing  second thing other thing more things
0          1           0.1           a           a
1          2           0.2           e           e
2          3           1.0           i           i
3          4           2.0           o           o

You could also do something like df = df[['x', 'y', 'a', 'b']]

import pandas as pd
frame = pd.DataFrame({'one thing':[1,2,3,4],'second thing':[0.1,0.2,1,2],'other thing':['a','e','i','o']})
frame = frame[['second thing', 'other thing', 'one thing']]
print frame
   second thing other thing  one thing
0           0.1           a          1
1           0.2           e          2
2           1.0           i          3
3           2.0           o          4

Also, you can get the list of columns with:

cols = list(df.columns.values)

The output will produce something like this:

['x', 'y', 'a', 'b']

Which is then easy to rearrange manually.