
How to change the order of DataFrame columns?

I have the following DataFrame (df):

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10, 5))

I add more column(s) by assignment:

df['mean'] = df.mean(1) 

How can I move the column mean to the front, i.e. make it the first column while leaving the order of the other columns untouched?

Timmie asked Oct 30 '12 22:10




2 Answers

You could also do something like this:

df = df[['mean', 0, 1, 2, 3, 4]]

You can get the list of columns with:

cols = list(df.columns.values)

For the DataFrame in the question, whose column labels are integers rather than strings, this gives (before reordering):

[0, 1, 2, 3, 4, 'mean']

...which is then easy to rearrange manually before using it in the first expression.
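
A minimal end-to-end sketch of this approach, assuming the DataFrame from the question, whose default column labels are the integers 0 to 4. The seed is an arbitrary choice made here only so the run is reproducible; it is not part of the question.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # arbitrary seed, only for reproducibility
df = pd.DataFrame(np.random.rand(10, 5))
df['mean'] = df.mean(axis=1)  # row-wise mean, appended as the last column

# Inspect the current column order.
cols = list(df.columns)
print(cols)

# Write the desired order out by hand. The labels must match exactly:
# the default labels are the integers 0..4, not the strings '0'..'4'.
df = df[['mean', 0, 1, 2, 3, 4]]
print(list(df.columns))
```

Listing the labels by hand is fine for a handful of columns; for wider frames it is usually easier to build the list programmatically.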

freddygv answered Sep 28 '22 05:09



One easy way would be to reassign the dataframe with a list of the columns, rearranged as needed.

This is what you have now:

In [6]: df
Out[6]:
          0         1         2         3         4      mean
0  0.445598  0.173835  0.343415  0.682252  0.582616  0.445543
1  0.881592  0.696942  0.702232  0.696724  0.373551  0.670208
2  0.662527  0.955193  0.131016  0.609548  0.804694  0.632596
3  0.260919  0.783467  0.593433  0.033426  0.512019  0.436653
4  0.131842  0.799367  0.182828  0.683330  0.019485  0.363371
5  0.498784  0.873495  0.383811  0.699289  0.480447  0.587165
6  0.388771  0.395757  0.745237  0.628406  0.784473  0.588529
7  0.147986  0.459451  0.310961  0.706435  0.100914  0.345149
8  0.394947  0.863494  0.585030  0.565944  0.356561  0.553195
9  0.689260  0.865243  0.136481  0.386582  0.730399  0.561593

In [7]: cols = df.columns.tolist()

In [8]: cols
Out[8]: [0, 1, 2, 3, 4, 'mean']

Rearrange cols in any way you want. This is how I moved the last element to the first position:

In [12]: cols = cols[-1:] + cols[:-1]

In [13]: cols
Out[13]: ['mean', 0, 1, 2, 3, 4]

Then reorder the dataframe like this:

In [16]: df = df[cols]  # or: df = df.loc[:, cols]

In [17]: df
Out[17]:
       mean         0         1         2         3         4
0  0.445543  0.445598  0.173835  0.343415  0.682252  0.582616
1  0.670208  0.881592  0.696942  0.702232  0.696724  0.373551
2  0.632596  0.662527  0.955193  0.131016  0.609548  0.804694
3  0.436653  0.260919  0.783467  0.593433  0.033426  0.512019
4  0.363371  0.131842  0.799367  0.182828  0.683330  0.019485
5  0.587165  0.498784  0.873495  0.383811  0.699289  0.480447
6  0.588529  0.388771  0.395757  0.745237  0.628406  0.784473
7  0.345149  0.147986  0.459451  0.310961  0.706435  0.100914
8  0.553195  0.394947  0.863494  0.585030  0.565944  0.356561
9  0.561593  0.689260  0.865243  0.136481  0.386582  0.730399
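
The slice cols[-1:] + cols[:-1] relies on mean being the last column. A slightly more general sketch, assuming a hypothetical helper named move_to_front (an illustrative name, not a pandas function), builds the new order from the column's label instead of its position. The last lines show a different technique, DataFrame.pop followed by DataFrame.insert, which modifies the frame in place rather than returning a reordered one.

```python
import numpy as np
import pandas as pd


def move_to_front(frame, label):
    """Return `frame` with column `label` first; other columns keep their order.

    Hypothetical helper for illustration only, not part of pandas.
    """
    cols = [label] + [c for c in frame.columns if c != label]
    return frame[cols]


df = pd.DataFrame(np.random.rand(10, 5))
df['mean'] = df.mean(axis=1)

# Returns a reordered frame; df itself keeps its original column order.
reordered = move_to_front(df, 'mean')
print(reordered.columns.tolist())

# In-place alternative: remove the column, then re-insert it at position 0.
df2 = df.copy()
df2.insert(0, 'mean', df2.pop('mean'))
print(df2.columns.tolist())
```

The helper works wherever the column currently sits, so it does not need to be adjusted if more columns are added later.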
Aman answered Sep 28 '22 04:09
