
change Pandas dataframe column order in place

Tags: python, pandas

Is it possible to change the order of columns in a dataframe in place?

If yes, would that be faster than making a copy? I am working with a large dataframe with 100 million+ rows.

I know how to change the order by making a copy: see "How to change the order of DataFrame columns?"

Ivan asked Sep 16 '14 20:09


2 Answers

Here is a short and even more memory-efficient way (no additional temporary variable needs to be stored):

import pandas as pd

df = pd.DataFrame({"A": [0, 1], "B": [2, 3], "C": [4, 5]})

new_order = ["B", "C", "A"]  # must list every column exactly once
for column in new_order:
    df[column] = df.pop(column)

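This works because pop returns a column and deletes it from the DataFrame, and assigning that column back appends it at the end. Going through new_order, the columns are therefore removed one by one and re-added in the new order. Note that new_order has to contain every column: any column that is left out is never moved and stays in front of the reordered ones.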
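
To make the caveat about missing columns explicit, the loop can be wrapped in a small helper that checks its input first. This is only a sketch: reorder_columns_inplace is a hypothetical name, not a pandas function, and it assumes the column labels are unique.

```python
import pandas as pd


def reorder_columns_inplace(df, new_order):
    """Reorder the columns of df in place with the pop/re-assign trick.

    Hypothetical helper, not part of pandas. Assumes unique column labels.
    """
    # The trick only yields exactly new_order if every column is listed once.
    if len(new_order) != len(df.columns) or set(new_order) != set(df.columns):
        raise ValueError("new_order must be a permutation of df.columns")
    for column in new_order:
        # pop removes the column; assigning it back appends it at the end.
        df[column] = df.pop(column)


df = pd.DataFrame({"A": [0, 1], "B": [2, 3], "C": [4, 5]})
before = id(df)
reorder_columns_inplace(df, ["B", "C", "A"])

print(list(df.columns))   # ['B', 'C', 'A']
print(id(df) == before)   # True
```

The id check only shows that the same DataFrame object is kept. It does not by itself show that pandas avoids copying data internally.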

JulianWgs answered Sep 29 '22 02:09


Hmm... no one proposed drop and insert:

import pandas as pd

df = pd.DataFrame([['a', 'b', 'c']], columns=list('ABC'))

print('Before', id(df))

for i, col in enumerate(['C', 'B', 'A']):
    tmp = df[col]                                 # keep the column data
    df.drop(labels=[col], axis=1, inplace=True)   # remove it from the frame
    df.insert(i, col, tmp)                        # re-insert at position i

print('After ', id(df))
print(df.head())

The id is unchanged, so df is still the same DataFrame object, now with the columns in the new order:

Before 140441780394360
After  140441780394360

   C  B  A
0  c  b  a
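
The question also asks whether this is faster than making a copy, and nothing above measures it. A rough way to check on your own machine is sketched below. The helper name reorder_by_drop_insert, the frame size and the random float data are all assumptions for illustration, and the timings will depend on the pandas version, the dtypes and the size of the data.

```python
import time

import numpy as np
import pandas as pd


def reorder_by_drop_insert(df, new_order):
    """Hypothetical helper: move each column to its target position in place."""
    for position, column in enumerate(new_order):
        values = df[column]                       # keep the column data
        df.drop(columns=[column], inplace=True)   # remove it from the frame
        df.insert(position, column, values)       # re-insert at the target position


n_rows = 1_000_000  # assumed size; scale up to match your data
new_order = ["C", "B", "A"]
frame = pd.DataFrame(np.random.rand(n_rows, 3), columns=list("ABC"))

start = time.perf_counter()
copied = frame[new_order]  # copy-based reorder, as in the linked question
copy_seconds = time.perf_counter() - start

start = time.perf_counter()
reorder_by_drop_insert(frame, new_order)
inplace_seconds = time.perf_counter() - start

# Both routes should give the same columns and values.
pd.testing.assert_frame_equal(copied, frame)
print(f"copy: {copy_seconds:.4f}s, drop/insert: {inplace_seconds:.4f}s")
```

A single run like this is noisy, so repeat it a few times on data shaped like yours before drawing conclusions.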

belz answered Sep 29 '22 02:09