Is it possible to change the order of columns in a dataframe in place?
If yes, would that be faster than making a copy? I am working with a large dataframe with 100 million+ rows.
I see how to change the order with a copy: How to change the order of DataFrame columns?
Here is a short and even more memory-efficient way (no additional temporary variable needs to be kept):
import pandas as pd

df = pd.DataFrame({"A": [0, 1], "B": [2, 3], "C": [4, 5]})
new_order = ["B", "C", "A"]
for column in new_order:
    # pop removes the column; re-assigning it appends it at the end
    df[column] = df.pop(column)
This works because the columns are re-assigned to the DataFrame in the new order while the old ones are deleted one by one: pop returns a column and deletes it from the DataFrame.
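As a minimal sketch, the pop-and-reassign loop can be wrapped in a small helper and checked. The function name reorder_columns_inplace and the permutation check are my own additions, not part of the answer above, and the sketch assumes unique column labels. Note that an unchanged id only shows that the same Python object is reused; it does not by itself show that pandas avoids moving data internally.

```python
import pandas as pd


def reorder_columns_inplace(df, new_order):
    """Hypothetical helper: reorder the columns of df without rebinding df."""
    # Assumption: column labels are unique and new_order is a permutation of them.
    if len(new_order) != len(df.columns) or set(new_order) != set(df.columns):
        raise ValueError("new_order must be a permutation of df.columns")
    for column in new_order:
        # pop removes the column; the assignment appends it at the end,
        # so after the loop the columns appear in new_order.
        df[column] = df.pop(column)
    return df


df = pd.DataFrame({"A": [0, 1], "B": [2, 3], "C": [4, 5]})
before = id(df)
reorder_columns_inplace(df, ["B", "C", "A"])

print(list(df.columns))   # ['B', 'C', 'A']
print(id(df) == before)   # True: still the same DataFrame object
```

The check up front matters because a label missing from new_order would otherwise stay at the front of the frame without any error.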
Hmm... no one proposed drop and insert:
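import pandas as pd

df = pd.DataFrame([['a', 'b', 'c']], columns=list('ABC'))
print('Before', id(df))
for i, col in enumerate(['C', 'B', 'A']):
    tmp = df[col]  # keep a reference to the column
    df.drop(labels=[col], axis=1, inplace=True)  # remove it in place
    df.insert(i, col, tmp)  # re-insert it at its target position
print('After ', id(df))
df.head()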
The original DataFrame object is preserved (its id is the same before and after):
Before 140441780394360
After 140441780394360
   C  B  A
0  c  b  a
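On the question of whether reordering in place is faster than making a copy, the most reliable answer comes from measuring on your own data. The following is only a rough timing sketch: the helper names, the row count, and the random test data are assumptions chosen for illustration, and the results will vary with pandas version, dtypes, and the number of columns.

```python
import time

import numpy as np
import pandas as pd


def reorder_with_copy(df, new_order):
    """Copy-based reordering: returns a new DataFrame."""
    return df[new_order]


def reorder_drop_insert(df, new_order):
    """Drop-and-insert reordering from the answer above; mutates df."""
    for i, col in enumerate(new_order):
        tmp = df[col]
        df.drop(labels=[col], axis=1, inplace=True)
        df.insert(i, col, tmp)
    return df


n_rows = 200_000  # assumed size; increase it to approximate your real workload
df = pd.DataFrame(np.random.rand(n_rows, 3), columns=list("ABC"))
new_order = ["C", "B", "A"]

start = time.perf_counter()
copied = reorder_with_copy(df, new_order)
copy_seconds = time.perf_counter() - start

start = time.perf_counter()
reorder_drop_insert(df, new_order)
inplace_seconds = time.perf_counter() - start

print(f"copy:        {copy_seconds:.4f} s")
print(f"drop/insert: {inplace_seconds:.4f} s")
```

A single run like this is noisy; repeating the measurement several times (for example with the timeit module) and watching peak memory as well as wall time gives a fairer comparison.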