Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Move column by name to front of table in pandas

People also ask

How do I rearrange column positions in pandas?

Reorder Columns using Pandas . Another way to reorder columns is to use the Pandas . reindex() method. This allows you to pass in the columns= parameter to pass in the order of columns that you want to use.

How do I move a column in pandas?

df1 = df. pop('b') # remove column b and store it in df1 df2 = df. pop('x') # remove column x and store it in df2 df['b']=df1 # add b series as a 'new' column. df['x']=df2 # add b series as a 'new' column.

How do I add a column to the front of a DataFrame in Python?

In pandas you can add/append a new column to the existing DataFrame using DataFrame. insert() method, this method updates the existing DataFrame with a new column. DataFrame. assign() is also used to insert a new column however, this method returns a new Dataframe after adding a new column.


We can use ix to reorder by passing a list:

In [27]:
# get a list of columns
cols = list(df)
# move the column to head of list using index, pop and insert
cols.insert(0, cols.pop(cols.index('Mid')))
cols
Out[27]:
['Mid', 'Net', 'Upper', 'Lower', 'Zsore']
In [28]:
# use ix to reorder
df = df.ix[:, cols]
df
Out[28]:
                      Mid Net  Upper   Lower  Zsore
Answer_option                                      
More_than_once_a_day    2  0%  0.22%  -0.12%     65
Once_a_day              3  0%  0.32%  -0.19%     45
Several_times_a_week    4  2%  2.45%   1.10%     78
Once_a_week             6  1%  1.63%  -0.40%     65

Another method is to take a reference to the column and reinsert it at the front:

In [39]:
mid = df['Mid']
df.drop(labels=['Mid'], axis=1,inplace = True)
df.insert(0, 'Mid', mid)
df
Out[39]:
                      Mid Net  Upper   Lower  Zsore
Answer_option                                      
More_than_once_a_day    2  0%  0.22%  -0.12%     65
Once_a_day              3  0%  0.32%  -0.19%     45
Several_times_a_week    4  2%  2.45%   1.10%     78
Once_a_week             6  1%  1.63%  -0.40%     65

You can also use loc to achieve the same result as ix will be deprecated in a future version of pandas from 0.20.0 onwards:

df = df.loc[:, cols]

Maybe I'm missing something, but a lot of these answers seem overly complicated. You should be able to just set the columns within a single list:

Column to the front:

df = df[ ['Mid'] + [ col for col in df.columns if col != 'Mid' ] ]

Or if instead, you want to move it to the back:

df = df[ [ col for col in df.columns if col != 'Mid' ] + ['Mid'] ]

Or if you wanted to move more than one column:

cols_to_move = ['Mid', 'Zsore']
df           = df[ cols_to_move + [ col for col in df.columns if col not in cols_to_move ] ]

I prefer this solution:

col = df.pop("Mid")
df.insert(0, col.name, col)

It's simpler to read and faster than other suggested answers.

def move_column_inplace(df, col, pos):
    col = df.pop(col)
    df.insert(pos, col.name, col)

Performance assessment:

For this test, the currently last column is moved to the front in each repetition. In-place methods generally perform better. While citynorman's solution can be made in-place, Ed Chum's method based on .loc and sachinnm's method based on reindex cannot.

While other methods are generic, citynorman's solution is limited to pos=0. I didn't observe any performance difference between df.loc[cols] and df[cols], which is why I didn't include some other suggestions.

I tested with python 3.6.8 and pandas 0.24.2 on a MacBook Pro (Mid 2015).

import numpy as np
import pandas as pd

n_cols = 11
df = pd.DataFrame(np.random.randn(200000, n_cols),
                  columns=range(n_cols))

def move_column_inplace(df, col, pos):
    col = df.pop(col)
    df.insert(pos, col.name, col)

def move_to_front_normanius_inplace(df, col):
    move_column_inplace(df, col, 0)
    return df

def move_to_front_chum(df, col):
    cols = list(df)
    cols.insert(0, cols.pop(cols.index(col)))
    return df.loc[:, cols]

def move_to_front_chum_inplace(df, col):
    col = df[col]
    df.drop(col.name, axis=1, inplace=True)
    df.insert(0, col.name, col)
    return df

def move_to_front_elpastor(df, col):
    cols = [col] + [ c for c in df.columns if c!=col ]
    return df[cols] # or df.loc[cols]

def move_to_front_sachinmm(df, col):
    cols = df.columns.tolist()
    cols.insert(0, cols.pop(cols.index(col)))
    df = df.reindex(columns=cols, copy=False)
    return df

def move_to_front_citynorman_inplace(df, col):
    # This approach exploits that reset_index() moves the index
    # at the first position of the data frame.
    df.set_index(col, inplace=True)
    df.reset_index(inplace=True)
    return df

def test(method, df):
    col = np.random.randint(0, n_cols)
    method(df, col)

col = np.random.randint(0, n_cols)
ret_mine = move_to_front_normanius_inplace(df.copy(), col)
ret_chum1 = move_to_front_chum(df.copy(), col)
ret_chum2 = move_to_front_chum_inplace(df.copy(), col)
ret_elpas = move_to_front_elpastor(df.copy(), col)
ret_sach = move_to_front_sachinmm(df.copy(), col)
ret_city = move_to_front_citynorman_inplace(df.copy(), col)

# Assert equivalence of solutions.
assert(ret_mine.equals(ret_chum1))
assert(ret_mine.equals(ret_chum2))
assert(ret_mine.equals(ret_elpas))
assert(ret_mine.equals(ret_sach))
assert(ret_mine.equals(ret_city))

Results:

# For n_cols = 11:
%timeit test(move_to_front_normanius_inplace, df)
# 1.05 ms ± 42.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit test(move_to_front_citynorman_inplace, df)
# 1.68 ms ± 46.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit test(move_to_front_sachinmm, df)
# 3.24 ms ± 96.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum, df)
# 3.84 ms ± 114 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_elpastor, df)
# 3.85 ms ± 58.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum_inplace, df)
# 9.67 ms ± 101 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


# For n_cols = 31:
%timeit test(move_to_front_normanius_inplace, df)
# 1.26 ms ± 31.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_citynorman_inplace, df)
# 1.95 ms ± 260 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_sachinmm, df)
# 10.7 ms ± 348 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum, df)
# 11.5 ms ± 869 µs per loop (mean ± std. dev. of 7 runs, 100 loops each
%timeit test(move_to_front_elpastor, df)
# 11.4 ms ± 598 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit test(move_to_front_chum_inplace, df)
# 31.4 ms ± 1.89 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

You can use the df.reindex() function in pandas. df is

                      Net  Upper   Lower  Mid  Zsore
Answer option                                      
More than once a day  0%  0.22%  -0.12%    2     65
Once a day            0%  0.32%  -0.19%    3     45
Several times a week  2%  2.45%   1.10%    4     78
Once a week           1%  1.63%  -0.40%    6     65

define an list of column names

cols = df.columns.tolist()
cols
Out[13]: ['Net', 'Upper', 'Lower', 'Mid', 'Zsore']

move the column name to wherever you want

cols.insert(0, cols.pop(cols.index('Mid')))
cols
Out[16]: ['Mid', 'Net', 'Upper', 'Lower', 'Zsore']

then use df.reindex() function to reorder

df = df.reindex(columns= cols)

out put is: df

                      Mid  Upper   Lower Net  Zsore
Answer option                                      
More than once a day    2  0.22%  -0.12%  0%     65
Once a day              3  0.32%  -0.19%  0%     45
Several times a week    4  2.45%   1.10%  2%     78
Once a week             6  1.63%  -0.40%  1%     65