Making columns and ordering consistent in a Pandas DataFrame

Question

I am looking for an elegant, Pythonic way to make a Pandas DataFrame's columns consistent. Meaning:

Ensure all the columns in a master list are present; if any are missing, add an empty placeholder column.
Ensure that the columns are in the same order as the master list.

I have the following example that works, but is there a built-in Pandas method for accomplishing the same goal?

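import numpy as np
import pandas as pd

df1 = pd.DataFrame(data=[{'a': 1, 'b': 32, 'c': 32}])
print(df1)

   a   b   c
0  1  32  32

column_master_list = ['b', 'c', 'e', 'd', 'a']

def get_dataframe_with_consistent_header(df, headers):
    # Work on a copy so the caller's DataFrame is not modified.
    df = df.copy()
    for col in headers:
        if col not in df.columns:
            # Add a NaN placeholder column for each missing header.
            df[col] = np.nan
    # Select the columns in master-list order.
    return df[headers]

print(get_dataframe_with_consistent_header(df1, column_master_list))

    b   c   e   d  a
0  32  32 NaN NaN  1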

Alex Riley · Accepted Answer

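You can use the reindex method. Pass in the list of column names and specify axis='columns'. Columns missing from the DataFrame are added and filled with NaN by default, columns not in the list are dropped, and the result follows the order of the list. reindex returns a new DataFrame and leaves df1 unchanged: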

>>> df1.reindex(column_master_list, axis='columns')
    b   c   e   d  a
0  32  32 NaN NaN  1
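
As a minimal sketch building on the answer above, the same call can also be written with the columns= keyword, and fill_value can be used if a placeholder other than NaN is wanted. The second frame df2, its extra column 'z', and the variable names aligned, zero_filled and aligned2 are hypothetical and only there for illustration:

```python
import pandas as pd

column_master_list = ['b', 'c', 'e', 'd', 'a']
df1 = pd.DataFrame(data=[{'a': 1, 'b': 32, 'c': 32}])

# Same operation as reindex(column_master_list, axis='columns'),
# written with the columns= keyword instead.
aligned = df1.reindex(columns=column_master_list)

# fill_value replaces the default NaN placeholder in the added columns.
zero_filled = df1.reindex(columns=column_master_list, fill_value=0)

# Hypothetical second frame: it lacks most master columns and has an
# extra column 'z' that is not in the master list, so reindex drops it.
df2 = pd.DataFrame(data=[{'c': 5, 'e': 7, 'z': 9}])
aligned2 = df2.reindex(columns=column_master_list)

print(aligned)
print(zero_filled)
print(aligned2)
```

Because reindex returns a new object, df1 and df2 keep their original columns; assign the result back if the aligned version should replace the original.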

Tags: python, pandas, dataframe

skulz00
