I have a DataFrame as below.
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "code": ["AA", "BB", "CC", "DD"],
        "YA": [2, 1, 1, np.nan],
        "YD": [1, np.nan, np.nan, 1],
        "ZB": [1, np.nan, np.nan, np.nan],
        "ZD": [1, np.nan, np.nan, 1],
    }
)
Also, I have a sorting list.
sort_list = ['YD','YA', 'ZD', 'YB', 'ZA', 'ZB']
I am trying to add the missing columns based on the sort list and sort the DataFrame.
expected output:
code YD YA ZD YB ZA ZB
0 AA 1.0 2.0 1.0 NaN NaN 1.0
1 BB NaN 1.0 NaN NaN NaN NaN
2 CC NaN 1.0 NaN NaN NaN NaN
3 DD 1.0 NaN 1.0 NaN NaN NaN
I can get the result using the code below. Is there a simpler way to do this?
my code:
col_list = list(set(sort_list) - set(df.columns.to_list()))
df1 = pd.DataFrame(index=df.index, columns=col_list)
df1 = df1.fillna(np.nan)
df2 = df.join(df1, how='left')
df2 = df2.set_index('code')
df2 = df2[sort_list]
df2 = df2.reset_index()
df2
There are multiple ways to add a new empty column (or several) to a pandas DataFrame: plain column assignment, assign(), insert(), and apply(). With any of these you can add one or more columns filled with NaN, None, or empty strings.
The NumPy library provides the NaN value (np.nan). Another option is DataFrame.reindex(), which can add new columns to a DataFrame; columns that do not exist yet are filled with NaN automatically.
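As a rough illustration of the first three approaches mentioned above, here is a minimal sketch. The small frame `demo` and the variable `with_za` are made up for this example and are not part of the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical two-row frame, used only to show the three approaches.
demo = pd.DataFrame({"code": ["AA", "BB"], "YA": [2.0, 1.0]})

# 1. Plain assignment modifies demo in place and appends the column at the end.
demo["YB"] = np.nan

# 2. assign() returns a new DataFrame and leaves demo untouched.
with_za = demo.assign(ZA=np.nan)

# 3. insert() works in place and lets you pick the column position.
with_za.insert(1, "YD", np.nan)

print(list(demo.columns))
print(list(with_za.columns))
```

These all add columns one at a time, so the final column order still has to be fixed separately when a target order such as sort_list is given.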
Try using reindex:
df = df.reindex(columns=['code'] + sort_list)
df:
code YD YA ZD YB ZA ZB
0 AA 1.0 2.0 1.0 NaN NaN 1.0
1 BB NaN 1.0 NaN NaN NaN NaN
2 CC NaN 1.0 NaN NaN NaN NaN
3 DD 1.0 NaN 1.0 NaN NaN NaN
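For completeness, a self-contained sketch of the reindex approach using the data from the question. The `leading` list is an assumption added here for illustration: it collects whatever columns are not in sort_list (only "code" in this case) so they do not have to be hard-coded.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "code": ["AA", "BB", "CC", "DD"],
        "YA": [2, 1, 1, np.nan],
        "YD": [1, np.nan, np.nan, 1],
        "ZB": [1, np.nan, np.nan, np.nan],
        "ZD": [1, np.nan, np.nan, 1],
    }
)
sort_list = ["YD", "YA", "ZD", "YB", "ZA", "ZB"]

# Columns not covered by sort_list stay in front, in their original order.
leading = [c for c in df.columns if c not in sort_list]

# reindex reorders the existing columns and creates the missing ones
# (YB and ZA here), filling them with NaN.
result = df.reindex(columns=leading + sort_list)
print(result)
```

Note that reindex also drops any column that is not in the list passed to columns=, so every column you want to keep has to appear there.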