So I have a [Python2.7] Pandas dataframe (df) as below:
        name    flag  dummy_D  random  ID dummy_S  dummy_T
0       Mick  Purple        2     NaN   1      21       32
1       John     Red      NaN     NaN   2     w32        4
2  Christine     NaN        2     NaN   2     w33        3
3     Stevie     NaN        4     NaN   2     w34        2
4    Lindsey     NaN        5     NaN   2     w35      NaN
and I would like to replace all the NaN values in the columns starting with 'dummy' with the previous value in that column (only these columns, while the rest of the dataframe remains unchanged).
Here is what I did:
dummycol = [col for col in df.columns if 'dummy' in col]
for d in dummycol:
    df[d] = df[d].fillna(method='pad')
My question is:
Is there a better way in Pandas (in terms of code and memory efficiency) to do this, instead of building a list and looping through it? A one-liner solution would be great!
Many Thanks in advance!
Will
You can call str.startswith on the column index to select the columns of interest, and then call fillna on all of those columns at once:
In [152]:
cols = df.columns[df.columns.str.startswith('dummy')]
df[cols] = df[cols].fillna(method='pad')
df
Out[152]:
        name    flag  dummy_D  random  ID dummy_S  dummy_T
0       Mick  Purple        2     NaN   1      21       32
1       John     Red        2     NaN   2     w32        4
2  Christine     NaN        2     NaN   2     w33        3
3     Stevie     NaN        4     NaN   2     w34        2
4    Lindsey     NaN        5     NaN   2     w35        2
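For anyone on a recent pandas release, where the method argument of fillna is deprecated, the same idea can be written with ffill(). The snippet below is only a minimal sketch: the frame is rebuilt by hand from the table in the question, so the dtypes are an assumption (dummy_S is treated as strings, the other dummy columns as floats).

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (dtypes are assumed).
df = pd.DataFrame({
    "name": ["Mick", "John", "Christine", "Stevie", "Lindsey"],
    "flag": ["Purple", "Red", np.nan, np.nan, np.nan],
    "dummy_D": [2, np.nan, 2, 4, 5],
    "random": [np.nan] * 5,
    "ID": [1, 2, 2, 2, 2],
    "dummy_S": ["21", "w32", "w33", "w34", "w35"],
    "dummy_T": [32, 4, 3, 2, np.nan],
})

# Select the column labels that start with "dummy".
cols = df.columns[df.columns.str.startswith("dummy")]

# Forward-fill only those columns; every other column is left untouched.
df[cols] = df[cols].ffill()
```

Unlike the `'dummy' in col` test in the question, str.startswith only matches labels that begin with the prefix, so a column such as not_dummy would be skipped.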