I have a data frame with a column named SAM containing the following data:
SAM
3
5
9
NaN
NaN
24
40
NaN
57
Now I want to insert 12, 15 and 43 respectively in place of the NaN values (because 9+3=12, 12+3=15, and 40+3=43). In other words, fill any NaN row by adding 3 to the previous row (which can also be NaN).
I know this can be done by iterating through a for loop. But can we do it in a vectorized manner? Something like a modified version of ffill (which could have been used here if we didn't have consecutive NaNs) in pandas.fillna().
The pandas.DataFrame.fillna() method fills NA/NaN/None values in one or more columns with 0, an empty string, or any other specified value. NaN is considered a missing value. In machine learning work, handling missing values is important: leaving them unhandled can silently produce incorrect results.
Use the pandas fillna() method to fill a specified value on multiple DataFrame columns, for example replacing NaN with 0 in the columns Discount and Fee. It can also fill a different value for each column.
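As a minimal sketch of both cases, the frame below is made up for illustration: only the column names Discount and Fee come from the text, while the values and the Duration column are assumptions. Selecting a list of columns applies one fill value to all of them, and passing a dict to fillna() gives each column its own value.

```python
import numpy as np
import pandas as pd

# Hypothetical data; only the Fee and Discount column names come from the text.
courses = pd.DataFrame({
    "Fee": [20000, np.nan, 26000],
    "Discount": [1000, 2300, np.nan],
    "Duration": ["30days", None, "40days"],
})

# Same fill value (0) for a chosen subset of columns; Duration is left untouched.
zero_filled = courses.copy()
zero_filled[["Fee", "Discount"]] = zero_filled[["Fee", "Discount"]].fillna(0)

# A different fill value per column, passed as a {column: value} dict.
per_column = courses.fillna({"Fee": 0, "Discount": 500, "Duration": "unknown"})

print(zero_filled)
print(per_column)
```

Neither call modifies `courses` itself, because fillna() returns a new object unless inplace=True is passed.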
It is used to fill NaN values with specified values (0, blank, etc.). If you want infinity (inf and -inf) to be treated as “NA” in computations, older pandas versions let you set pandas.options.mode.use_inf_as_na = True (this option is deprecated in recent releases). Besides NaN, pandas also treats None as missing.
Pandas has fill methods such as ffill, which propagates the last valid value forward into the following missing rows, and bfill (also called backfill), which fills missing rows from the next valid value. axis: takes an int or string value selecting rows or columns. inplace: a boolean; if True, the changes are made in the data frame itself.
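To make the direction of each method concrete, here is a small sketch on a made-up Series with two consecutive NaNs. It also shows why plain ffill is not enough for the question above: every row in a run of NaNs receives the same copied value, with no increment.

```python
import numpy as np
import pandas as pd

# Illustrative values only: a run of two NaNs between 9 and 24.
s = pd.Series([9, np.nan, np.nan, 24])

forward = s.ffill()          # copies 9 forward into both gaps
backward = s.bfill()         # copies 24 backward into both gaps
limited = s.ffill(limit=1)   # fills at most one consecutive NaN per run

print(forward.tolist())
print(backward.tolist())
print(limited.tolist())
```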
You can try this vectorized approach:
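# Flag the missing rows
nul = df['SAM'].isnull()
# Group rows into runs wherever the flag flips, count the position inside each run,
# scale by the step (3) and add it to the forward-filled column
nul.groupby((nul.diff() == 1).cumsum()).cumsum()*3 + df['SAM'].ffill()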
#0 3.0
#1 5.0
#2 9.0
#3 12.0
#4 15.0
#5 24.0
#6 40.0
#7 43.0
#8 57.0
#Name: SAM, dtype: float64
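For readers who want to run the idea end to end, here is a self-contained sketch of the same technique broken into named steps. The variable names (`is_nan`, `run_id`, `steps`, `filled`) are illustrative and not from the answer, and the run labelling uses `shift`/`ne` rather than `diff`, which is an equivalent substitution chosen here to keep everything in boolean dtype.

```python
import numpy as np
import pandas as pd

# The column from the question.
df = pd.DataFrame({"SAM": [3, 5, 9, np.nan, np.nan, 24, 40, np.nan, 57]})

STEP = 3  # constant added for each consecutive missing row

# True where the value is missing.
is_nan = df["SAM"].isna()

# Start a new run id every time the flag changes from the previous row.
run_id = is_nan.ne(is_nan.shift(fill_value=False)).cumsum()

# Position inside each NaN run (1, 2, ...); rows with a value get 0.
steps = is_nan.astype(int).groupby(run_id).cumsum()

# Last valid value plus STEP for each missing row since that value.
filled = df["SAM"].ffill() + STEP * steps

print(filled.tolist())
```

Assigning `filled` back with `df["SAM"] = filled` would overwrite the column. A column that starts with NaN would keep NaN in those leading rows, since ffill has nothing to copy from.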
SAM column to the result.