Fill NaN values by adding x to the previous row in pandas

Tags: python, pandas, dataframe

I have a data frame with a column named SAM containing the following data:


SAM
3
5
9
NaN
NaN
24
40
NaN
57

Now I want to insert 12, 15 and 43 respectively in place of the NaN values (because 9+3=12, 12+3=15, and 40+3=43). In other words, fill every NaN row by adding 3 to the previous row (which may itself have been NaN before it was filled).

I know this can be done by iterating with a for loop, but can it be done in a vectorized manner? For example, with some modified version of ffill (which could have been used here if we didn't have consecutive NaNs) in pandas.fillna().
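For reference, the loop-based approach mentioned above could look like the following minimal sketch. The sample frame is rebuilt here so the snippet runs on its own; the names `values` and `looped` are illustrative and not part of the original question.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"SAM": [3, 5, 9, np.nan, np.nan, 24, 40, np.nan, 57]})

values = df["SAM"].tolist()
for i in range(1, len(values)):
    # A missing row becomes the (possibly just filled) previous row plus 3
    if np.isnan(values[i]):
        values[i] = values[i - 1] + 3

looped = pd.Series(values, index=df.index, name="SAM")
print(looped.tolist())
```

This is the baseline a vectorized solution needs to reproduce, including the case of consecutive NaNs.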


asked Dec 14 '16 15:12

Vijay P R

1 Answer

You can try this vectorized approach:


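# Boolean mask marking the missing rows
nul = df['SAM'].isnull()
# Count NaNs within each consecutive run, scale by 3, and add the forward-filled values
nul.groupby((nul.diff() == 1).cumsum()).cumsum()*3 + df['SAM'].ffill()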

#0     3.0
#1     5.0
#2     9.0
#3    12.0
#4    15.0
#5    24.0
#6    40.0
#7    43.0
#8    57.0
#Name: SAM, dtype: float64

1. Split the series into runs of consecutive missing values and compute an offset of 3, 6, 9, etc. for each missing position, according to how far into its run that position is (non-missing positions get an offset of 0);
2. Add the forward-filled values from the SAM column to those offsets.
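The two steps above can be sketched with the intermediate results spelled out. This is a minimal, self-contained variant rather than the exact one-liner: it labels the runs with `nul.ne(nul.shift())` instead of `nul.diff() == 1`, and the names `step`, `run_id` and `pos_in_run` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({"SAM": [3, 5, 9, np.nan, np.nan, 24, 40, np.nan, 57]})

step = 3  # amount added for each consecutive missing row
nul = df["SAM"].isnull()

# A new run id starts wherever the mask changes between missing and non-missing
run_id = nul.ne(nul.shift()).cumsum()

# Step 1: position of each NaN inside its run (1, 2, ...); 0 for non-missing rows
pos_in_run = nul.astype(int).groupby(run_id).cumsum()

# Step 2: forward-filled values plus the per-row offset
filled = df["SAM"].ffill() + pos_in_run * step
print(filled.tolist())
# [3.0, 5.0, 9.0, 12.0, 15.0, 24.0, 40.0, 43.0, 57.0]
```

Note that if the column starts with NaN, `ffill` has no earlier value to propagate, so those leading rows stay NaN.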


answered Sep 30 '22 00:09

Psidom
