pandas shift rows NaNs

Tags:

python

pandas

Say we have a dataframe set up as follows:

import numpy as np
import pandas as pd

x = pd.DataFrame(np.random.randint(1, 10, 30).reshape(5,6),
                 columns=[f'col{i}' for i in range(6)])
x['col6'] = np.nan
x['col7'] = np.nan

    col0    col1    col2    col3    col4    col5    col6    col7
 0   6       5        1       5       2       4      NaN    NaN
 1   8       8        9       6       7       2      NaN    NaN
 2   8       3        9       6       6       6      NaN    NaN
 3   8       4        4       4       8       9      NaN    NaN
 4   5       3        4       3       8       7      NaN    NaN

When I call x.shift(2, axis=1), col2 through col5 shift correctly, but col6 and col7 stay NaN. How can I overwrite the NaN values in col6 and col7 with the values from col4 and col5? Is this a bug or intended behavior?

    col0    col1    col2    col3    col4    col5    col6    col7
0   NaN      NaN    6.0     5.0     1.0      5.0    NaN     NaN
1   NaN      NaN    8.0     8.0     9.0      6.0    NaN     NaN
2   NaN      NaN    8.0     3.0     9.0      6.0    NaN     NaN
3   NaN      NaN    8.0     4.0     4.0      4.0    NaN     NaN
4   NaN      NaN    5.0     3.0     4.0      3.0    NaN     NaN

944

asked Feb 09 '18 09:02

A H

1 Answer

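It's possible this is a bug. You can use np.roll to achieve this. Note that np.roll is a circular shift, so the NaN values from col6 and col7 wrap around into col0 and col1: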

In[11]:
x.apply(lambda x: np.roll(x, 2), axis=1)

Out[11]: 
   col0  col1  col2  col3  col4  col5  col6  col7
0   NaN   NaN   6.0   5.0   1.0   5.0   2.0   4.0
1   NaN   NaN   8.0   8.0   9.0   6.0   7.0   2.0
2   NaN   NaN   8.0   3.0   9.0   6.0   6.0   6.0
3   NaN   NaN   8.0   4.0   4.0   4.0   8.0   9.0
4   NaN   NaN   5.0   3.0   4.0   3.0   8.0   7.0
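To see why col0 and col1 come out as NaN, it helps to look at np.roll on a single row. The following is a minimal sketch; the values in `row` are illustrative and not taken from the question's DataFrame.

```python
import numpy as np

# Illustrative row: four values followed by two NaNs (not the question's data).
row = np.array([1.0, 2.0, 3.0, 4.0, np.nan, np.nan])

# Roll two positions to the right; the last two elements wrap to the front.
rolled = np.roll(row, 2)
print(rolled.tolist())  # [nan, nan, 1.0, 2.0, 3.0, 4.0]
```

If the trailing columns held real values instead of NaN, those values would reappear in the first two columns, which is where np.roll differs from shift.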

For speed, it is probably quicker to call np.roll once on the whole frame and pass the result as the data argument to the DataFrame constructor, reusing the existing columns:

In[12]:
x = pd.DataFrame(np.roll(x, 2, axis=1), columns = x.columns)
x

Out[12]: 
   col0  col1  col2  col3  col4  col5  col6  col7
0   NaN   NaN   6.0   5.0   1.0   5.0   2.0   4.0
1   NaN   NaN   8.0   8.0   9.0   6.0   7.0   2.0
2   NaN   NaN   8.0   3.0   9.0   6.0   6.0   6.0
3   NaN   NaN   8.0   4.0   4.0   4.0   8.0   9.0
4   NaN   NaN   5.0   3.0   4.0   3.0   8.0   7.0
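A self-contained sketch of the same construction follows. The fixed seed and the names `values` and `rolled` are assumptions added for reproducibility and readability. Passing `index=x.index` is also an addition here, so that a non-default index would survive the round trip.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed so the example is reproducible
x = pd.DataFrame(np.random.randint(1, 10, 30).reshape(5, 6),
                 columns=[f'col{i}' for i in range(6)])
x['col6'] = np.nan
x['col7'] = np.nan

# Mixed int/float columns are converted to a single float64 array.
values = x.to_numpy()

# Circular shift by two columns; the NaNs from col6/col7 wrap into col0/col1.
rolled = pd.DataFrame(np.roll(values, 2, axis=1),
                      index=x.index, columns=x.columns)
print(rolled)
```

This version assigns the result to a new name rather than overwriting `x`, which makes it easy to compare the original and rolled frames side by side.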

Timings:

In[13]:

%timeit pd.DataFrame(np.roll(x, 2, axis=1), columns = x.columns)
%timeit x.fillna(0).astype(int).shift(2, axis=1)

10000 loops, best of 3: 117 µs per loop
1000 loops, best of 3: 418 µs per loop

So constructing a new df from the result of np.roll is quicker than first filling the NaN values, casting to int, and then shifting.
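The fillna(0).astype(int) step in the timing above is harmless here only because the NaNs sit in the two columns that get shifted out; NaNs anywhere else would be turned into 0. One alternative sketch is to cast every column to float before shifting, which keeps existing NaNs. This assumes the original problem comes from the frame mixing integer and float columns, which the answer does not confirm, and recent pandas versions may shift such a frame correctly without the cast. The seed and the name `shifted` are assumptions, not from the original.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: fixed seed, not in the original
x = pd.DataFrame(np.random.randint(1, 10, 30).reshape(5, 6),
                 columns=[f'col{i}' for i in range(6)])
x['col6'] = np.nan
x['col7'] = np.nan

# Cast to one dtype so all columns are shifted together,
# then shift two columns to the right.
shifted = x.astype(float).shift(2, axis=1)
print(shifted)
```

Unlike np.roll, shift does not wrap around: col0 and col1 are filled with NaN regardless of what col6 and col7 held.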

133

answered Oct 22 '22 15:10

EdChum
