Merging and subtracting DataFrame columns in pandas?

Tags: python, pandas, dataframe, numpy

I have a pandas DataFrame, something like:

col1  col2 col3 col5
NaN    1    2    8
2     NaN   4    8
4     NaN   4    8

I want to do two things:

1) Merge columns 1 and 2:

newcol1 col3 col5
1       2    8
2       4    8
4       4    8

I have tried using .concat, but that just concatenates the rows. It doesn't seem like I can use the standard + operator with NaN values.

2) Subtract column 5 from new column 1 and column 3, so I end up with:

newcol1    col3
-7         -6
-6         -4
-4         -4

Tried doing it this way:

dataframe[['newcol1', 'col2']] - dataframe['col5']

and

dataframe[['newcol1', 'col2']].subtract(dataframe['col5'])

but neither works.


asked Apr 23 '15 19:04

user1566200

2 Answers

To get the new column, you could use fillna (or combine_first):

df['newcol1'] = df.col1.fillna(df.col2)
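
As a minimal, self-contained sketch of that step: the DataFrame below is rebuilt from the question's sample data, so its construction and the variable names `via_fillna` and `via_combine_first` are assumptions for illustration. On this data both spellings should produce the same values.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question (an assumption about its dtypes).
df = pd.DataFrame({
    "col1": [np.nan, 2, 4],
    "col2": [1, np.nan, np.nan],
    "col3": [2, 4, 4],
    "col5": [8, 8, 8],
})

# Take col1 where it is present, otherwise fall back to col2.
via_fillna = df["col1"].fillna(df["col2"])
via_combine_first = df["col1"].combine_first(df["col2"])

print(via_fillna.tolist())
print(via_combine_first.tolist())
```

Either result can then be assigned to df['newcol1'] as in the line above.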

Then for the subtraction, use sub and specify axis=0 since we want to consider the row indices when matching up labels (not the column indices as is the default):

>>> df[['newcol1', 'col3']].sub(df['col5'], axis=0)
   newcol1  col3
0       -7    -6
1       -6    -4
2       -4    -4
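
To illustrate why axis=0 matters, here is a small sketch that compares the default alignment with the row-wise one. The DataFrame is again reconstructed from the question's sample, and `subset`, `by_columns` and `by_rows` are names chosen only for this example.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "col1": [np.nan, 2, 4],
    "col2": [1, np.nan, np.nan],
    "col3": [2, 4, 4],
    "col5": [8, 8, 8],
})
df["newcol1"] = df["col1"].fillna(df["col2"])

subset = df[["newcol1", "col3"]]

# Default (axis="columns"): the Series index 0, 1, 2 is aligned with the
# column labels "newcol1" and "col3". No labels match, so every cell is NaN.
by_columns = subset - df["col5"]

# axis=0: the Series index is aligned with the DataFrame's row index instead.
by_rows = subset.sub(df["col5"], axis=0)

print(by_columns.isna().all().all())
print(by_rows)
```

This is the likely reason the two attempts in the question appeared not to work: both rely on the default column-wise alignment.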

answered Sep 21 '22 08:09

Alex Riley

Here's one approach.

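You could create newcol1 with sum(axis=1), which skips NaN values by default. This works here because each row has exactly one non-NaN value across col1 and col2: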

In [256]: df['newcol1'] = df[['col1', 'col2']].sum(axis=1)

In [257]: df
Out[257]:
   col1  col2  col3  col5  newcol1
0   NaN     1     2     8        1
1     2   NaN     4     8        2
2     4   NaN     4     8        4

Then use df.sub() with axis=0:

In [258]: df[['newcol1', 'col3']].sub(df['col5'], axis=0)
Out[258]:
   newcol1  col3
0       -7    -6
1       -6    -4
2       -4    -4
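
One caveat worth sketching: sum(axis=1) and fillna only agree when at most one of the two columns is populated in each row. The data below is hypothetical (it is not the question's sample) and is chosen so that row 0 has both values and row 1 has neither.

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 0 has both values, row 1 has neither.
df = pd.DataFrame({
    "col1": [2, np.nan, 4],
    "col2": [1, np.nan, np.nan],
})

# NaN is skipped, so an all-NaN row sums to 0 and a full row is added up.
summed = df[["col1", "col2"]].sum(axis=1)

# min_count=1 keeps NaN for rows that have no valid values at all.
summed_strict = df[["col1", "col2"]].sum(axis=1, min_count=1)

# fillna keeps col1 where present and only falls back to col2 otherwise.
filled = df["col1"].fillna(df["col2"])

print(summed.tolist())
print(summed_strict.tolist())
print(filled.tolist())
```

If rows with two populated values or no values can occur, fillna (or combine_first) is probably the closer match to "merge the columns".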

answered Sep 18 '22 08:09

Zero
