
How do I get pandas' "update" function to overwrite numbers in one column but not another?

Currently, I'm using:

csvdata.update(data, overwrite=True)

How can I make it update and overwrite one specific column but leave another untouched? It's a small question; is there a simple answer?

Ryflex asked Dec 16 '22 04:12


2 Answers

Rather than updating with the entire DataFrame, update with a sub-DataFrame containing only the columns you are interested in. For example:

In [11]: df1
Out[11]: 
   A   B
0  1  99
1  3  99
2  5   6

In [12]: df2
Out[12]: 
   A  B
0  a  2
1  b  4
2  c  6

In [13]: df1.update(df2[['B']])  # subset of cols = ['B']

In [14]: df1
Out[14]: 
   A  B
0  1  2
1  3  4
2  5  6
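
For anyone who wants to paste this into a script rather than an IPython session, here is a minimal sketch of the same idea. The frames are rebuilt from the values shown above; the only assumption is that both frames share the same default integer index.

```python
import pandas as pd

# Rebuild the two frames from the session above.
df1 = pd.DataFrame({"A": [1, 3, 5], "B": [99, 99, 6]})
df2 = pd.DataFrame({"A": ["a", "b", "c"], "B": [2, 4, 6]})

# Pass only column B of df2, so column A of df1 is never a candidate
# for overwriting. The double brackets keep the selection a DataFrame.
df1.update(df2[["B"]])

print(df1)
```

Because update aligns on index and column labels, any column of df1 that is missing from the frame you pass in is simply left alone.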
Andy Hayden answered Dec 21 '22 23:12


If you want to do it for a single column:

import pandas
import numpy
csvdata = pandas.DataFrame({"a":range(12), "b":range(12)})
other = pandas.Series(list("abcdefghijk")+[numpy.nan])

csvdata["a"].update(other)

print(csvdata)

     a   b
0    a   0
1    b   1
2    c   2
3    d   3
4    e   4
5    f   5
6    g   6
7    h   7
8    i   8
9    j   9
10   k  10
11  11  11

Or, as long as the column names match, you can do this:

other = pandas.DataFrame({"a":list("abcdefghijk")+[numpy.nan], "b":list("abcdefghijk")+[numpy.nan]})
csvdata.update(other["a"])
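
A small sketch of the column-restricted update that goes through the DataFrame itself, which may be the safer form on newer pandas: as far as I know, a chained call like csvdata["a"].update(other) can stop propagating to the parent frame once copy-on-write is in effect. The sample values below are hypothetical and chosen only to show that NaN entries in the source leave the existing value in place.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "a" holds strings, "b" holds numbers.
csvdata = pd.DataFrame({"a": list("uvwxyz"), "b": range(6)})

# Hypothetical replacement values; NaN means "keep what is already there".
other = pd.DataFrame({
    "a": ["A", "B", np.nan, "D", "E", np.nan],
    "b": [10, 20, 30, 40, 50, 60],
})

# Update via the DataFrame, restricted to column "a".
# Column "b" of `other` is never passed in, so csvdata["b"] is untouched.
csvdata.update(other[["a"]])

print(csvdata)
```

Column a should come out as A, B, w, D, E, z, while column b keeps 0 through 5.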
Noah answered Dec 21 '22 22:12