Currently, I'm using:
csvdata.update(data, overwrite=True)
How can I make it update and overwrite a specific column but not another, small but simple question, is there a simple answer?
Rather than update with the entire DataFrame, just update with the subDataFrame of columns which you are interested in. For example:
In [11]: df1
Out[11]:
A B
0 1 99
1 3 99
2 5 6
In [12]: df2
Out[12]:
A B
0 a 2
1 b 4
2 c 6
In [13]: df1.update(df2[['B']]) # subset of cols = ['B']
In [14]: df1
Out[14]:
A B
0 1 2
1 3 4
2 5 6
If you want to do it for a single column:
import pandas
import numpy
csvdata = pandas.DataFrame({"a":range(12), "b":range(12)})
other = pandas.Series(list("abcdefghijk")+[numpy.nan])
csvdata["a"].update(other)
print csvdata
a b
0 a 0
1 b 1
2 c 2
3 d 3
4 e 4
5 f 5
6 g 6
7 h 7
8 i 8
9 j 9
10 k 10
11 11 11
or, as long as the column names match, you can do this:
other = pandas.DataFrame({"a":list("abcdefghijk")+[numpy.nan], "b":list("abcdefghijk")+[numpy.nan]})
csvdata.update(other["a"])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With