I have the following two dataframes:
df:
value
period
2000-01-01 100
2000-04-01 200
2000-07-01 300
2000-10-01 400
2001-01-01 500
df1:
value
period
2000-07-01 350
2000-10-01 450
2001-01-01 550
2001-04-01 600
2001-07-01 700
This is the desired output:
df:
value
period
2000-01-01 100
2000-04-01 200
2000-07-01 350
2000-10-01 450
2001-01-01 550
2001-04-01 600
2001-07-01 700
I have set_index(['period'])
on both df1 and df2. I also tried few things including concat and where statement after creating new column but notting works as expected. My first dataframe is primary. The second is kind of update. It should replace the corresponding values in the first one and in the same time add new records if any available.
How I can do this?
You can use combine_first
, also if dtype
of some index is object
convert to_datetime
which works nice if always df1.index
is in df.index
:
print (df.index.dtype)
object
print (df1.index.dtype)
object
df.index = pd.to_datetime(df.index)
df1.index = pd.to_datetime(df1.index)
df = df1.combine_first(df)
#if necessary int columns
#df = df1.combine_first(df).astype(int)
print (df)
value
period
2000-01-01 100.0
2000-04-01 200.0
2000-07-01 350.0
2000-10-01 450.0
2001-01-01 550.0
2001-04-01 600.0
2001-07-01 700.0
If not, then is necessary filter by intersection
first:
df = df1.loc[df1.index.intersection(df.index)].combine_first(df)
Another solution with numpy.setdiff1d
and concat
df = pd.concat([df.loc[np.setdiff1d(df.index, df1.index)], df1])
print (df)
value
period
2000-01-01 100
2000-04-01 200
2000-07-01 350
2000-10-01 450
2001-01-01 550
2001-04-01 600
2001-07-01 700
Is that what you want?
In [151]: pd.concat([df1, df.loc[df.index.difference(df1.index)]]).sort_index()
Out[151]:
value
period
2000-01-01 100
2000-04-01 200
2000-07-01 350
2000-10-01 450
2001-01-01 550
2001-04-01 600
2001-07-01 700
PS make sure that both indices are of the same dtype - it's better to convert them to datetime
dtype, using pd.to_datetime()
method
Another option with append
and drop_duplicates
d1 = df1.append(df)
d1[~d1.index.duplicated()]
value
period
2000-07-01 350
2000-10-01 450
2001-01-01 550
2001-04-01 600
2001-07-01 700
2000-01-01 100
2000-04-01 200
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With