I have the following hourly dataframe dfA:
Date/Time Value1 Value2
01.03.2010 00:00:00 60 10
01.03.2010 01:00:00 50 20
01.03.2010 02:00:00 52 30
01.03.2010 03:00:00 49 40
.
.
.
31.12.2013 23:00:00 77 50
And I have a second dataframe dfB with annual vaules:
Date/Time Value1 Value2
31.12.2010 1.5 0.9
31.12.2011 1.6 1.1
31.12.2012 1.7 2.3
31.12.2013 1.3 0.6
I would like to multiply each hourly value in dfA with the factor of the corresponding year in dataframe dfB.
The result should look like this:
Date/Time Value1 Value2
01.03.2010 00:00:00 90 9
01.03.2010 01:00:00 75 18
01.03.2010 02:00:00 78 27
01.03.2010 03:00:00 73.5 36
.
.
.
31.12.2013 23:00:00 100.1 30
I have been trying with dfC = dfA*dfB[dfA.index.year()]
but I am getting the error TypeError: 'numpy.ndarray' object is not callable
.
Can anyone help me with this?
You can try append to index of df1
df1.index.year
, then change index of df2
to years
and then use mul
:
print df1
Value1 Value2
Date/Time
2010-01-03 00:00:00 60 10
2010-01-03 01:00:00 50 20
2010-01-03 02:00:00 52 30
2010-01-03 03:00:00 49 40
2013-12-31 23:00:00 77 50
print df2
Value1 Value2
Date/Time
2010-12-31 1.5 0.9
2011-12-31 1.6 1.1
2012-12-31 1.7 2.3
2013-12-31 1.3 0.6
df1 = df1.set_index(df1.index.year, append=True)
df2.index = df2.index.year
print df1
Value1 Value2
Date/Time
2010-01-03 00:00:00 2010 60 10
2010-01-03 01:00:00 2010 50 20
2010-01-03 02:00:00 2010 52 30
2010-01-03 03:00:00 2010 49 40
2013-12-31 23:00:00 2013 77 50
print df2
Value1 Value2
2010 1.5 0.9
2011 1.6 1.1
2012 1.7 2.3
2013 1.3 0.6
print df1.mul(df2, level=1).reset_index(drop=True, level=1)
Value1 Value2
Date/Time
2010-01-03 00:00:00 90.0 9
2010-01-03 01:00:00 75.0 18
2010-01-03 02:00:00 78.0 27
2010-01-03 03:00:00 73.5 36
2013-12-31 23:00:00 100.1 30
The easiest way is probably to upsample the lower frequency series / data frame to the higher frequency, and then multiply.
In [82]: s1 = pd.Series(np.random.randn(31), index=pd.date_range(start='2015-01-01', end='2015-01-31', freq='d'))
In [83]: s2 = pd.Series(np.random.randn(4), index=pd.date_range(start='2015-01-01', end='2015-01-31', freq='W'))
In [84]: s2.resample('d').mean().reindex(s1.index).ffill().bfill() * s1
Out[84]:
2015-01-01 0.361944
2015-01-02 2.806391
2015-01-03 0.741745
2015-01-04 0.855619
2015-01-05 -2.127828
...
2015-01-27 0.533919
2015-01-28 0.792278
2015-01-29 1.722754
2015-01-30 0.822032
2015-01-31 0.729741
Freq: D, dtype: float64
We need that additional redindex(s1.index)
in this case since the start and end don't align exactly.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With