I have 2 dataframes:
>>> type(c)
Out[118]: pandas.core.frame.DataFrame
>>> type(N)
Out[119]: pandas.core.frame.DataFrame
>>> c
Out[114]:
t
2017-06-01 01:06:00 1.00
2017-06-01 01:13:00 1.00
2017-06-01 02:09:00 1.00
2017-06-26 22:47:00 1.00
>>> N
Out[115]:
0 1
2017-06-01 01:06:00 1.00 1.00
2017-06-01 01:13:00 1.00 1.00
2017-06-01 02:09:00 1.00 1.00
2017-06-26 22:47:00 1.00 1.00
I need to multiply these together to get a (4, 2) DataFrame in which each column of N is multiplied elementwise by c. I tried the following 4 approaches with no luck:
>>> N.multiply(c, axis='index')
Out[116]:
0 1 t
2017-06-01 01:06:00 nan nan nan
2017-06-01 01:13:00 nan nan nan
2017-06-01 02:09:00 nan nan nan
2017-06-26 22:47:00 nan nan nan
>>> c[:]*N
Out[98]:
0 1 t
2017-06-01 01:06:00 nan nan nan
2017-06-01 01:13:00 nan nan nan
2017-06-01 02:09:00 nan nan nan
2017-06-26 22:47:00 nan nan nan
>>> c*N
Out[99]:
0 1 t
2017-06-01 01:06:00 nan nan nan
2017-06-01 01:13:00 nan nan nan
2017-06-01 02:09:00 nan nan nan
2017-06-26 22:47:00 nan nan nan
>>> c[:, None]*N
Traceback (most recent call last):
File "C:\...pandas\core\frame.py", line 1797, in __getitem__
return self._getitem_column(key)
File "C:\...core\frame.py", line 1804, in _getitem_column
return self._get_item_cache(key)
File "C:\...core\generic.py", line 1082, in _get_item_cache
res = cache.get(item)
TypeError: unhashable type
Is there an easy way to do this, with or without broadcasting?
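The problem is that you are passing a DataFrame, so pandas tries to align the column names as well as the index. The columns of N (0 and 1) and the column of c (t) do not overlap, so every cell comes out as NaN. If you select the column t, you get a Series, and it will broadcast appropriately: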
N.mul(c['t'], axis=0)
Out:
0 1
2017-06-01 01:06:00 1.0 1.0
2017-06-01 01:13:00 1.0 1.0
2017-06-01 02:09:00 1.0 1.0
2017-06-26 22:47:00 1.0 1.0
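To make the difference concrete, here is a minimal, self-contained sketch. The sample values are made up for illustration (they are not the asker's data); only the shapes and labels mirror the question.

```python
import pandas as pd

# Illustrative data: same index and column labels as in the question,
# but with made-up values so the effect of the multiplication is visible.
idx = pd.to_datetime([
    "2017-06-01 01:06:00",
    "2017-06-01 01:13:00",
    "2017-06-01 02:09:00",
    "2017-06-26 22:47:00",
])
N = pd.DataFrame([[1.0, 2.0], [6.0, 5.0], [4.0, 3.0], [4.0, 7.0]], index=idx)
c = pd.DataFrame({"t": [6.0, 2.0, 8.0, 2.0]}, index=idx)

# DataFrame * DataFrame aligns on the index AND on the columns.
# Columns 0, 1 and "t" never match, so every cell is NaN.
aligned = N * c
print(aligned.isna().all().all())

# c["t"] is a Series; axis=0 aligns it with the row index of N,
# so each column of N is multiplied by it.
result = N.mul(c["t"], axis=0)
print(result)
```

Because the Series is matched on the index, this also works when the rows of the two objects are in a different order.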
In the case of numpy arrays, you don't need to specify anything. With shapes of (4, 2) and (4, 1), numpy stretches the length-1 axis to match the other array and broadcasts accordingly.
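A small sketch of that broadcasting rule, using plain arrays with made-up values. It also shows why the second array has to be 2-D: a flat array of length 4 does not broadcast against a (4, 2) array until an axis is added, which is what the `[:, None]` attempt in the question was aiming for.

```python
import numpy as np

a = np.array([[1.0, 2.0], [6.0, 5.0], [4.0, 3.0], [4.0, 7.0]])  # shape (4, 2)
col = np.array([[6.0], [2.0], [8.0], [2.0]])                    # shape (4, 1)

# The length-1 axis of `col` is stretched across both columns of `a`.
print((a * col).shape)

# A flat (4,) array fails: numpy compares trailing axes, and 2 != 4.
flat = col.ravel()
try:
    a * flat
except ValueError as exc:
    print("broadcast failed:", exc)

# Adding a trailing axis turns (4,) back into (4, 1), which broadcasts.
print(np.array_equal(a * flat[:, None], a * col))
```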
Consider the following DataFrames:
N
Out:
0 1
2017-06-01 01:06:00 1.0 2.0
2017-06-01 01:13:00 6.0 5.0
2017-06-01 02:09:00 4.0 3.0
2017-06-26 22:47:00 4.0 7.0
c
Out:
t
2017-06-01 01:06:00 6.0
2017-06-01 01:13:00 2.0
2017-06-01 02:09:00 8.0
2017-06-26 22:47:00 2.0
You can access the underlying array with the .values attribute, so
N.values * c.values
Out:
array([[ 6., 12.],
[ 12., 10.],
[ 32., 24.],
[ 8., 14.]])
will give you the same values as
N.mul(c['t'], axis=0)
Out:
0 1
2017-06-01 01:06:00 6.0 12.0
2017-06-01 01:13:00 12.0 10.0
2017-06-01 02:09:00 32.0 24.0
2017-06-26 22:47:00 8.0 14.0
But since the whole operation is done in numpy, the result is a plain array and you lose the index and column labels.
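If you prefer the numpy route but still want the labels, one option (a sketch, not something the answer above prescribes) is to wrap the array back into a DataFrame using the index and columns of N. This assumes both frames have their rows in the same order, since numpy multiplies by position rather than by label.

```python
import pandas as pd

idx = pd.to_datetime([
    "2017-06-01 01:06:00",
    "2017-06-01 01:13:00",
    "2017-06-01 02:09:00",
    "2017-06-26 22:47:00",
])
N = pd.DataFrame([[1.0, 2.0], [6.0, 5.0], [4.0, 3.0], [4.0, 7.0]], index=idx)
c = pd.DataFrame({"t": [6.0, 2.0, 8.0, 2.0]}, index=idx)

# Positional multiplication on the raw arrays: (4, 2) * (4, 1) -> (4, 2).
product = N.values * c.values

# Re-attach the labels from N to the unlabeled result.
labeled = pd.DataFrame(product, index=N.index, columns=N.columns)
print(labeled)
```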