I want to multiply a lookup table (demand), given for multiple commodities (here: Water, Elec) and area types (Com, Ind, Res), with a DataFrame (areas) that is a table of areas for these area types.
import pandas as pd
areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]})
demand = pd.DataFrame({'Water':[4,3],
                       'Elec':[8,9]}, index=['Com', 'Ind'])
Before:
areas
   Com  Ind
0    1    4
1    2    5
2    3    6
demand
     Elec  Water
Com     8      4
Ind     9      3
After:
area_demands
   Com         Ind
  Elec  Water Elec  Water
0    8      4   36     12
1   16      8   45     15
2   24     12   54     18
My attempt
Verbose and incomplete; it does not work for an arbitrary number of commodities.
areas = pd.DataFrame({'area': areas.stack()})
areas.index.names = ['Edge', 'Type']
both = areas.reset_index(1).join(demand, on='Type')
both['Elec'] = both['Elec'] * both['area']
both['Water'] = both['Water'] * both['area']
del both['area']
# almost there; it must be late, I fail to make 'Type' a hierarchical column...
Almost there:
     Type  Elec  Water
Edge
0     Com     8      4
0     Ind    36     12
1     Com    16      8
1     Ind    45     15
2     Com    24     12
2     Ind    54     18
In short
How can I join/multiply the DataFrames areas and demand together in a decent way?
import pandas as pd
areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]})
demand = pd.DataFrame({'Water':[4,3],
                       'Elec':[8,9]}, index=['Com', 'Ind'])
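def multiply_by_demand(series):
    # series is one column of areas; series.name is its area type (e.g. 'Com').
    # .loc replaces the .ix indexer, which newer pandas versions no longer provide.
    return demand.loc[series.name].apply(lambda x: x*series).stack()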
df = areas.apply(multiply_by_demand).unstack(0)
print(df)
yields
   Com         Ind
  Elec  Water Elec  Water
0    8      4   36     12
1   16      8   45     15
2   24     12   54     18
How this works:
First, look at what happens when we call areas.apply(foo). foo gets passed the columns of areas one by one:
def foo(series):
    print(series)
In [226]: areas.apply(foo)
0 1
1 2
2 3
Name: Com, dtype: int64
0 4
1 5
2 6
Name: Ind, dtype: int64
So suppose series is one such column:
In [230]: series = areas['Com']
In [231]: series
Out[231]:
0 1
1 2
2 3
Name: Com, dtype: int64
We can multiply demand by this series this way:
In [229]: demand.loc['Com'].apply(lambda x: x*series)
Out[229]:
       0   1   2
Elec   8  16  24
Water  4   8  12
This has half the numbers we want, but not in the form we want them.
Now the function passed to apply needs to return a Series, not a DataFrame. One way to turn a DataFrame into a Series is to use stack. Look at what happens if we stack this DataFrame. The columns become a new level of the index:
In [232]: demand.loc['Com'].apply(lambda x: x*areas['Com']).stack()
Out[232]:
Elec   0     8
       1    16
       2    24
Water  0     4
       1     8
       2    12
dtype: int64
So, using this as the return value of multiply_by_demand, we get:
In [235]: areas.apply(multiply_by_demand)
Out[235]:
         Com  Ind
Elec  0    8   36
      1   16   45
      2   24   54
Water 0    4   12
      1    8   15
      2   12   18
Now we want the first level of the index to become columns. This can be done with unstack:
In [236]: areas.apply(multiply_by_demand).unstack(0)
Out[236]:
   Com         Ind
  Elec  Water Elec  Water
0    8      4   36     12
1   16      8   45     15
2   24     12   54     18
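For what it's worth, the same table can also be built without going through apply and stack. The following is only a sketch of an alternative, not the method explained above. It assumes a pandas version where pd.concat accepts a dict of DataFrames with axis=1 (the dict keys then become the outer column level), and demands_for is a helper name made up for this example.

```python
import pandas as pd

areas = pd.DataFrame({'Com': [1, 2, 3], 'Ind': [4, 5, 6]})
demand = pd.DataFrame({'Water': [4, 3], 'Elec': [8, 9]}, index=['Com', 'Ind'])


def demands_for(area_type):
    # Hypothetical helper: one column per commodity, each being the area
    # column of this type scaled by the matching entry of the lookup table.
    return pd.DataFrame({commodity: areas[area_type] * demand.loc[area_type, commodity]
                         for commodity in demand.columns})


# Dict keys (area types) become the outer level of the column MultiIndex.
area_demands = pd.concat({t: demands_for(t) for t in areas.columns}, axis=1)
# Sort the columns so the commodities appear alphabetically within each type.
area_demands = area_demands.sort_index(axis=1)
print(area_demands)
```

Because the loops run over demand.columns and areas.columns, nothing here is tied to the two commodities or two area types of the example.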
Per the request in the comments, here is the pivot_table solution:
import pandas as pd
areas = pd.DataFrame({'Com':[1,2,3], 'Ind':[4,5,6]})
demand = pd.DataFrame({'Water':[4,3],
                       'Elec':[8,9]}, index=['Com', 'Ind'])
areas = pd.DataFrame({'area': areas.stack()})
areas.index.names = ['Edge', 'Type']
both = areas.reset_index(1).join(demand, on='Type')
both['Elec'] = both['Elec'] * both['area']
both['Water'] = both['Water'] * both['area']
both.reset_index(inplace=True)
# pivot_table's rows=/cols= keywords were renamed to index=/columns= in later pandas versions
both = both.pivot_table(values=['Elec', 'Water'], index='Edge', columns='Type')
both = both.reorder_levels([1,0], axis=1)
both = both.reindex(columns=both.columns[[0,2,1,3]])
print(both)
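The two hard-coded multiplications are what tie this version to Elec and Water. A possible generalisation is sketched here. It assumes a reasonably recent pandas in which DataFrame.pivot accepts a list for values, and it multiplies all commodity columns at once. The names long, commodities and wide are made up for this illustration.

```python
import pandas as pd

areas = pd.DataFrame({'Com': [1, 2, 3], 'Ind': [4, 5, 6]})
demand = pd.DataFrame({'Water': [4, 3], 'Elec': [8, 9]}, index=['Com', 'Ind'])

# Long format: one row per (Edge, Type) pair together with its area.
long = (areas.rename_axis('Edge')
             .reset_index()
             .melt(id_vars='Edge', var_name='Type', value_name='area'))

# Attach the demand row that belongs to each area type.
both = long.join(demand, on='Type')

# Scale every commodity column by the area, whatever the commodities are.
commodities = list(demand.columns)
both[commodities] = both[commodities].mul(both['area'], axis=0)

# Back to wide format; pivot puts the commodity on the outer column level,
# so swap the levels and sort to get Type on top.
wide = both.pivot(index='Edge', columns='Type', values=commodities)
wide = wide.swaplevel(axis=1).sort_index(axis=1)
print(wide)
```

Sorting the columns also removes the need for the positional reindex with [0,2,1,3], which only works for exactly two commodities and two area types.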