I have two pandas DataFrames: weight has a simple Index on the Land Use column, and concentration has a MultiIndex on Land Use and Parameter.
import pandas
from io import StringIO
conc_string = StringIO("""\
Land Use,Parameter,1E,1N,1S,2
Airfield,BOD5 (mg/l),0.418,0.118,0.226,1.063
Airfield,Ortho P (mg/l),0.002,0.001,0.001,0.002
Airfield,TSS (mg/l),1.773,11.47,0.862,0.183
Airfield,Zn (mg/l),0.001,0.001,4.95E-05,0.001
"Commercial",BOD5 (mg/l),0.036,0.0419,,0.315
"Commercial",Cu (mg/l),4.37E-05,7.34E-05,,0.00039
"Commercial",O&G (mg/l),0.0385,0.127,,0.263
Open Space,TSS (mg/l),0.371,3.01,1.209,0.147
Open Space,Zn (mg/l),0.0127,0.0069,0.0132,0.007
"Parking Lot",BOD5 (mg/l),0.924,0.0668,2.603,3.19
"Parking Lot",O&G (mg/l),1.02,0.149,1.347,1.88
"Rooftops",BOD5 (mg/l),0.135,1.00,0.0562,0.310""")
weight_string = StringIO("""\
Land Use,1E,1N,1S,2
Airfield,0.511,0.0227,0.0616,0.394
Commercial,0.0005,0.1704,0,0.1065
Open Space,0.0008,0.005,0.0002,0.0004
"Parking Lot",0.33,0.514,0.252,0.171
Rooftops,0.081,0.028,8.50E-05,0.003""")
concentration = pandas.read_csv(conc_string, index_col=[0,1])
weight = pandas.read_csv(weight_string, index_col=0)
In this case, the columns (1E, 1N, 1S, and 2) are drainage basins.
What I would like to do is divide all of the concentrations, independent of Parameter, by the weights wherever the basin (column name) and Land Use match.
I'm not having much luck here. concentration / weight certainly doesn't work. I'm not having much luck stacking the DataFrames and joining either:
wstk = pandas.DataFrame(weight.stack())
wstk.index.names = ['Land Use', 'Basin']
wstk.rename(columns={0:'weight'}, inplace=True)
cstk = pandas.DataFrame(concentration.stack())
cstk.index.names = ['Land Use', 'Parameter', 'Basin']
cstk.rename(columns={0:'concentration'}, inplace=True)
wstk.join(cstk, on=['Land Use', 'Basin']) # fails
cstk.join(wstk, on=['Land Use', 'Basin']) # fails
The last two lines don't raise an error when I leave off the on kwarg, but return NaN results for the joined column. They also fail if I drop the index on both stacked DataFrames (e.g., do wstk.reset_index(inplace=True) before the join).
Any suggestions?
Use the DataFrame div method and pass the name of the MultiIndex level you want to broadcast across as the level argument.
From the documentation for div:
level : int or name
Broadcast across a level, matching Index values on the
passed MultiIndex level
In [39]: concentration.div(weight, level='Land Use')
Out[39]:
                                    1E          1N           1S           2
Land Use    Parameter
Airfield    BOD5 (mg/l)         0.818004    5.198238     3.668831    2.697970
            Ortho P (mg/l)      0.003914    0.044053     0.016234    0.005076
            TSS (mg/l)          3.469667  505.286344    13.993506    0.464467
            Zn (mg/l)           0.001957    0.044053     0.000804    0.002538
Commercial  BOD5 (mg/l)        72.000000    0.245892          NaN    2.957746
            Cu (mg/l)           0.087400    0.000431          NaN    0.003662
            O&G (mg/l)         77.000000    0.745305          NaN    2.469484
Open Space  TSS (mg/l)        463.750000  602.000000  6045.000000  367.500000
            Zn (mg/l)          15.875000    1.380000    66.000000   17.500000
Parking Lot BOD5 (mg/l)         2.800000    0.129961    10.329365   18.654971
            O&G (mg/l)          3.090909    0.289883     5.345238   10.994152
Rooftops    BOD5 (mg/l)         1.666667   35.714286   661.176471  103.333333
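If you want to check the result independently, here is a minimal sketch on a three-row subset of the question's data. It computes the ratio once with div(..., level='Land Use') and once by reshaping both frames to long format with melt and merging on ordinary columns. The melt-and-merge route is a swapped-in alternative to the stack-and-join attempt in the question, not part of the answer above, and the names c_long, w_long, by_level and by_merge are made up for illustration.

```python
import pandas

# Three rows taken from the question's concentration table.
concentration = pandas.DataFrame(
    {"1E": [0.418, 0.002, 0.036], "1N": [0.118, 0.001, 0.0419]},
    index=pandas.MultiIndex.from_tuples(
        [
            ("Airfield", "BOD5 (mg/l)"),
            ("Airfield", "Ortho P (mg/l)"),
            ("Commercial", "BOD5 (mg/l)"),
        ],
        names=["Land Use", "Parameter"],
    ),
)

# Matching rows from the question's weight table.
weight = pandas.DataFrame(
    {"1E": [0.511, 0.0005], "1N": [0.0227, 0.1704]},
    index=pandas.Index(["Airfield", "Commercial"], name="Land Use"),
)

# Route 1: broadcast the weights across the "Land Use" level.
by_level = concentration.div(weight, level="Land Use")

# Route 2 (illustrative alternative): long format, merge on plain columns.
c_long = concentration.reset_index().melt(
    id_vars=["Land Use", "Parameter"],
    var_name="Basin",
    value_name="concentration",
)
w_long = weight.reset_index().melt(
    id_vars=["Land Use"], var_name="Basin", value_name="weight"
)
merged = c_long.merge(w_long, on=["Land Use", "Basin"], how="left")
merged["ratio"] = merged["concentration"] / merged["weight"]

# Back to the wide layout: one column per basin.
by_merge = (
    merged.set_index(["Land Use", "Parameter", "Basin"])["ratio"]
    .unstack("Basin")
)

print(by_level)
print(by_merge)
```

Both routes should print the same numbers. The div call is shorter and keeps the original MultiIndex, while the merge route can be handier if you need the weights and concentrations side by side in one long table.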