I have a dataframe that looks like this:
        one                                  two                                three
          1                 2                 1                 2                 1                 2
          X        Y        X        Y        X        Y        X        Y        X        Y        X        Y
a       0.3     -0.6     -0.3     -0.2  1.5e+00      0.3 -1.0e+00      1.2      0.6 -9.8e-02     -0.4      0.4
b      -0.6     -0.4     -1.1      2.3 -7.4e-02      0.7 -7.4e-02     -0.5     -0.3 -6.8e-01      1.1     -0.1
How do I divide all elements of df by df["three"]?
I tried df.div(df["three"],level=[1,2]) with no luck.
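Here's a one-liner. It repeats the 'three' block three times (once per top-level column group) and divides by position.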
df / pd.concat( [ df.three ] * 3, axis=1 ).values
And here's another way that is a little less concise but may be more readable.
df2 = df.copy()
for c in df.columns.levels[0]:
    df2[c] = df[c] / df['three']
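As a quick sanity check, the two snippets above can be run side by side. The frame below is made up for illustration (it is not the data from the question), and it assumes every top-level group has the same inner columns in the same order, which the one-liner relies on.

```python
import numpy as np
import pandas as pd

# Hypothetical 2-level frame, built only for this check.
cols = pd.MultiIndex.from_product([["one", "two", "three"], ["X", "Y"]])
df = pd.DataFrame(np.arange(1.0, 13.0).reshape(2, 6), index=["a", "b"], columns=cols)

# One-liner: repeat the 'three' block once per top-level group, divide by position.
one_liner = df / pd.concat([df["three"]] * 3, axis=1).values

# Loop: divide each top-level group by the 'three' group of the original frame.
df2 = df.copy()
for c in df.columns.levels[0]:
    df2[c] = df[c] / df["three"]

# Both routes should give the same numbers.
assert np.allclose(one_liner.to_numpy(), df2.to_numpy())
```

The loop reads from df and writes to df2, so it does not matter in which order the top-level groups are visited.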
And finally, here is a longer solution with more of an explanation. I did it this way originally before realizing there were better ways. But I'll keep it here as it is more informative about what is happening behind the scenes on an operation like this (though possibly overkill).
First off, a multi-indexed dataframe doesn't copy and paste well, so I'll create a sample dataframe that is pretty similar.
import numpy as np
import pandas as pd

np.random.seed(123)
tuples = list(zip(*[['one', 'one', 'two', 'two', 'three', 'three'],
                    ['foo', 'bar', 'foo', 'bar', 'foo', 'bar']]))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
df = pd.DataFrame(np.random.randn(3, 6), index=['A', 'B', 'C'], columns=index)
first        one                 two               three
second       foo       bar       foo       bar       foo       bar
A      -1.085631  0.997345  0.282978 -1.506295 -0.578600  1.651437
B      -2.426679 -0.428913  1.265936 -0.866740 -0.678886 -0.094709
C       1.491390 -0.638902 -0.443982 -0.434351  2.205930  2.186786
The simplest approach is likely to repeat the denominator three times (once per top-level column group) so that it matches the shape of the full dataframe. Alternatively you could loop over the columns, but then you have to re-combine them afterwards, which may not be as easy as you'd think in the case of a multi-index. So broadcast column 'three' like this.
denom = pd.concat( [df['three']]*3, axis=1 )
denom = pd.DataFrame( denom.values, columns=df.columns, index=df.index )
first        one                 two               three
second       foo       bar       foo       bar       foo       bar
A      -0.578600  1.651437 -0.578600  1.651437 -0.578600  1.651437
B      -0.678886 -0.094709 -0.678886 -0.094709 -0.678886 -0.094709
C       2.205930  2.186786  2.205930  2.186786  2.205930  2.186786
The first 'denom' line just expands the 'three' column to the same shape as the existing dataframe. The second 'denom' line is necessary so that the row and column indices match those of df. Now you can just write a normal divide operation.
df / denom
first        one                 two               three
second       foo       bar       foo       bar       foo       bar
A       1.876305  0.603926 -0.489074 -0.912112         1         1
B       3.574501  4.528744 -1.864725  9.151619         1         1
C       0.676082 -0.292165 -0.201267 -0.198625         1         1
A quick note on the one-liner relative to this longer solution. The .values in the one-liner converts the dataframe to an array, which has the convenient side effect of erasing the row and column indices, so the division is done purely by position. In the longer solution I instead explicitly conform the indices. Depending on your situation, either approach could be the better way to go.
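The hard-coded 3 in both versions is the number of top-level column groups. Here is a minimal sketch of how the same two denominators could be built without that constant. It uses np.tile in place of pd.concat (my substitution, not what the snippets above do), the frame is hypothetical, and it assumes the top-level groups are contiguous and share the same inner columns in the same order.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with the same column layout as the sample above.
cols = pd.MultiIndex.from_product([["one", "two", "three"], ["foo", "bar"]],
                                  names=["first", "second"])
df = pd.DataFrame(np.arange(1.0, 19.0).reshape(3, 6),
                  index=["A", "B", "C"], columns=cols)

# Number of top-level groups, read from the columns instead of hard-coded.
n_groups = df.columns.get_level_values("first").nunique()

# Unlabeled denominator: the 'three' block tiled side by side n_groups times.
denom_array = np.tile(df["three"].to_numpy(), (1, n_groups))

# Labeled denominator: the same numbers wrapped in df's own row and column labels.
denom_frame = pd.DataFrame(denom_array, index=df.index, columns=df.columns)

by_position = df / denom_array   # element by element, labels play no role
by_label = df / denom_frame      # aligned on labels first, then divided

# With identical labels the two results should match exactly.
pd.testing.assert_frame_equal(by_position, by_label)
```

Both results keep df's labels; the difference is only in whether pandas gets a chance to align the denominator before dividing.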
Another approach is to use stack() to move the inner column levels into the row index, so that df_stacked['three'] is a pd.Series, divide by that, and then restore the structure using unstack().
First I'll create a dataframe with the same structure as your example:
import collections, itertools
import pandas as pd
elems = itertools.count()
df = pd.DataFrame(collections.OrderedDict(((a, b, c), {'a': next(elems), 'b': next(elems)}) for a in ['one', 'two', 'three'] for b in ['1', '2'] for c in ['X', 'Y']))
This gives:
   one             two           three
     1       2       1       2       1       2
     X   Y   X   Y   X   Y   X   Y   X   Y   X   Y
a    0   2   4   6   8  10  12  14  16  18  20  22
b    1   3   5   7   9  11  13  15  17  19  21  23
Then to do the division:
df_stacked = df.stack(level=[1, 2])
result_stacked = df_stacked.div(df_stacked['three'], axis=0)
result = result_stacked.unstack(level=[-2, -1])
It's necessary to reindex the result, since it seems like unstack/stack can change the ordering.
result = result.reindex_like(df)
This gives:
         one                                     two                                   three
           1                   2                   1                   2                   1                   2
           X         Y         X         Y         X         Y         X         Y         X         Y         X         Y
a   0.000000  0.111111  0.200000  0.272727  0.500000  0.555556  0.600000  0.636364       1.0       1.0       1.0       1.0
b   0.058824  0.157895  0.238095  0.304348  0.529412  0.578947  0.619048  0.652174       1.0       1.0       1.0       1.0
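To check that the stack/unstack round trip returns the same numbers as plain positional division, here is a small self-contained sketch. The frame is built with pd.MultiIndex.from_product rather than the OrderedDict construction above, purely for brevity, and np.tile is used for the reference result; both are my choices for illustration, as are the values in the frame.

```python
import numpy as np
import pandas as pd

# Hypothetical 3-level frame with the same column layout as the question.
cols = pd.MultiIndex.from_product([["one", "two", "three"], ["1", "2"], ["X", "Y"]])
df = pd.DataFrame(np.arange(1.0, 25.0).reshape(2, 12), index=["a", "b"], columns=cols)

# Move the two inner column levels into the rows; the columns are now one/two/three.
df_stacked = df.stack(level=[1, 2])

# df_stacked['three'] is a Series, so it can be divided row-wise into every column.
result_stacked = df_stacked.div(df_stacked["three"], axis=0)

# Move the inner levels back to the columns and restore the original ordering.
result = result_stacked.unstack(level=[-2, -1]).reindex_like(df)

# Reference: positional division by the 'three' block repeated three times.
expected = df.to_numpy() / np.tile(df["three"].to_numpy(), (1, 3))
assert np.allclose(result.to_numpy(), expected)
```

Unlike the positional approaches, this one matches the denominator to each column through the index labels, so it does not depend on the groups being laid out in the same order.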