Pandas division (.div) with multiindex

Tags:

python

pandas

I have something similar to this

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(2, 10, size=(5, 2)))
df.index = pd.MultiIndex.from_tuples([(1, 'A'), (2, 'A'), (4, 'B'),
                                      (5, 'B'), (8, 'B')])
df.index.names = ['foo', 'bar']
df.columns = ['count1', 'count2']
df

which gives:

         count1  count2
foo bar                
1   A         6       7
2   A         2       9
4   B         6       7
5   B         4       6
8   B         5       6

I also have a set of totals (obtained from somewhere else), indexed by the same 'foo' values:

totals = pd.DataFrame([2., 1., 1., 1., 10.])
totals.index = [1, 2, 4, 5, 8]
totals.index.names = ['foo']
totals

which gives:

      0
foo    
1     2
2     1
4     1
5     1
8    10

How can I divide every column of df (count1 and count2) by the value in totals that has the same 'foo' number? (That is, I need to match rows by 'foo'.)

I checked this question, which looks like it should do the trick, but I couldn't figure it out.

I tried

df.div(totals, axis = 0)

and changing the level option in div, but no success.

As always, thanks a lot for your time

asked Oct 22 '13 16:10

cd98

2 Answers

Try passing the Series totals[0] and naming the index level to align on:

df.div(totals[0],axis='index',level='foo')

         count1  count2
foo bar                
1   A       1.0     4.5
2   A       4.0     8.0
4   B       5.0     9.0
5   B       5.0     5.0
8   B       0.9     0.5
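For anyone who wants to reproduce the call above, here is a minimal sketch. It assumes fixed counts in place of the random ones; the particular numbers are my own choice, picked so that the result matches the output shown above, and are not the asker's data.

```python
import pandas as pd

# Fixed counts instead of np.random so the result is reproducible.
# These numbers are an assumption chosen to match the output above.
index = pd.MultiIndex.from_tuples(
    [(1, 'A'), (2, 'A'), (4, 'B'), (5, 'B'), (8, 'B')],
    names=['foo', 'bar'],
)
df = pd.DataFrame(
    {'count1': [2, 4, 5, 5, 9], 'count2': [9, 8, 9, 5, 5]},
    index=index,
)

totals = pd.DataFrame(
    [2., 1., 1., 1., 10.],
    index=pd.Index([1, 2, 4, 5, 8], name='foo'),
)

# totals[0] is a Series indexed by 'foo'. level='foo' tells div to
# match it against that level of df's MultiIndex and broadcast over 'bar'.
result = df.div(totals[0], axis='index', level='foo')
print(result)
```

Note that the divisor is the Series totals[0] rather than the DataFrame totals; a DataFrame divisor would also be aligned on column labels, and totals has no column named count1 or count2.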

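Also, if totals carries the same two-level index, no level argument is needed. Rows whose (foo, bar) pair is missing from either side come out as NaN: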

totals = pd.DataFrame([2., 1., 1., 1., 10.])
totals.index = [[1, 2, 4, 5, 8],['A', 'A', 'B', 'A', 'B']]
totals.index.names = ['foo','bar']
totals

           0
foo bar      
1   A     2.0
2   A     1.0
4   B     1.0
5   A     1.0
8   B    10.0

df[['count1','count2']].div(totals[0],axis='index')

         count1  count2
foo bar                
1   A       1.0     4.5
2   A       4.0     8.0
4   B       5.0     9.0
5   A       NaN     NaN
    B       NaN     NaN
8   B       0.9     0.5
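The NaN rows come from label alignment: when both sides have a MultiIndex, div matches on the full (foo, bar) pair and keeps the union of the keys. A small sketch of that behaviour, using a hypothetical two-row frame of my own rather than the data above:

```python
import pandas as pd

# Hypothetical two-row example: the (5, 'B') row in df has no partner
# in totals, and the (5, 'A') row in totals has no partner in df.
df = pd.DataFrame(
    {'count1': [2, 4]},
    index=pd.MultiIndex.from_tuples([(1, 'A'), (5, 'B')], names=['foo', 'bar']),
)
totals = pd.Series(
    [2., 1.],
    index=pd.MultiIndex.from_tuples([(1, 'A'), (5, 'A')], names=['foo', 'bar']),
)

# Alignment uses the union of both indexes; unmatched keys become NaN.
result = df.div(totals, axis='index')

# Keep only the rows that were present on both sides.
matched = result.dropna()
print(result)
print(matched)
```

If the unmatched rows are not wanted, dropna() as above is one way to discard them; whether that is appropriate depends on whether a missing total should be treated as an error.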

answered Sep 20 '22 02:09

user8641707

Dividing by the underlying values array of totals[0] works:

df.div(totals[0].values, axis=0)

But it ignores the index of totals and divides purely by position, so it is only correct when totals is in the same row order as df. I don't know why this does not work:

df.div(totals[0], level=0, axis=0)
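To illustrate why the index matters, here is a small sketch comparing positional division with label-aligned division when totals is in a different row order than df. The data is hypothetical, and the label-aligned call uses the level argument by name, which is an assumption about a reasonably recent pandas version rather than the one used in this answer.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(1, 'A'), (2, 'A'), (4, 'B')], names=['foo', 'bar']
)
df = pd.DataFrame({'count1': [2, 4, 6]}, index=index)

# Hypothetical totals, deliberately listed in a different order than df.
totals = pd.Series([3., 1., 2.], index=pd.Index([4, 1, 2], name='foo'))

# Positional: row i of df is divided by element i of the array,
# regardless of which 'foo' value that element belongs to.
by_position = df.div(totals.values, axis=0)

# Label-aligned: each row is divided by the total with the same 'foo'.
by_label = df.div(totals, axis=0, level='foo')

print(by_position)
print(by_label)
```

With this data the label-aligned result is 2.0 for every row (2/1, 4/2, 6/3), while the positional result divides foo=1 by the total that belongs to foo=4.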

answered Sep 20 '22 02:09

Roman Pekar
