Divide entire pandas multiIndex dataframe by dataframe variable

Tags: python, pandas
I have a multi-index dataframe in the form below. How can I divide all values in the dataframe by df['three']?

          one                  two                three              
Number      1      2      3      1      2      3      1      2      3
Name                                                                 
grethe -0.299 -1.444 -0.920  1.378  0.376 -0.396  0.518 -0.816 -0.329
hans    0.493  1.183 -0.741 -0.267 -0.564  0.281  1.550  0.544 -0.892

When I try this,

>>> df.div(df['three'])

or this

>>> df / df['three']

I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 218, in f
    return self._combine_frame(other, na_op, fill_value, level)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 3819, in _combine_frame
    this, other = self.align(other, join='outer', level=level, copy=False)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 2490, in align
    fill_axis=fill_axis)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 2521, in _align_frame
    fill_value=fill_value)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 2732, in _reindex_with_indexers
    fill_value=fill_value)
  File "C:\Anaconda\lib\site-packages\pandas\core\internals.py", line 1976, in reindex_indexer
    return self._reindex_indexer_items(new_axis, indexer, fill_value)
  File "C:\Anaconda\lib\site-packages\pandas\core\internals.py", line 2020, in _reindex_indexer_items
    return BlockManager(new_blocks, new_axes)
  File "C:\Anaconda\lib\site-packages\pandas\core\internals.py", line 1007, in __init__
    self._set_ref_locs(do_refs=True)
  File "C:\Anaconda\lib\site-packages\pandas\core\internals.py", line 1117, in _set_ref_locs
    "does not have _ref_locs set" % (block,labels))
AssertionError: cannot create BlockManager._ref_locs because block [FloatBlock: [1, 2, 3, one, one, one, three, three, three, two, two, two], 12 x 2, dtype float64] with duplicate items [Index([u'1', u'2', u'3', u'one', u'one', u'one', u'three', u'three', u'three', u'two', u'two', u'two'], dtype=object)] does not have _ref_locs set

I have also tried stacking like this, with no luck.

>>> df.stack().div(df.stack()['three']).unstack()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 220, in f
    return self._combine_series(other, na_op, fill_value, axis, level)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 3860, in _combine_series
    return self._combine_match_columns(other, func, fill_value)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 3893, in _combine_match_columns
    left, right = self.align(other, join='outer', axis=1, copy=False)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 2495, in align
    fill_axis=fill_axis)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 2562, in _align_series
    right_result = other if ridx is None else other.reindex(join_index)
  File "C:\Anaconda\lib\site-packages\pandas\core\series.py", line 2643, in reindex
    takeable=takeable)
  File "C:\Anaconda\lib\site-packages\pandas\core\index.py", line 2178, in reindex
    target = MultiIndex.from_tuples(target)
  File "C:\Anaconda\lib\site-packages\pandas\core\index.py", line 1799, in from_tuples
    arrays = list(lib.tuples_to_object_array(tuples).T)
  File "inference.pyx", line 914, in pandas.lib.tuples_to_object_array (pandas\lib.c:43497)
TypeError: Expected tuple, got str

asked Oct 21 '13 18:10

bjornarneson

1 Answer

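div, mul, add, and sub all accept a level keyword that broadcasts the other operand across one level of a MultiIndex. Passing level=1 matches the columns of df['three'] (1, 2, 3) against the second column level, so each of one, two, and three is divided by three: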

In [5]: import numpy as np
   ...: from pandas import DataFrame, MultiIndex
   ...: df = DataFrame(np.random.randn(2, 9),
   ...:                index=['a', 'b'],
   ...:                columns=MultiIndex.from_tuples(
   ...:                    [(x, y + 1) for x in ['one', 'two', 'three'] for y in range(3)]))

In [6]: df
Out[6]: 
        one                           two                         three                    
          1         2         3         1         2         3         1         2         3
a -0.558978 -1.297585  0.150898 -1.592941  0.124235 -1.749024  1.137611 -0.389676 -1.764254
b -1.366228 -1.192569 -1.384278 -0.970848  0.943373  0.508993 -0.451004  0.335807 -0.122192

In [7]: df.div(df['three'],level=1)
Out[7]: 
        one                            two                      three      
          1         2          3         1         2         3      1  2  3
a -0.491362  3.329910  -0.085531 -1.400251 -0.318815  0.991367      1  1  1
b  3.029306 -3.551347  11.328717  2.152638  2.809269 -4.165522      1  1  1
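
For anyone who wants to run this outside an IPython session, here is a minimal standalone sketch of the same idea. It assumes a reasonably recent pandas and NumPy. The data is made up (drawn from a range that avoids zero divisors), and the level name Number is borrowed from the frame in the question. The cross-check with np.tile is only there to show what the broadcast amounts to; it is not part of the technique.

```python
import numpy as np
import pandas as pd

# Two-level columns: outer labels one/two/three, inner numbers 1..3.
# The inner level is named "Number", as in the question's frame.
columns = pd.MultiIndex.from_product(
    [["one", "two", "three"], [1, 2, 3]], names=[None, "Number"]
)

# Illustrative data only; values lie in [1, 2) so no divisor is zero.
rng = np.random.default_rng(0)
df = pd.DataFrame(
    rng.uniform(1.0, 2.0, size=(2, 9)),
    index=["grethe", "hans"],
    columns=columns,
)

# df["three"] has plain columns 1, 2, 3. level=1 aligns them with the
# inner column level, so every outer group is divided by the same block.
by_position = df.div(df["three"], level=1)

# The level can also be referred to by its name instead of its position.
by_name = df.div(df["three"], level="Number")

# Cross-check: repeat the divisor block once per outer group by hand.
expected = df.to_numpy() / np.tile(df["three"].to_numpy(), 3)

print(by_position.round(3))
```

The three block of the result should be all ones, as in Out[7] above, and both calls should agree with the manual np.tile version.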

answered Oct 20 '22 07:10

Jeff
