Divide entire pandas multiIndex dataframe by dataframe variable

Tags:

python

pandas

I have a multi-index dataframe in the form below. How can I divide all values in the dataframe by df['three']?

          one                  two                three              
Number      1      2      3      1      2      3      1      2      3
Name                                                                 
grethe -0.299 -1.444 -0.920  1.378  0.376 -0.396  0.518 -0.816 -0.329
hans    0.493  1.183 -0.741 -0.267 -0.564  0.281  1.550  0.544 -0.892

When I try this,

>>> df.div(df['three'])

or this

>>> df / df['three']

I get this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 218, in f
    return self._combine_frame(other, na_op, fill_value, level)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 3819, in _combine_frame
    this, other = self.align(other, join='outer', level=level, copy=False)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 2490, in align
    fill_axis=fill_axis)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 2521, in _align_frame
    fill_value=fill_value)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 2732, in _reindex_with_indexers
    fill_value=fill_value)
  File "C:\Anaconda\lib\site-packages\pandas\core\internals.py", line 1976, in reindex_indexer
    return self._reindex_indexer_items(new_axis, indexer, fill_value)
  File "C:\Anaconda\lib\site-packages\pandas\core\internals.py", line 2020, in _reindex_indexer_items
    return BlockManager(new_blocks, new_axes)
  File "C:\Anaconda\lib\site-packages\pandas\core\internals.py", line 1007, in __init__
    self._set_ref_locs(do_refs=True)
  File "C:\Anaconda\lib\site-packages\pandas\core\internals.py", line 1117, in _set_ref_locs
    "does not have _ref_locs set" % (block,labels))
AssertionError: cannot create BlockManager._ref_locs because block [FloatBlock: [1, 2, 3, one, one, one, three, three, three, two, two, two], 12 x 2, dtype float64] with duplicate items [Index([u'1', u'2', u'3', u'one', u'one', u'one', u'three', u'three', u'three', u'two', u'two', u'two'], dtype=object)] does not have _ref_locs set

I have also tried stacking like this, with no luck.

>>> df.stack().div(df.stack()['three']).unstack()

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 220, in f
    return self._combine_series(other, na_op, fill_value, axis, level)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 3860, in _combine_series
    return self._combine_match_columns(other, func, fill_value)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 3893, in _combine_match_columns
    left, right = self.align(other, join='outer', axis=1, copy=False)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 2495, in align
    fill_axis=fill_axis)
  File "C:\Anaconda\lib\site-packages\pandas\core\frame.py", line 2562, in _align_series
    right_result = other if ridx is None else other.reindex(join_index)
  File "C:\Anaconda\lib\site-packages\pandas\core\series.py", line 2643, in reindex
    takeable=takeable)
  File "C:\Anaconda\lib\site-packages\pandas\core\index.py", line 2178, in reindex
    target = MultiIndex.from_tuples(target)
  File "C:\Anaconda\lib\site-packages\pandas\core\index.py", line 1799, in from_tuples
    arrays = list(lib.tuples_to_object_array(tuples).T)
  File "inference.pyx", line 914, in pandas.lib.tuples_to_object_array (pandas\lib.c:43497)
TypeError: Expected tuple, got str
asked Oct 21 '13 18:10 by bjornarneson



1 Answer

div/mul/add/sub accept a level keyword that allows this type of broadcasting: the columns of the other frame are matched against the given level of the MultiIndex.

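In [5]: import numpy as np
   ...: from pandas import DataFrame, MultiIndex
   ...: # 2 rows x 9 columns; column levels are (one|two|three, 1|2|3)
   ...: df = DataFrame(np.random.randn(2, 9),
   ...:                index=['a', 'b'],
   ...:                columns=MultiIndex.from_tuples(
   ...:                    [(x, y + 1) for x in ['one', 'two', 'three'] for y in range(3)]))

In [6]: df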
Out[6]: 
        one                           two                         three                    
          1         2         3         1         2         3         1         2         3
a -0.558978 -1.297585  0.150898 -1.592941  0.124235 -1.749024  1.137611 -0.389676 -1.764254
b -1.366228 -1.192569 -1.384278 -0.970848  0.943373  0.508993 -0.451004  0.335807 -0.122192

In [7]: df.div(df['three'], level=1)
Out[7]: 
        one                            two                      three      
          1         2          3         1         2         3      1  2  3
a -0.491362  3.329910  -0.085531 -1.400251 -0.318815  0.991367      1  1  1
b  3.029306 -3.551347  11.328717  2.152638  2.809269 -4.165522      1  1  1
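
For reference, here is a self-contained sketch of the same idea that can be run as a script. It is not part of the original session: the seeded random data and the names rng, columns, divisor and result are assumptions made for illustration, and the check against plain NumPy division is only there to show what level=1 is expected to do.

```python
import numpy as np
import pandas as pd

# Illustrative data only: seeded random values, same layout as the frame above.
rng = np.random.default_rng(0)
columns = pd.MultiIndex.from_product([["one", "two", "three"], [1, 2, 3]])
df = pd.DataFrame(rng.standard_normal((2, 9)), index=["a", "b"], columns=columns)

# Selecting df["three"] drops the outer level, leaving columns 1, 2, 3.
divisor = df["three"]

# level=1 matches the divisor's columns against the inner column level,
# so every outer group is divided element-wise by the 'three' group.
result = df.div(divisor, level=1)

# Cross-check each group against a plain NumPy division.
for group in ["one", "two", "three"]:
    expected = df[group].to_numpy() / divisor.to_numpy()
    assert np.allclose(result[group].to_numpy(), expected)
```

If the column levels are named, as with Number in the question, passing the level name instead of its position (level='Number') should behave the same way.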
answered Oct 20 '22 07:10 by Jeff