Pandas eval with multi-index dataframes

Tags: python, pandas

Consider a multi-index dataframe df:

A       bar                flux          
B       one     three       six     three
x  0.627915  0.507184  0.690787  1.166318
y  0.927342  0.788232  1.776677 -0.512259
z  1.000000  1.000000  1.000000  0.000000

I would like to use eval to subtract ('bar', 'one') from ('flux', 'six'). Does the eval syntax support this type of index?

Amelio Vazquez-Reina asked Feb 10 '15 00:02



1 Answer

You can do this without eval, using the equivalent standard Python indexing notation:

df['bar']['one'] - df['flux']['six']

Take a look at this reference. Below is an example for you, based off the object in your question:

from pandas import DataFrame, MultiIndex

# Create the object
columns = [
    ('bar', 'one'),
    ('bar', 'three'),
    ('flux', 'six'),
    ('flux', 'three')
]
data    = [
    [0.627915, 0.507184, 0.690787, 1.166318],
    [0.927342, 0.788232, 1.776677, -0.512259],
    [1.000000, 1.000000, 1.000000, 0.000000]
]
index   = MultiIndex.from_tuples(columns, names=['A', 'B'])
df      = DataFrame(data, index=['x', 'y', 'z'], columns=index)

# Calculate the difference
sub = df['bar']['one'] - df['flux']['six']
print(sub)

# Assign that difference to a new column in the object
df['new', 'col'] = sub
print(df)

The corresponding result is:

A       bar                flux                 new
B       one     three       six     three       col
x  0.627915  0.507184  0.690787  1.166318 -0.062872
y  0.927342  0.788232  1.776677 -0.512259 -0.849335
z  1.000000  1.000000  1.000000  0.000000  0.000000
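
If eval itself is required, one possible workaround is to flatten the two column levels into plain names first, so the expression only has to refer to ordinary identifiers. The following is a minimal sketch under that assumption, not a statement about what eval supports for tuple labels directly. The names flat and diff and the underscore-joining scheme are illustrative choices, not part of the original question or answer.

```python
import pandas as pd

# Rebuild the frame from the question.
columns = pd.MultiIndex.from_tuples(
    [('bar', 'one'), ('bar', 'three'), ('flux', 'six'), ('flux', 'three')],
    names=['A', 'B'],
)
data = [
    [0.627915, 0.507184, 0.690787, 1.166318],
    [0.927342, 0.788232, 1.776677, -0.512259],
    [1.000000, 1.000000, 1.000000, 0.000000],
]
df = pd.DataFrame(data, index=['x', 'y', 'z'], columns=columns)

# Work on a copy so the original MultiIndex columns stay intact.
flat = df.copy()

# Join each (A, B) tuple into a single identifier-safe name, e.g. 'bar_one'.
# (Assumed naming scheme; it only works if the joined names are valid identifiers.)
flat.columns = ['_'.join(col) for col in flat.columns]

# eval can now refer to the flattened columns by name.
# Same operand order as the answer above; swap the operands for the opposite sign.
diff = flat.eval('bar_one - flux_six')
print(diff)
```

The result is a Series indexed by x, y, z that can be assigned back to the original frame in the same way as sub above.
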

James Mnatzaganian answered Oct 19 '22 09:10