
Pandas .loc with Tuple column names

I am currently working with a pandas DataFrame that uses tuples for column names. When I try to use .loc as I would for normal columns, the tuple names cause an error.

Test code is below:

import pandas as pd
import numpy as np
df1 = pd.DataFrame(np.random.randn(6,4),
                   columns=[('a','1'), ('b','2'), ('c','3'), 'nontuple'])
df1.loc[:3, 'nontuple']
df1.loc[:3, ('c','3')]

The first .loc call works as expected and displays the column 'nontuple' for rows 0 through 3. The second .loc call does not work and instead gives the error:

KeyError: "None of [('c', '3')] are in the [columns]"

Any idea how to resolve this issue short of not using tuples as column names?

Also, I have found that the code below works even though the .loc call doesn't:

df1.ix[:3][('c','3')]
ezl33 asked May 11 '16 20:05



1 Answer

Documentation

access by tuple (wrapped in a list), returns DF:

In [508]: df1.loc[:3, [('c', '3')]]
Out[508]:
     (c, 3)
0  1.433004
1 -0.731705
2 -1.633657
3  0.565320
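
A minimal sketch of the same idea with deterministic data (the np.arange values and the variable names are my own stand-ins, not from the question). As far as I can tell, .loc reads a bare tuple as a collection of separate labels rather than as one column name, so putting the tuple inside a list makes it a single label. Plain [] indexing is another route that appears to work, since it treats the tuple as one hashable key:

```python
import numpy as np
import pandas as pd

# Deterministic stand-in for the random data used in the question.
df = pd.DataFrame(
    np.arange(24, dtype=float).reshape(6, 4),
    columns=[('a', '1'), ('b', '2'), ('c', '3'), 'nontuple'],
)

# The tuple sits inside a list, so it is one label in a list of labels.
as_frame = df.loc[:3, [('c', '3')]]    # one-column DataFrame

# Plain [] indexing looks the tuple up as a single key, then .loc slices rows.
as_series = df[('c', '3')].loc[:3]     # Series

print(type(as_frame).__name__, as_frame.shape)
print(type(as_series).__name__, as_series.tolist())
```

Both selections cover the same rows; only the return type differs.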

access by non-tuple column, returns series:

In [514]: df1.loc[:3, 'nontuple']
Out[514]:
0    0.783621
1    1.984459
2   -2.211271
3   -0.532457
Name: nontuple, dtype: float64

access by non-tuple column, returns DF:

In [517]: df1.loc[:3, ['nontuple']]
Out[517]:
   nontuple
0  0.783621
1  1.984459
2 -2.211271
3 -0.532457
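
To make the Series vs. DataFrame distinction explicit, here is a small check (the data and variable names are illustrative assumptions): a scalar column label gives a Series, while a list of labels gives a DataFrame, even when the list has only one entry.

```python
import numpy as np
import pandas as pd

# Illustrative data; any values would do.
df = pd.DataFrame(
    np.arange(24, dtype=float).reshape(6, 4),
    columns=[('a', '1'), ('b', '2'), ('c', '3'), 'nontuple'],
)

one_col_series = df.loc[:3, 'nontuple']     # scalar label -> Series
one_col_frame = df.loc[:3, ['nontuple']]    # list of labels -> DataFrame

print(type(one_col_series).__name__, one_col_series.name)
print(type(one_col_frame).__name__, list(one_col_frame.columns))
```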

access any column by its number, returns series:

In [515]: df1.iloc[:3, 2]
Out[515]:
0    1.433004
1   -0.731705
2   -1.633657
Name: (c, 3), dtype: float64

access any column(s) by number, returns DF:

In [516]: df1.iloc[:3, [2]]
Out[516]:
     (c, 3)
0  1.433004
1 -0.731705
2 -1.633657
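
If you would rather not hard-code the column number, one option is to look it up from the label with Index.get_loc and pass the result to .iloc. This is only a sketch; the data and the names pos and col are my own:

```python
import numpy as np
import pandas as pd

# Illustrative data with the same column labels as in the question.
df = pd.DataFrame(
    np.arange(24, dtype=float).reshape(6, 4),
    columns=[('a', '1'), ('b', '2'), ('c', '3'), 'nontuple'],
)

# Integer position of the tuple-named column (labels are unique here).
pos = df.columns.get_loc(('c', '3'))

# Position-based selection: the first three rows of that column.
col = df.iloc[:3, pos]

print(pos, col.tolist())
```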

NOTE: pay attention to the differences between .loc[] and .iloc[]: they filter rows differently!

.iloc[] works like Python's slicing (the right boundary is excluded):

In [531]: df1.iloc[0:2]
Out[531]:
     (a, 1)    (b, 2)    (c, 3)  nontuple
0  0.650961 -1.130000  1.433004  0.783621
1  0.073805  1.907998 -0.731705  1.984459

.loc[] includes the right index boundary:

In [532]: df1.loc[0:2]
Out[532]:
     (a, 1)    (b, 2)    (c, 3)  nontuple
0  0.650961 -1.130000  1.433004  0.783621
1  0.073805  1.907998 -0.731705  1.984459
2 -1.511939  0.167122 -1.633657 -2.211271
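
The difference is easier to see with string labels, where positions and labels cannot be confused. A small sketch (the frame and its labels are made up for illustration):

```python
import pandas as pd

# Hypothetical frame with string row labels.
df = pd.DataFrame({'x': [10, 20, 30, 40]}, index=['a', 'b', 'c', 'd'])

by_position = df.iloc[0:2]     # positions 0 and 1; the stop position is excluded
by_label = df.loc['a':'c']     # labels 'a' through 'c'; the stop label is included

print(by_position.index.tolist())
print(by_label.index.tolist())
```

With the default integer index the two look alike because labels and positions coincide, which is why df1.loc[0:2] returns one more row than df1.iloc[0:2].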
MaxU - stop WAR against UA answered Nov 08 '22 19:11
