Logic behind Pandas LOC selecting values

Question

I am trying to learn and understand the logic behind pandas loc, and I cannot come up with an explanation for the following issue. If I want to slice a DataFrame with loc, I can do it in the following way:

import numpy as np
import pandas as pd

dates=pd.date_range('20130101',periods=6)
df=pd.DataFrame(np.random.rand(6,4),index=dates,columns=list('ABCD'))

print(df.loc['20130102':'20130104',['A','B']])  # (*)

All fine. What I don't understand is this: if in the second part of loc I can enter a list of column headers ['A','B'], why can't I do the same for the rows? For example, the following does not work:

print(df.loc[['20130102','20130104'],['A','B']])

What is the logic behind this? I am giving loc two row labels and two column headers, but it does not work. For the columns a list is fine; for the rows it is not. Moreover, in the first part of loc (see *) the rows can be selected with ":", i.e. from index1 to index2 as 'index1':'index2', but I cannot do the same in the second part of loc. This does not work:

print(df.loc['20130102':'20130104',['A':'C']])

I would be glad to understand why these two lines of code do not work.

Thanks.

jezrael · Accepted Answer

You need to convert the values in the list to datetimes, because the index is a DatetimeIndex. The values in the list must have the same type as the values in the index/columns of the DataFrame; otherwise the lookup can fail with a KeyError:

print(df.loc[pd.to_datetime(['20130102','20130104']),['A','B']])
                   A         B
2013-01-02  0.719469  0.423106
2013-01-04  0.438572  0.059678
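A minimal sketch of why the conversion helps: the labels stored in a DatetimeIndex are Timestamp objects, so a list converted with pd.to_datetime has the same type as the index. The deterministic np.arange data is an assumption made for reproducibility; the question uses random values.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
# Deterministic values (assumption for illustration; the question uses random data)
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list('ABCD'))

# Each label in a DatetimeIndex is a pandas Timestamp, not a string
first_label = df.index[0]
print(type(first_label).__name__)

# Convert the string labels to the index's own type before list-based selection
wanted = pd.to_datetime(['20130102', '20130104'])
subset = df.loc[wanted, ['A', 'B']]
print(subset)
```

Whether a plain list of date strings is accepted without conversion may depend on the pandas version, so the explicit pd.to_datetime call is the safer spelling.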

Selecting by first and last value of index/columns

When you select a range by its first and last label, converting to datetimes is not necessary, because a DatetimeIndex supports partial string indexing.

To select columns by range, just remove the list brackets []:

print(df.loc['20130102':'20130104','A':'C'])
                   A         B         C
2013-01-02  0.719469  0.423106  0.980764
2013-01-03  0.480932  0.392118  0.343178
2013-01-04  0.438572  0.059678  0.398044
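The reason the brackets must go can be sketched as follows: 'A':'C' written directly inside the subscript is Python slice syntax, while the same text inside a list literal is rejected by the parser before pandas ever sees it. The deterministic frame and the variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
# Deterministic values (assumption for illustration)
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list('ABCD'))

# 'A':'C' inside the subscript is shorthand for a slice object
by_syntax = df.loc['20130102':'20130104', 'A':'C']
by_object = df.loc[slice('20130102', '20130104'), slice('A', 'C')]
same = by_syntax.equals(by_object)

# Label slices include both endpoints, so column C is part of the result
cols = list(by_syntax.columns)

# A slice inside a list literal is a syntax error in Python itself
try:
    compile("df.loc['20130102':'20130104', ['A':'C']]", '<example>', 'eval')
    list_slice_ok = True
except SyntaxError:
    list_slice_ok = False

print(same, cols, list_slice_ok)
```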

The same works when the dates are written with dashes:

print(df.loc['2013-01-02':'2013-01-04','A':'C'])
                   A         B         C
2013-01-02  0.719469  0.423106  0.980764
2013-01-03  0.480932  0.392118  0.343178
2013-01-04  0.438572  0.059678  0.398044
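As a rough check of the string handling, both spellings parse to the same Timestamp, and a less specific string such as a month selects every row that falls inside it. This is a sketch on deterministic data, which is an assumption for illustration.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)
# Deterministic values (assumption for illustration)
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=dates, columns=list('ABCD'))

# Both spellings parse to the same Timestamp
same_label = pd.Timestamp('20130102') == pd.Timestamp('2013-01-02')

# So both slices select the same rows
compact = df.loc['20130102':'20130104', 'A':'C']
dashed = df.loc['2013-01-02':'2013-01-04', 'A':'C']
same_rows = compact.equals(dashed)

# A partial string such as a month matches every row inside that month
january = df.loc['2013-01']
print(same_label, same_rows, len(january))
```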

Combinations:

#select between start/end datetime and only columns A,C
print(df.loc['20130102':'20130104',['A','C']])
                   A         C
2013-01-02  0.719469  0.980764
2013-01-03  0.480932  0.343178
2013-01-04  0.438572  0.398044

#select only 20130102, 20130104 index and columns between A and C
print(df.loc[pd.to_datetime(['20130102','20130104']),'A':'C'])
                   A         B         C
2013-01-02  0.719469  0.423106  0.980764
2013-01-04  0.438572  0.059678  0.398044

JFerro · Answer

This code shows the different ways to select data (slicing) with loc in pandas:

df=pd.DataFrame(np.random.rand(6,4),index=['row1','row2','row3','row4','row5','row6'],columns=list('ABCD'))
             A         B         C         D
row1  0.972614  0.193116  0.448413  0.731300
row2  0.135391  0.783295  0.959058  0.107872
row3  0.966703  0.742793  0.852716  0.710681
row4  0.976819  0.920898  0.665329  0.078999
row5  0.418717  0.122677  0.716004  0.977522
row6  0.101422  0.641862  0.157751  0.888720

Row range, column range:

df.loc['row1':'row3', 'A':'C']

            A         B         C
row1  0.972614  0.193116  0.448413
row2  0.135391  0.783295  0.959058
row3  0.966703  0.742793  0.852716

List of rows, column range:

df.loc[['row1','row3'], 'A':'C']
             A         B         C
row1  0.972614  0.193116  0.448413
row3  0.966703  0.742793  0.852716

Row range, list of columns:

df.loc['row1':'row3', ['A','C']]
            A         C
row1  0.972614  0.448413
row2  0.135391  0.959058
row3  0.966703  0.852716

Single value:

df.loc['row1','A']

0.972614309371533

Conclusion: when selecting a range, do not put it between []; use [] only to pass a list of individual labels.
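For a version that gives the same result on every run, the four combinations can be sketched on deterministic data. The np.arange values and the variable names are assumptions for illustration; the examples above use random numbers.

```python
import numpy as np
import pandas as pd

rows = ['row1', 'row2', 'row3', 'row4', 'row5', 'row6']
# Deterministic values (assumption for illustration)
df = pd.DataFrame(np.arange(24).reshape(6, 4), index=rows, columns=list('ABCD'))

# Range of rows, range of columns: slices, written without []
range_range = df.loc['row1':'row3', 'A':'C']

# List of rows, range of columns
list_range = df.loc[['row1', 'row3'], 'A':'C']

# Range of rows, list of columns
range_list = df.loc['row1':'row3', ['A', 'C']]

# Two scalar labels return a single value
single = df.loc['row1', 'A']

print(range_range.shape, list_range.shape, range_list.shape, single)
```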


Tags: python, pandas
