Consider this DataFrame:
import pandas as pd

df = pd.DataFrame({u'A': {2.0: 2.2, 7.0: 1.4, 8.0: 1.4, 9.0: 2.2},
                   u'B': {2.0: 7.2, 7.0: 6.3, 8.0: 4.4, 9.0: 5.0}})
Which looks like this:
     A    B
2  2.2  7.2
7  1.4  6.3
8  1.4  4.4
9  2.2  5.0
I'd like to get the rows with index labels 2 and 7 (numbers, not strings), but df.loc[[2, 7]] gives an error:
IndexError: indices are out-of-bounds
However, df.loc[7] and df.loc[2] work fine and as expected. Also, if I define the DataFrame index with strings instead of numbers:
df2 = pd.DataFrame({u'A': {'2': 2.2, '7': 1.4, '8': 1.4, '9': 2.2},
                    u'B': {'2': 7.2, '7': 6.3, '8': 4.4, '9': 5.0}})
df2.loc[['2', '8']]
it works fine.
This is not the behavior I expected from df.loc (is it a bug or just a gotcha?). Can I pass an array of numbers as label indices and not just positions? I can convert all indices to strings and then operate with .loc, but it would be very inconvenient for the rest of my code.
Thanks for your time!
pandas provides the DataFrame.loc[] indexer for retrieving rows from a DataFrame. It takes index labels only and returns a row (as a Series) or a DataFrame if the labels exist in the caller's index.
.iloc[] is primarily integer position based (from 0 to length-1 of the axis), but may also be used with a boolean array.
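To make the label/position distinction concrete, here is a minimal sketch using the same values as the question's DataFrame (the variable names are only illustrative). It assumes a reasonably recent pandas, where integer labels match a float index.

```python
import pandas as pd

# Same data as the question: float labels 2.0, 7.0, 8.0, 9.0.
df = pd.DataFrame(
    {"A": [2.2, 1.4, 1.4, 2.2], "B": [7.2, 6.3, 4.4, 5.0]},
    index=[2.0, 7.0, 8.0, 9.0],
)

by_label = df.loc[7]       # the row whose index label is 7
by_position = df.iloc[1]   # the second row, whatever its label is

# .iloc also accepts a boolean array with one entry per row.
mask = [True, False, False, True]
by_mask = df.iloc[mask]    # keeps the first and last rows
```

Here label 7 happens to sit at position 1, so both lookups return the same row; with a different index they would not.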
By using df[], loc[], iloc[] and get() you can select multiple columns from pandas DataFrame.
The pandas loc method enables you to select data from a DataFrame by label. It allows you to "locate" data in a DataFrame. That's where we get the name loc[].
Indexing in pandas means simply selecting particular rows and columns of data from a DataFrame. Indexing could mean selecting all the rows and some of the columns, some of the rows and all of the columns, or some of each of the rows and columns.
Select Rows & Columns by Name or Index in Pandas DataFrame using [ ], loc & iloc. Indexing in Pandas means selecting rows and columns of data from a Dataframe. It can be selecting all the rows and the particular number of columns, a particular number of rows, and all the columns or a particular number of rows and columns each.
pandas provides a suite of methods in order to have purely label based indexing. This is a strict inclusion based protocol. Every label asked for must be in the index, or a KeyError will be raised. When slicing, both the start bound AND the stop bound are included, if present in the index.
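The two rules quoted above (inclusive slice bounds, KeyError for a missing label) can be checked with a short sketch. It reuses the question's values and assumes a recent pandas release; the helper variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {"A": [2.2, 1.4, 1.4, 2.2], "B": [7.2, 6.3, 4.4, 5.0]},
    index=[2.0, 7.0, 8.0, 9.0],
)

# Label slices include BOTH endpoints when they are present in the index.
sliced = df.loc[2:8]   # rows labelled 2, 7 and 8

# A label that is not in the index raises KeyError.
try:
    df.loc[5]
    missing_raised = False
except KeyError:
    missing_raised = True
```

Note that the slice 2:8 is read as labels, not positions, which is why the row labelled 8 is included.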
This is a bug in pandas 0.12; version 0.13 fixes it. In other words, label selection should work when you pass a list, whether the labels are numbers or strings.
You could do this (uses an internal method though):
In [10]: df.iloc[df.index.get_indexer([2,7])]
Out[10]:
     A    B
2  2.2  7.2
7  1.4  6.3
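As a self-contained sketch of the workaround above, the following converts labels to positions with get_indexer and compares the result with plain .loc. It assumes a pandas version that already contains the fix (0.13 or later, per the answer), so both routes should agree. The missing-label check is an addition of this sketch, not part of the original answer: get_indexer returns -1 for a label it cannot find, and .iloc would silently treat -1 as the last row.

```python
import pandas as pd

df = pd.DataFrame({"A": {2.0: 2.2, 7.0: 1.4, 8.0: 1.4, 9.0: 2.2},
                   "B": {2.0: 7.2, 7.0: 6.3, 8.0: 4.4, 9.0: 5.0}})

labels = [2, 7]

# Translate labels into integer positions; -1 marks a label that was not found.
positions = df.index.get_indexer(labels)
if (positions == -1).any():
    raise KeyError("some labels are not in the index")

via_positions = df.iloc[positions]   # the workaround from the answer
via_labels = df.loc[labels]          # the direct route on a fixed pandas
```

On a fixed version there is no reason to prefer the workaround; df.loc[labels] is the simpler spelling.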