Using Python pandas on a big dataset, how can I find the index label of a row based on a value in a column of that same row?
For example, if I have this dataset...
Column
Item 1 0
Item 2 20
Item 3 34
...
Item 1000 12
... and the value 17 appears in one of the 1000 rows (excluding row 0) of the column, how can I find out which Item has the value 17 in its row?
For example, I want to find out which Item x this is and where it is indexed in the dataset, as shown below...
Column
Item x 17
... how can I do that with pandas, using the value 17 as the reference?
To get the integer position of a DataFrame column, use the get_loc() method of the column index: pass the column label to df.columns.get_loc(). For example, in a DataFrame with more than one column, it returns the location of the requested column.
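As a minimal sketch of get_loc(), assuming a small made-up frame (the "Other" column and the three item labels are invented for illustration and are not part of the question's data), the same method can be called on both the column index and the row index:

```python
import pandas as pd

# Hypothetical two-column frame; "Other" is made up for illustration.
df = pd.DataFrame(
    {"Column": [0, 20, 17], "Other": ["a", "b", "c"]},
    index=["Item 1", "Item 2", "Item 3"],
)

# Integer position of a column label within df.columns
col_pos = df.columns.get_loc("Other")

# get_loc() is also available on the row index:
# integer position of a row label within df.index
row_pos = df.index.get_loc("Item 3")

print(col_pos, row_pos)
```

Note that get_loc() looks up a label and returns a position. It does not search the column's values, so it does not answer the question of which row holds 17 by itself.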
Use pandas.DataFrame.loc[] to select rows by index name or label.
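A short sketch of label-based selection with loc[], assuming a made-up three-row frame shaped like the one in the question:

```python
import pandas as pd

# Hypothetical data in the same shape as the question's dataset.
df = pd.DataFrame({"Column": [0, 20, 17]}, index=["Item 1", "Item 2", "Item 3"])

# A single label returns that row as a Series
row = df.loc["Item 3"]

# A row label plus a column label returns a single value
value = df.loc["Item 3", "Column"]

# A list of labels returns a DataFrame with those rows
subset = df.loc[["Item 1", "Item 3"]]

print(value)
print(subset.index.tolist())
```

This goes from label to row, which is the reverse of what the question asks: it is useful once the matching label is already known.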
To turn the index of a pandas DataFrame into a column, use the reset_index() method. It can convert a single index or several index levels into columns. By default, pandas adds an index to every row of a DataFrame.
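One way reset_index() could be applied to the question, sketched under the assumption that the index is given the name "Item" (that name is chosen here for illustration only): once the labels are an ordinary column, they can be filtered like any other column.

```python
import pandas as pd

# Hypothetical data in the same shape as the question's dataset.
df = pd.DataFrame({"Column": [0, 20, 17]}, index=["Item 1", "Item 2", "Item 3"])
df.index.name = "Item"  # assumed name, so the new column gets a readable label

# Move the index labels into a regular column; rows get a default 0..n-1 index
flat = df.reset_index()

# Filter on the value and read the labels back from the new column
matches = flat.loc[flat["Column"] == 17, "Item"].tolist()

print(flat.columns.tolist())
print(matches)
```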
Use boolean indexing:
df.index[df.Column == 17]
If you need to exclude row 0:
df1 = df.iloc[1:]
df1.index[df1.Column == 17]
Sample:
import pandas as pd
df = pd.DataFrame({'Column': {'Item 1': 0, 'Item 2': 20, 'Item 3': 34, 'Item 5': 12, 'Item 7': 17}})
print (df)
        Column
Item 1       0
Item 2      20
Item 3      34
Item 5      12
Item 7      17
print (df.index[df.Column == 17])
Index(['Item 7'], dtype='object')
print (df.index[df.Column == 17].tolist())
['Item 7']
df1 = df.iloc[1:]
print (df1)
        Column
Item 2      20
Item 3      34
Item 5      12
Item 7      17
print (df1.index[df1.Column == 17].tolist())
['Item 7']
Use query:
df.query('Column == 17')
Use index.tolist() to get the list of items:
df.query('Column == 17').index.tolist()
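A runnable sketch of the query approach, assuming the same sample data as the boolean-indexing answer; the variable name target is hypothetical, and the @ prefix is how query() refers to a local Python variable:

```python
import pandas as pd

# Same sample values as in the answer above.
df = pd.DataFrame(
    {"Column": [0, 20, 34, 12, 17]},
    index=["Item 1", "Item 2", "Item 3", "Item 5", "Item 7"],
)

target = 17  # hypothetical variable holding the value to look up

# "@target" refers to the local variable inside the query string
labels = df.query("Column == @target").index.tolist()

# A value that is not present yields an empty list rather than an error
missing = df.query("Column == 99").index.tolist()

print(labels)
print(missing)
```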