<p>I have a pandas series with boolean entries. I would like to get a list of indices where the values are <code>True</code>.</p>
<p>For example, the input <code>pd.Series([True, False, True, True, False, False, False, True])</code> should yield the output <code>[0, 2, 3, 7]</code>.</p>
<p>I can do it with a list comprehension, but is there something cleaner or faster?</p>


Getting a list of indices where pandas boolean series is True

2 Answers

Using `Boolean Indexing`

    >>> s = pd.Series([True, False, True, True, False, False, False, True])
    >>> s[s].index
    Int64Index([0, 2, 3, 7], dtype='int64')

If you need a `np.array` object, use `.values`:

    >>> s[s].index.values
    array([0, 2, 3, 7])
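Since the question asks for a plain list, a minimal sketch of the conversion may help. It also shows that boolean indexing returns index labels, which coincide with positions only for the default integer index. The string-labelled series `t` is a hypothetical example, not part of the original answer.

```python
import pandas as pd

s = pd.Series([True, False, True, True, False, False, False, True])

# Index.tolist() converts the selected labels to a plain Python list.
true_labels = s[s].index.tolist()
print(true_labels)  # [0, 2, 3, 7]

# Hypothetical series with string labels: the result holds labels, not positions.
t = pd.Series([True, False, True], index=["a", "b", "c"])
string_labels = t[t].index.tolist()
print(string_labels)  # ['a', 'c']
```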

Using `np.nonzero`

    >>> np.nonzero(s)
    (array([0, 2, 3, 7]),)

Using `np.flatnonzero`

    >>> np.flatnonzero(s)
    array([0, 2, 3, 7])

Using `np.where`

    >>> np.where(s)[0]
    array([0, 2, 3, 7])

Using `np.argwhere`

    >>> np.argwhere(s).ravel()
    array([0, 2, 3, 7])
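The NumPy functions above return integer positions rather than index labels. With the default index the two are identical, but with any other index they differ. A small sketch of mapping positions back to labels, assuming a hypothetical series `t` with non-default labels:

```python
import numpy as np
import pandas as pd

# Hypothetical series with non-default labels, used only for illustration.
t = pd.Series([True, False, True, True], index=[10, 20, 30, 40])

# NumPy sees only the underlying boolean array, so it reports positions.
positions = np.flatnonzero(t.to_numpy())
print(positions.tolist())  # [0, 2, 3]

# Indexing the Index with an integer array is positional, giving the labels.
labels = t.index[positions]
print(labels.tolist())  # [10, 30, 40]
```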

Using `pd.Series.index`

    >>> s.index[s]
    Int64Index([0, 2, 3, 7], dtype='int64')

Using python's built-in `filter`

    >>> [*filter(s.get, s.index)]
    [0, 2, 3, 7]

Using `list comprehension`

    >>> [i for i in s.index if s[i]]
    [0, 2, 3, 7]
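If a pure-Python approach is preferred, one possible variant iterates over `Series.items()`, which yields `(label, value)` pairs and so avoids a separate `s[i]` lookup for every element. This is a sketch of an alternative, not something from the original answer.

```python
import pandas as pd

s = pd.Series([True, False, True, True, False, False, False, True])

# items() yields (label, value) pairs, so each value is read only once.
true_labels = [label for label, value in s.items() if value]
print(true_labels)  # [0, 2, 3, 7]
```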


answered Sep 24 '22 20:09

rafaelc

As an addition to rafaelc's answer, here are the corresponding timings (from fastest to slowest) for the following setup:

    import numpy as np
    import pandas as pd
    s = pd.Series([x > 0.5 for x in np.random.random(size=1000)])
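The timings in this answer use IPython's `timeit` magic. For anyone wanting to repeat the comparison in a plain script, a rough sketch using the standard-library `timeit` module could look like the following. The fixed seed, the repeat counts, and the selection of four candidates are assumptions made here for illustration, and absolute numbers will vary with machine and library versions.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed (an assumption) so runs are repeatable
s = pd.Series(rng.random(1000) > 0.5)

# A subset of the expressions timed in this answer.
candidates = {
    "np.where(s)[0]": lambda: np.where(s)[0],
    "np.flatnonzero(s)": lambda: np.flatnonzero(s),
    "s.index[s]": lambda: s.index[s],
    "s[s].index": lambda: s[s].index,
}

results = {}
for name, func in candidates.items():
    # Best of 3 repeats of 200 calls, stored as seconds per call.
    results[name] = min(timeit.repeat(func, number=200, repeat=3)) / 200

# Print from fastest to slowest.
for name, seconds in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name:>18}: {seconds * 1e6:9.1f} microseconds per call")
```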

Using `np.where`

    >>> timeit np.where(s)[0]
    12.7 µs ± 77.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Using `np.flatnonzero`

    >>> timeit np.flatnonzero(s)
    18 µs ± 508 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

Using `pd.Series.index`

The gap between this and boolean indexing really surprised me, since boolean indexing is the more commonly used form.

    >>> timeit s.index[s]
    82.2 µs ± 38.9 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Using `Boolean Indexing`

    >>> timeit s[s].index
    1.75 ms ± 2.16 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

If you need a np.array object, get the .values

    >>> timeit s[s].index.values
    1.76 ms ± 3.1 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

If you need a slightly easier to read version <-- not in original answer

    >>> timeit s[s==True].index
    1.89 ms ± 3.52 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Using `pd.Series.where` <-- not in original answer

    >>> timeit s.where(s).dropna().index
    2.22 ms ± 3.32 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    >>> timeit s.where(s == True).dropna().index
    2.37 ms ± 2.19 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Using `pd.Series.mask` <-- not in original answer

    >>> timeit s.mask(s).dropna().index
    2.29 ms ± 1.43 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
    >>> timeit s.mask(s == True).dropna().index
    2.44 ms ± 5.82 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
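One caveat is worth checking before relying on the `mask` variant: `Series.mask` replaces the entries where the condition is True, whereas `Series.where` keeps them. As far as I can tell, `s.mask(s).dropna().index` therefore returns the labels of the False entries, and the condition has to be inverted to get the True ones. The short sketch below compares the three forms on the example series from the question:

```python
import pandas as pd

s = pd.Series([True, False, True, True, False, False, False, True])

# where() keeps entries where the condition holds, so True labels survive dropna().
via_where = s.where(s).dropna().index.tolist()
print(via_where)  # [0, 2, 3, 7]

# mask() blanks entries where the condition holds, so False labels survive.
via_mask = s.mask(s).dropna().index.tolist()
print(via_mask)  # [1, 4, 5, 6]

# Inverting the condition makes mask() select the True labels instead.
via_inverted_mask = s.mask(~s).dropna().index.tolist()
print(via_inverted_mask)  # [0, 2, 3, 7]
```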

Using `list comprehension`

    >>> timeit [i for i in s.index if s[i]]
    13.7 ms ± 40.5 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Using python's built-in `filter`

    >>> timeit [*filter(s.get, s.index)]
    14.2 ms ± 28.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Using `np.nonzero` <-- did not work out of the box for me

    >>> timeit np.nonzero(s)
    ValueError: Length of passed values is 1, index implies 1000.

Using `np.argwhere` <-- did not work out of the box for me

    >>> timeit np.argwhere(s).ravel()
    ValueError: Length of passed values is 1, index implies 1000.
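A possible workaround for the two failing calls, assuming the error comes from NumPy trying to wrap its result back into a Series, is to pass the underlying array via `Series.to_numpy()`. This sketch was not timed in the answer above, so no timings are claimed for it.

```python
import numpy as np
import pandas as pd

s = pd.Series([True, False, True, True, False, False, False, True])
values = s.to_numpy()  # plain boolean ndarray, no pandas wrapper involved

# np.nonzero returns a tuple with one array per dimension; take the first.
nonzero_positions = np.nonzero(values)[0]
print(nonzero_positions.tolist())  # [0, 2, 3, 7]

# np.argwhere returns an (n, 1) array for 1-D input; ravel() flattens it.
argwhere_positions = np.argwhere(values).ravel()
print(argwhere_positions.tolist())  # [0, 2, 3, 7]
```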

answered Sep 24 '22 20:09

Christian Steinmeyer


Tags: python, pandas, boolean, series, boolean-indexing

James McKeown
