I have a pandas DataFrame with a column 'htgt'. Each entry in this column is an array of numbers, and the array length is not constant. An example of the data:
11                  [16, 69]
12                  [61, 79]
13                  [10, 69]
14                      [81]
15          [12, 30, 45, 68]
16                  [10, 76]
17                   [9, 39]
18              [67, 69, 77]
How can I filter for all the rows that contain a given number, for example 10?
You could do this by first creating a boolean mask with a list comprehension:
mask = [(10 in x) for x in df['htgt']]
df[mask]
Or in one line, if you prefer:
df.loc[[(10 in x) for x in df['htgt']]]
[output]
htgt
13  [10, 69]
16  [10, 76]
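For anyone who wants to run the approach above end to end, here is a minimal sketch. It assumes the sample data from the question (index labels 11 to 18), and `rows_containing` is a hypothetical helper name used only for illustration, not part of pandas.

```python
import pandas as pd

# Sample data rebuilt from the question; index labels 11..18 are copied from it.
df = pd.DataFrame(
    {"htgt": [[16, 69], [61, 79], [10, 69], [81],
              [12, 30, 45, 68], [10, 76], [9, 39], [67, 69, 77]]},
    index=range(11, 19),
)

def rows_containing(frame, column, value):
    """Hypothetical helper: keep rows whose list in `column` contains `value`."""
    # Series.map applies the membership test to each list and returns a boolean Series.
    mask = frame[column].map(lambda items: value in items)
    return frame[mask]

result = rows_containing(df, "htgt", 10)
print(result)  # rows with index labels 13 and 16
```

The membership test still runs once per row in Python, so this is equivalent in cost to the list comprehension; the helper only makes the column and value reusable.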
Don't store lists in pandas columns: it's inefficient, and it makes your data harder to work with. Instead, expand the lists into columns:
out = pd.DataFrame(df.htgt.values.tolist())
    0     1     2     3
0  16  69.0   NaN   NaN
1  61  79.0   NaN   NaN
2  10  69.0   NaN   NaN
3  81   NaN   NaN   NaN
4  12  30.0  45.0  68.0
5  10  76.0   NaN   NaN
6   9  39.0   NaN   NaN
7  67  69.0  77.0   NaN
Now you can use efficient pandas operations to find rows with 10:
out.loc[out.eq(10).any(axis=1)]
    0     1   2   3
2  10  69.0 NaN NaN
5  10  76.0 NaN NaN
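Note that the expanded frame is re-indexed from 0, so the matching rows show up as 2 and 5 rather than 13 and 16. A minimal sketch, assuming the sample data from the question, passes the original index through so the mask lines up with the original frame. The names `wide` and `has_ten` are illustrative only.

```python
import pandas as pd

# Sample data rebuilt from the question; index labels 11..18 are copied from it.
df = pd.DataFrame(
    {"htgt": [[16, 69], [61, 79], [10, 69], [81],
              [12, 30, 45, 68], [10, 76], [9, 39], [67, 69, 77]]},
    index=range(11, 19),
)

# Expand the lists into columns, reusing the original index labels
# so the wide frame stays aligned with df. Short lists are padded with NaN.
wide = pd.DataFrame(df["htgt"].tolist(), index=df.index)

# Boolean Series: True where any column in the row equals 10.
has_ten = wide.eq(10).any(axis=1)

# Select from the original frame; the lists keep their integer values.
filtered = df[has_ten]
print(filtered)
```

Because `has_ten` carries the original labels, it can index either `wide` or `df`.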
If you insist on the result being in list form, you can use stack and agg:
out.loc[out.eq(10).any(axis=1)].stack().groupby(level=0).agg(list)
2    [10.0, 69.0]
5    [10.0, 76.0]
dtype: object