Pandas: Select rows whose dictionary contains a specific key

I have a DataFrame in which every value of one column is a dictionary. I want to select the rows whose dictionary contains a given key.

>>> df = pd.DataFrame({"A": [1,2,3], "B": [{"a":1}, {"b":2}, {"c":3}]})
>>> df
   A         B
0  1  {'a': 1}
1  2  {'b': 2}
2  3  {'c': 3}
>>> df['b' in df['B']]  
# the desired result is the row with index 1. But this causes an error: KeyError: False
Munichong asked Jan 28 '23 14:01


1 Answer

Here is one way:

import pandas as pd

df = pd.DataFrame({"A": [1,2,3], "B": [{"a":1}, {"b":2}, {"c":3}]})

df = df[df['B'].map(lambda x: 'b' in x)]

#    A         B
# 1  2  {'b': 2}
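
For context, the attempt in the question fails because `in` applied to a Series tests membership in its index labels, not in its values. So `'b' in df['B']` evaluates to the single value `False`, and `df[False]` is then treated as a column lookup. A minimal sketch of that behaviour, using the same frame as above (the variable names `key_in_index`, `label_in_index` and `raised` are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [{"a": 1}, {"b": 2}, {"c": 3}]})

# `in` on a Series checks the index labels (0, 1, 2), not the stored dicts.
key_in_index = 'b' in df['B']    # 'b' is not an index label
label_in_index = 1 in df['B']    # 1 is an index label

# df[False] is looked up as a column name, which does not exist.
try:
    df[key_in_index]
    raised = False
except KeyError:
    raised = True
```

Here `key_in_index` is `False` and `label_in_index` is `True`, which is why the original expression never inspects the dictionaries at all.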

Explanation

  • pd.Series.map accepts a function (here an anonymous lambda) and applies it to each element of the series.
  • The lambda takes each dictionary in column B and checks whether the key 'b' is in it; the results are collected into a Boolean series.
  • Boolean indexing, df[bool_series], then keeps only the rows where that series is True.
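
The steps above can be sketched separately as follows. The second half is an assumption that goes beyond the original answer: `has_key` is a hypothetical helper for the case where column B may hold non-dict values such as None, which would otherwise make `'b' in x` raise a TypeError.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [{"a": 1}, {"b": 2}, {"c": 3}]})

# Step 1: apply the membership test to each dict, giving a Boolean Series.
mask = df['B'].map(lambda d: 'b' in d)

# Step 2: Boolean indexing keeps only the rows where the mask is True.
result = df[mask]

# Hypothetical helper (not part of the original answer):
# treat any cell that is not a dict as "key absent" instead of failing.
def has_key(cell, key):
    return isinstance(cell, dict) and key in cell

df2 = pd.DataFrame({"A": [1, 2], "B": [{"b": 2}, None]})
safe = df2[df2['B'].map(lambda cell: has_key(cell, 'b'))]
```

Keeping the mask in its own variable makes it easy to inspect or to combine with other conditions using `&` and `|` before indexing.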
jpp answered Jan 31 '23 07:01