Check if string is in another column pandas

Below is my DataFrame:

import pandas as pd

df = pd.DataFrame({'col1': ['[7]', '[30]', '[0]', '[7]'],
                   'col2': ['[0%, 7%]', '[30%]', '[30%, 7%]', '[7%]']})

col1    col2    
[7]     [0%, 7%]
[30]    [30%]
[0]     [30%, 7%]
[7]     [7%]

The aim is to check whether the col1 value is contained in col2. Below is what I've tried:

df['test'] = df.apply(lambda x: str(x.col1) in str(x.col2), axis=1)

Below is the expected output

col1    col2       col3
[7]     [0%, 7%]   True
[30]    [30%]      True
[0]     [30%, 7%]  False
[7]     [7%]       True
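
For reference, a minimal pure-Python sketch of the same substring test on the sample rows, to show why the attempt falls short of the expected output. The `rows` list and the `naive` name are illustrative and not part of the original code; the check itself is the one used in the lambda above.

```python
# Sample rows from the question, as plain (col1, col2) string pairs.
rows = [("[7]", "[0%, 7%]"), ("[30]", "[30%]"), ("[0]", "[30%, 7%]"), ("[7]", "[7%]")]

# Same test as the lambda: is the whole col1 string a substring of col2?
naive = [col1 in col2 for col1, col2 in rows]
print(naive)  # [False, False, False, False]
```

The brackets are part of the col1 string, so "[7]" never appears verbatim inside "[0%, 7%]" or "[7%]". The values have to be compared without the brackets and percent signs.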


You can also replace the square brackets with word boundaries (\b) and use re.search, as in:

import re
#...
df.apply(lambda x: bool(re.search(x['col1'].replace("[",r"\b").replace("]",r"\b"), x['col2'])), axis=1)
# => 0     True
#    1     True
#    2    False
#    3     True
#    dtype: bool

This works because \b7\b finds a match in [0%, 7%]: the 7 is neither preceded nor followed by a letter, digit, or underscore. No match is found in [30%, 7%], because \b0\b does not match a zero that directly follows a digit (here, 3).
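
The word-boundary behaviour can be checked on single strings, without pandas. This is only a sketch: the helper name `to_pattern` is made up for illustration, and the "[70%]" case is an extra input that is not in the question's data.

```python
import re

def to_pattern(cell):
    # Hypothetical helper: "[7]" -> r"\b7\b", the same replacement as in the answer.
    return cell.replace("[", r"\b").replace("]", r"\b")

pattern_7 = to_pattern("[7]")
pattern_0 = to_pattern("[0]")

match_7 = bool(re.search(pattern_7, "[0%, 7%]"))     # 7 stands alone
match_0 = bool(re.search(pattern_0, "[30%, 7%]"))    # 0 only occurs inside 30
match_7_in_70 = bool(re.search(pattern_7, "[70%]"))  # 7 is followed by a digit

print(match_7, match_0, match_7_in_70)  # True False False
```

If col1 could ever contain regex metacharacters, wrapping the inner value in re.escape before building the pattern would be a reasonable precaution.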

You can extract the numbers from both columns and join them, then check whether there is at least one match per row index using eval + groupby + any:

(df['col2'].str.extractall(r'(?P<col2>\d+)').droplevel(1)  # one row per number found in col2
   .join(df['col1'].str[1:-1])                              # col1 without its brackets
   .eval('col2 == col1')
   .groupby(level=0).any()
)

output:

0     True
1     True
2    False
3     True
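
To see what the chain does, here is a step-by-step sketch on the same data. The intermediate names (`long`, `hits`, `result`) are illustrative, and the comparison is written with `==` directly rather than through eval. It should give the same result, though that equivalence is an assumption of this sketch.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['[7]', '[30]', '[0]', '[7]'],
                   'col2': ['[0%, 7%]', '[30%]', '[30%, 7%]', '[7%]']})

# One row per number found in col2; the original row label is kept as the index.
long = df['col2'].str.extractall(r'(?P<col2>\d+)').droplevel(1)

# Attach the bracket-free col1 value to each extracted number (joined on the index).
long = long.join(df['col1'].str[1:-1])

# Compare number by number, then collapse back to one boolean per original row.
hits = long['col2'] == long['col1']
result = hits.groupby(level=0).any()
print(result.tolist())  # [True, True, False, True]
```

Both columns stay strings throughout, so "7" matches "7" but would not match "07".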

One approach:

import ast

# convert to integer list
col2_lst = df["col2"].str.replace("%", "").apply(ast.literal_eval)

# check list containment
col1_lst = df["col1"].apply(ast.literal_eval)
df["col3"] = [all(bi in a for bi in b) for a, b in zip(col2_lst, col1_lst)]

print(df)

Output

   col1       col2   col3
0   [7]   [0%, 7%]   True
1  [30]      [30%]   True
2   [0]  [30%, 7%]  False
3   [7]       [7%]   True
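
What happens to a single row can be sketched in plain Python. The variable names are illustrative; the row used is the third one from the question.

```python
import ast

col1_cell, col2_cell = "[0]", "[30%, 7%]"

# Dropping "%" leaves a valid Python list literal that literal_eval can parse.
wanted = ast.literal_eval(col1_cell)                       # [0]
available = ast.literal_eval(col2_cell.replace("%", ""))   # [30, 7]

# Every value in col1 must be present in the parsed col2 list.
contained = all(value in available for value in wanted)
print(wanted, available, contained)  # [0] [30, 7] False
```

Because the values are parsed as integers, 0 is compared against 30 and 7 as whole numbers, which avoids the partial-match problem of a substring test.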

Use Series.str.extractall to get the numbers and reshape them with Series.unstack so that each row holds its own matches. You can then compare with DataFrame.isin, which aligns a Series argument on the index, and reduce with DataFrame.any:

df['test'] = (df['col2'].str.extractall(r'(\d+)')[0].unstack()
                        .isin(df['col1'].str.strip('[]'))
                        .any(axis=1))
print(df)
   col1       col2   test
0   [7]   [0%, 7%]   True
1  [30]      [30%]   True
2   [0]  [30%, 7%]  False
3   [7]       [7%]   True
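
A sketch of the intermediate steps, for readers who want to see the reshaped frame. The names `wide`, `targets` and `result` are illustrative and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'col1': ['[7]', '[30]', '[0]', '[7]'],
                   'col2': ['[0%, 7%]', '[30%]', '[30%, 7%]', '[7%]']})

# One column per match position; rows with fewer numbers are padded with NaN.
wide = df['col2'].str.extractall(r'(\d+)')[0].unstack()

# col1 without brackets, indexed like df.
targets = df['col1'].str.strip('[]')

# With a Series argument, isin compares each row only with the value at the same index label.
result = wide.isin(targets).any(axis=1)
print(result.tolist())  # [True, True, False, True]
```

Row 2 is False even though "7" occurs in other rows of col1, because the comparison is aligned row by row rather than done against the whole column.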

Tags:

python

pandas

A2N15

4 Answers

Wiktor Stribiżew

mozway

Dani Mesejo

jezrael
