Check if Pandas column contains value from another column

If df['col'] = 'a', 'b', 'c' and df2['col'] = 'a123', 'b456', 'd789', how do I create df2['is_contained'] = 'a', 'b', 'no_match'? That is, if a value from df['col'] is found within a value from df2['col'], the df['col'] value is returned, and if no match is found, 'no_match' is returned. I don't expect there to be multiple matches, but in the unlikely case there are, I'd want to return a string like 'Multiple Matches'.

With this toy data set, we want to add a new column to df2 that contains no_match for the first three rows and the value 'd' for the last row, because that row's col value (the letter 'a') appears in df1 and 'd' is the df1 value at the same row position.

import numpy as np
import pandas as pd

df1 = pd.DataFrame({'col': ['a', 'b', 'c', 'd']})
df2 = pd.DataFrame({'col': ['a123','b456','d789', 'a']})

In other words, values from df1 should be used to populate this new column in df2 only when a row's df2['col'] value appears somewhere in df1['col'].

In [2]: df1
Out[2]:
  col
0   a
1   b
2   c
3   d

In [3]: df2
Out[3]:
    col
0  a123
1  b456
2  d789
3     a

If this is the right way to understand your question, then you can do this with pandas isin:

In [4]: df2.col.isin(df1.col)
Out[4]:
0    False
1    False
2    False
3     True
Name: col, dtype: bool

This evaluates to True only when a value in df2.col is also in df1.col.
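Note that isin compares whole values, so 'a123' is not treated as containing 'a'. If the question is really about substrings, one possible approach is str.contains with a pattern built from df1. This is only a sketch on the same toy data; the names exact, pattern and partial are mine, not part of the original answer.

```python
import re
import pandas as pd

df1 = pd.DataFrame({'col': ['a', 'b', 'c', 'd']})
df2 = pd.DataFrame({'col': ['a123', 'b456', 'd789', 'a']})

# isin tests whole-value membership, so only the exact value 'a' matches.
exact = df2['col'].isin(df1['col'])

# Build an alternation pattern from df1's values, escaping regex metacharacters.
pattern = '|'.join(re.escape(v) for v in df1['col'])

# str.contains tests whether any df1 value occurs inside each df2 string.
partial = df2['col'].str.contains(pattern, regex=True)

print(exact.tolist())
print(partial.tolist())
```

The substring test only says whether some df1 value occurs in each string, not which one.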

Then you can use np.where, which works much like ifelse in R, if you are familiar with R.

In [5]: np.where(df2.col.isin(df1.col), df1.col, 'NO_MATCH')
Out[5]: array(['NO_MATCH', 'NO_MATCH', 'NO_MATCH', 'd'], dtype=object)

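For rows where a df2.col value appears in df1.col, the value from df1.col at the same row index will be returned. Note that this is a positional lookup, not the value that matched: here row 3 of df2 holds 'a', but row 3 of df1 holds 'd', so 'd' is returned. In cases where the df2.col value is not a member of df1.col, the default 'NO_MATCH' value will be used.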
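If you would rather get back the value that actually matched, one alternative (my suggestion, not part of the approach above) is Series.where, which keeps the original df2 value wherever the mask is True and substitutes a fallback elsewhere. A minimal sketch on the same toy data, with mask and matched as illustrative names:

```python
import pandas as pd

df1 = pd.DataFrame({'col': ['a', 'b', 'c', 'd']})
df2 = pd.DataFrame({'col': ['a123', 'b456', 'd789', 'a']})

# True where the df2 value is an exact member of df1['col'].
mask = df2['col'].isin(df1['col'])

# Keep the df2 value where mask is True; use the fallback string elsewhere.
matched = df2['col'].where(mask, 'NO_MATCH')

print(matched.tolist())
```

Because the result is taken from df2 itself, it does not depend on df1 and df2 having the same length or row order.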

You must first guarantee that the indexes match. To simplify, I'll show it as if the columns were in the same dataframe. The trick is to use the apply method with axis=1, so the function is called once per row:

df = pd.DataFrame({'col1': ['a', 'b', 'c', 'd'],
                   'col2': ['a123','b456','d789', 'a']})
df['contained'] = df.apply(lambda x: x.col1 in x.col2, axis=1)
df
  col1  col2  contained
0    a  a123       True
1    b  b456       True
2    c  d789      False
3    d     a      False
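The row-wise check above only compares values that share an index. If the two frames are not aligned, a sketch along the following lines may be closer to what the question asks for: each df2 value is checked against every df['col'] value, and the result is the matched value, 'no_match', or 'Multiple Matches'. The helper find_contained and the extra 'abc' row are hypothetical, added only to exercise the multiple-match case.

```python
import pandas as pd

df = pd.DataFrame({'col': ['a', 'b', 'c']})
# 'abc' is an extra, made-up row that contains more than one candidate.
df2 = pd.DataFrame({'col': ['a123', 'b456', 'd789', 'abc']})


def find_contained(text, candidates):
    """Return the single candidate found inside text, or a marker string."""
    hits = [c for c in candidates if c in text]
    if not hits:
        return 'no_match'
    if len(hits) > 1:
        return 'Multiple Matches'
    return hits[0]


# Drop duplicates so a repeated candidate is not counted as multiple matches.
candidates = df['col'].drop_duplicates().tolist()

df2['is_contained'] = df2['col'].apply(lambda s: find_contained(s, candidates))

print(df2)
```

This compares every df2 value with every candidate, so it is fine for small frames but may be slow when both columns are large.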

Tags: python, pandas

ChrisArmstrong

2 Answers

hernamesbarbara

neves
