Comparing common strings in two pandas dataframe columns

Tags: python, pandas

I have a pandas data frame as follows:

coname1        coname2
Apple          [Microsoft, Apple, Google]
Yahoo          [American Express, Jet Blue]
Gap Inc        [American Eagle, Walmart, Gap Inc]

I want to create a new column that flags whether the string in coname1 is contained in the list in coname2. From the example above, the dataframe would become:

coname1        coname2                               isin
Apple          [Microsoft, Apple, Google]            True
Yahoo          [American Express, Jet Blue]          False
Gap Inc        [American Eagle, Walmart, Gap Inc]    True

Christine Tan asked May 15 '15 01:05


1 Answer

Set up the frame:

import pandas as pd

df = pd.DataFrame({'coname1': ['Apple', 'Yahoo', 'Gap Inc'],
                   'coname2': [['Microsoft', 'Apple', 'Google'],
                               ['American Express', 'Jet Blue'],
                               ['American Eagle', 'Walmart', 'Gap Inc']]})

Then try this, which tests row by row whether coname1 is an element of the list in coname2:

df['isin'] = df.apply(lambda row: row['coname1'] in row['coname2'], axis=1)
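
For reference, the same row-wise membership test can also be written as a list comprehension over the two columns. This is only a sketch of an alternative, not part of the original answer, and it assumes each coname2 cell holds a real Python list as in the frame above:

```python
import pandas as pd

df = pd.DataFrame({
    "coname1": ["Apple", "Yahoo", "Gap Inc"],
    "coname2": [
        ["Microsoft", "Apple", "Google"],
        ["American Express", "Jet Blue"],
        ["American Eagle", "Walmart", "Gap Inc"],
    ],
})

# Pair each name with the list from the same row and test membership.
df["isin"] = [name in names for name, names in zip(df["coname1"], df["coname2"])]

print(df["isin"].tolist())  # [True, False, True]
```

Either form relies on Python's `in` operator. If the coname2 cells were plain strings rather than lists, `in` would do substring matching instead of element matching, so it is worth checking the cell type first.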

JAB answered Sep 30 '22 02:09