Compare multiple column values together using pandas

Tags: python, pandas, dataframe, excel

I know I can do the following when checking only two columns against each other.

df['flag'] = df['a_id'].isin(df['b_id'])

where df is a DataFrame, and a_id and b_id are two of its columns. It returns True or False for each row, depending on whether the a_id value appears anywhere in b_id. But I need to compare multiple columns together.
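For illustration, a minimal sketch of that two-column check. The values are made up for this example and are not taken from the post.

```python
import pandas as pd

# Hypothetical two-column frame, only to illustrate the membership check.
df = pd.DataFrame({"a_id": [2, 3, 4], "b_id": [1, 5, 2]})

# True where the a_id value appears in any row of b_id (not only the same row).
df["flag"] = df["a_id"].isin(df["b_id"])
print(df["flag"].tolist())  # [True, False, False]
```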

For example, suppose there are a_id, a_region, a_ip, b_id, b_region and b_ip columns. I want to compare them like this:

a_key = df['a_id'] + df['a_region'] + df['a_ip']
b_key = df['b_id'] + df['b_region'] + df['b_ip']

df['flag'] = a_key.isin(b_key)

Somehow the code above always returns False. The expected output is:

The flag for the first row should be True because there is a match: its a_key becomes 2a10, which matches the b_key of the last row (2a10).

asked Apr 28 '19 07:04

Sakeer

2 Answers

You were going in the right direction; just cast the non-string columns to str before concatenating:

a_key = df['a_id'].astype(str) + df['a_region'] + df['a_ip'].astype(str)
b_key = df['b_id'].astype(str) + df['b_region'] + df['b_ip'].astype(str)

a_key.isin(b_key)

This gives me the following result:

0     True
1    False
2    False
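For a self-contained check, here is a minimal sketch on made-up data. The original data was only posted as an image, so the values below are assumptions chosen to reproduce the 2a10 match. It also shows a tuple-based variant, which is not part of the answer above but avoids accidental collisions between concatenated strings.

```python
import pandas as pd

# Hypothetical sample data; the original post showed its data only as an image.
df = pd.DataFrame({
    "a_id": [2, 3, 4],
    "a_region": ["a", "b", "c"],
    "a_ip": [10, 20, 30],
    "b_id": [1, 5, 2],
    "b_region": ["x", "y", "a"],
    "b_ip": [5, 7, 10],
})

# Cast the numeric columns to str so that + concatenates instead of failing.
a_key = df["a_id"].astype(str) + df["a_region"] + df["a_ip"].astype(str)
b_key = df["b_id"].astype(str) + df["b_region"] + df["b_ip"].astype(str)

df["flag"] = a_key.isin(b_key)
print(df["flag"].tolist())  # [True, False, False]

# Variant: compare tuples instead of concatenated strings, so that
# e.g. (1, "2a", 3) and (12, "a", 3) are not both turned into "12a3".
b_tuples = set(zip(df["b_id"], df["b_region"], df["b_ip"]))
df["flag_tuple"] = [
    key in b_tuples
    for key in zip(df["a_id"], df["a_region"], df["a_ip"])
]
print(df["flag_tuple"].tolist())  # [True, False, False]
```

The string key is fine when the parts cannot run into each other; the tuple variant is the safer choice when they can.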

answered Oct 30 '22 05:10

hacker315

You can use isin with a DataFrame as the value, but as per the docs:

If values is a DataFrame, then both the index and column labels must match

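So this should work when the rows to be matched share the same index label (isin on a DataFrame compares cell by cell, so a match sitting in a different row is not detected):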

# Removing the prefixes from column names
df_a = df[['a_id', 'a_region', 'a_ip']].rename(columns=lambda x: x[2:])
df_b = df[['b_id', 'b_region', 'b_ip']].rename(columns=lambda x: x[2:])

# Find rows where all values are in the other
matched = df_a.isin(df_b).all(axis=1)

# Get actual rows with boolean indexing
df_a.loc[matched]

# ... or add boolean flag to dataframe
df['flag'] = matched
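To make the index-alignment point concrete, here is a hedged sketch on made-up data (the values are assumptions, chosen so that the matching b row is the last one). It first runs the aligned isin from above, then shows one possible alternative, a left merge with indicator=True, for the case where a match may sit in any row. The merge variant is a swapped-in technique, not what this answer originally proposed.

```python
import pandas as pd

# Hypothetical sample data; the matching b row is the last one, not row 0.
df = pd.DataFrame({
    "a_id": [2, 3, 4],
    "a_region": ["a", "b", "c"],
    "a_ip": [10, 20, 30],
    "b_id": [1, 5, 2],
    "b_region": ["x", "y", "a"],
    "b_ip": [5, 7, 10],
})

a_cols = ["a_id", "a_region", "a_ip"]
b_cols = ["b_id", "b_region", "b_ip"]

# DataFrame.isin(DataFrame) compares cells at the same index and column label.
df_a = df[a_cols].rename(columns=lambda x: x[2:])
df_b = df[b_cols].rename(columns=lambda x: x[2:])
aligned = df_a.isin(df_b).all(axis=1)
print(aligned.tolist())  # [False, False, False]

# Left merge on all three columns; dropping duplicate b rows keeps exactly
# one output row per a row, in the original order.
merged = df[a_cols].merge(
    df[b_cols].drop_duplicates(),
    left_on=a_cols,
    right_on=b_cols,
    how="left",
    indicator=True,
)
df["flag"] = (merged["_merge"] == "both").to_numpy()
print(df["flag"].tolist())  # [True, False, False]
```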

answered Oct 30 '22 06:10

somiandras
