 

How to get the intersection of two dataframes?

I have two dataframes of similar format:

from pandas import DataFrame

df1 = DataFrame({'a':[0,1,2,3,4], 'b':['q','r','s','t','u']})
df1

    a   b
0   0   q
1   1   r
2   2   s
3   3   t
4   4   u

df2 = DataFrame({'a':[4,3,2,1,999], 'b':['u','r','s','t','u']})
df2

    a   b
0   4   u
1   3   r
2   2   s
3   1   t
4   999 u

I would like to get a new dataframe containing the rows that appear in both of these (ignoring the index). For the example above, that would be:

    a   b
0   4   u
1   2   s

How do I get this intersection?

theQman asked Mar 26 '15 21:03



People also ask

How do you find the intersection of two data frames?

The intersection of two data frames in pandas can be computed with merge(). With its default arguments, merge() performs an inner join on the columns the two data frames have in common, so it returns only the rows present in both.

How do you find the intersection of two DataFrames in PySpark?

In PySpark, the intersection of two DataFrames can be computed with intersect(), which returns the rows common to both DataFrames and removes duplicates. intersectAll() returns the common rows while keeping duplicates.

How do you find the intersection of two series in Pandas?

Import the pandas and NumPy modules and create two pandas Series. Then use NumPy's union1d() function to find the union of the Series and its intersect1d() function to find their intersection.
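
The steps above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the sample values and the variable names `s1`, `s2`, `union`, and `intersection` are assumptions chosen for the example. Note that both NumPy functions return sorted, de-duplicated arrays, so the original order and any repeated values are lost.

```python
import numpy as np
import pandas as pd

# Two example Series (illustrative values, not from the question).
s1 = pd.Series([0, 1, 2, 3, 4])
s2 = pd.Series([4, 3, 2, 1, 999])

# np.union1d / np.intersect1d are NumPy functions that return sorted,
# unique arrays; wrap the results back into Series for convenience.
union = pd.Series(np.union1d(s1.to_numpy(), s2.to_numpy()))
intersection = pd.Series(np.intersect1d(s1.to_numpy(), s2.to_numpy()))

print(intersection.tolist())
```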


1 Answer

You can just perform a merge. By default it joins on all columns the two dataframes have in common, and the default merge type is inner, so values must be present in both dfs:

In [71]:

df1.merge(df2)
Out[71]:
   a  b
0  2  s
1  4  u
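
As a self-contained sketch of the same idea, the merge defaults can be written out explicitly. The explicit `how` and `on` arguments, the `drop_duplicates()` step, and the variable names `common` and `common_unique` are additions for illustration and are not part of the answer above. The duplicate handling assumes you want set-like behaviour when either dataframe may contain repeated rows.

```python
import pandas as pd

df1 = pd.DataFrame({'a': [0, 1, 2, 3, 4], 'b': ['q', 'r', 's', 't', 'u']})
df2 = pd.DataFrame({'a': [4, 3, 2, 1, 999], 'b': ['u', 'r', 's', 't', 'u']})

# Same as df1.merge(df2), with the defaults spelled out:
# an inner join on every column the two frames share.
common = df1.merge(df2, how='inner', on=['a', 'b'])

# Assumption: if either frame can contain repeated rows, dropping
# duplicates keeps one copy of each common row (set-like intersection).
common_unique = common.drop_duplicates().reset_index(drop=True)

print(common_unique)
```

An inner merge keeps the key order of the left dataframe, which is why the rows come back in `df1`'s order rather than the order shown in the question.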
EdChum answered Oct 08 '22 12:10
