Pandas - intersection of two data frames based on column entries

Question

Suppose I have two DataFrames like so:

>>dfA              S                      T            prob 0        ! ! !                ! ! ! !   8.1623999e-05 1        ! ! !                ! ! ! "   0.00354090007 2        ! ! !                ! ! ! .   0.00210241997 3        ! ! !                ! ! ! ?  6.55684998e-05 4        ! ! !                  ! ! !     0.203119993 5        ! ! !                ! ! ! ”  6.62070015e-05 6        ! ! !                    ! !   0.00481862016 7        ! ! !                      !    0.0274260994 8        ! ! !                " ! ! !  7.99940026e-05 9        ! ! !                    " !  1.51188997e-05 10       ! ! !                      "  8.50678989e-05  >>dfB              S                      T                                 knstats 0        ! ! !                ! ! ! !                 knstats=2,391,104,64,25 1        ! ! !                ! ! ! "                    knstats=4,391,6,64,2 2        ! ! !                ! ! ! .                    knstats=4,391,5,64,2 3        ! ! !                ! ! ! ?                    knstats=1,391,4,64,4 4        ! ! !                  ! ! !               knstats=220,391,303,64,55 5        ! ! !                    ! !               knstats=16,391,957,64,115 6        ! ! !                      !              knstats=28,391,5659,64,932 7        ! ! !                " ! ! !                    knstats=2,391,2,64,1 8        ! ! !                    " !                  knstats=1,391,37,64,13 9        ! ! !                      "     knstats=2,391,1.11721e+06,64,180642 10       ! ! !                    . "           knstats=2,391,120527,64,20368

I want to create a new DataFrame which is composed of the rows which have matching "S" and "T" entries in both matrices, along with the prob column from dfA and the knstats column from dfB. The result should look something like the following, and it is important that the order is the same:

             S                      T            prob                             knstats 0        ! ! !                ! ! ! !   8.1623999e-05             knstats=2,391,104,64,25 1        ! ! !                ! ! ! "   0.00354090007                knstats=4,391,6,64,2 2        ! ! !                ! ! ! .   0.00210241997                knstats=4,391,5,64,2 3        ! ! !                ! ! ! ?  6.55684998e-05                knstats=1,391,4,64,4 4        ! ! !                  ! ! !     0.203119993           knstats=220,391,303,64,55 5        ! ! !                    ! !   0.00481862016           knstats=16,391,957,64,115 6        ! ! !                      !    0.0274260994          knstats=28,391,5659,64,932 7        ! ! !                " ! ! !  7.99940026e-05                knstats=2,391,2,64,1 8        ! ! !                    " !  1.51188997e-05              knstats=1,391,37,64,13 9        ! ! !                      "  8.50678989e-05 knstats=2,391,1.11721e+06,64,180642

user308827 · Accepted Answer

You can merge them so:

s1 = pd.merge(dfA, dfB, how='inner', on=['S', 'T'])

To drop NA rows:

s1.dropna(inplace=True)

Pandas - intersection of two data frames based on column entries

Tags:

Bib

1 Answers

user308827

Recent Activity

Donate For Us

Pandas - intersection of two data frames based on column entries

Tags:

Bib

1 Answers

user308827

Related questions

Recent Activity

Donate For Us