
Extract duplicate records in dataframe [duplicate]

Tags: dataframe, r

I have a data frame from which I want to select every row whose value in column a is duplicated, not just the repeated occurrences. In the example below, I want either one new data frame or two separate data frames containing the two records for 19 and the two records for 32, respectively.

        a <- c(8, 18, 19, 19, 20, 30, 32, 32, 58)
        b <- c(1950, 1965, 1971, 1981, 1999, 1969, 1994, 1999, 1985)
        df <- data.frame(a, b)
        df
        #    a    b
        # 1  8 1950
        # 2 18 1965
        # 3 19 1971
        # 4 19 1981
        # 5 20 1999
        # 6 30 1969
        # 7 32 1994
        # 8 32 1999
        # 9 58 1985

I have tried df[duplicated(df$a), ], but this extracts only the second record of each duplicated pair, whereas I want both. The end goal is to subtract the years in the second column between the two records for 19 and between the two records for 32.

Harmzy15 asked May 22 '26 19:05


1 Answer

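We can combine duplicated() scanned from the front with duplicated() scanned from the end (fromLast=TRUE), so that both the first and the later occurrence of each value are flagged: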

df[duplicated(df$a)|duplicated(df$a, fromLast=TRUE),]
#  a    b
#3 19 1971
#4 19 1981
#7 32 1994
#8 32 1999
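
For the stated end goal of subtracting the years, a minimal sketch along the following lines may help. It assumes each duplicated value of a occurs exactly twice, as in the example, and it takes b with the 1999 value shown in the printed data frame. The names dup_mask, dups, dup_list and year_diff are illustrative only and do not come from the original post.

```r
# Data from the question (b written out with all nine values)
a <- c(8, 18, 19, 19, 20, 30, 32, 32, 58)
b <- c(1950, 1965, 1971, 1981, 1999, 1969, 1994, 1999, 1985)
df <- data.frame(a, b)

# duplicated() flags only the later occurrences; scanning from the end
# flags the earlier ones, so OR-ing the two masks keeps every group member
dup_mask <- duplicated(df$a) | duplicated(df$a, fromLast = TRUE)
dups <- df[dup_mask, ]

# Option 1: one data frame per duplicated value of a
dup_list <- split(dups, dups$a)

# Option 2: year difference (later row minus earlier row) per value of a
year_diff <- tapply(dups$b, dups$a, diff)
year_diff
# 19 32
# 10  5
```

If some value of a appeared three or more times, diff() would return several differences for that group and tapply() would give back a list rather than a simple named array, so the result would need to be handled accordingly.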
akrun answered May 25 '26 08:05