Count the number of duplicates in a column

Question

My objective is to count how many duplicates there are in a column.
I have a column of 3516 obs. of 1 variable;
the values are all dates, with about 144 duplicates each, from 1/4/16 to 7/3/16.
Example (I show 1 duplicate each for example's sake):
1/4/16
1/4/16
31/3/16
31/3/16
30/3/16
30/3/16
29/3/16
29/3/16
28/3/16
28/3/16
So I used the function date = count(date),
where date is my df date.
But once I execute it, my date sequence is not in order anymore.
I hope someone can solve my problem.

Ronak Shah · Accepted Answer

If you want the number of duplicates in your column, you can use duplicated:

sum(duplicated(df$V1))
#[1] 5

This assumes V1 is your column name.
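To make the result above reproducible, here is a minimal sketch that rebuilds the ten-row example from the question. The names df and V1 are the assumed names used in this answer, not necessarily the asker's real ones.

```r
# Rebuild the example from the question (assumed names: df, V1)
df <- data.frame(
  V1 = c("1/4/16", "1/4/16", "31/3/16", "31/3/16", "30/3/16",
         "30/3/16", "29/3/16", "29/3/16", "28/3/16", "28/3/16"),
  stringsAsFactors = FALSE
)

# duplicated() flags every repeat of a value after its first occurrence
dup_flags <- duplicated(df$V1)

# Number of rows that repeat an earlier value
n_dup <- sum(dup_flags)

# Number of distinct dates in the column
n_unique <- length(unique(df$V1))
```

Note that this counts repeats only, so a date that appears 144 times contributes 143 to the total.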

EDIT

As per the update, if you want the count of each date, you can use the table function, which will give you exactly that:

table(df$V1)

# 1/4/16 28/3/16 29/3/16 30/3/16 31/3/16 
#      2       2       2       2       2 
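The question also mentions that the dates end up out of order. table sorts its labels as plain strings, which is why 1/4/16 comes first in the output. Two possible ways around that are sketched below, assuming the same df and V1 names and assuming the values are day/month/two-digit-year strings (which 31/3/16 suggests).

```r
# Same example data as in the question (assumed names: df, V1)
df <- data.frame(
  V1 = c("1/4/16", "1/4/16", "31/3/16", "31/3/16", "30/3/16",
         "30/3/16", "29/3/16", "29/3/16", "28/3/16", "28/3/16"),
  stringsAsFactors = FALSE
)

# Option 1: keep the order in which the dates first appear in the column
# by fixing the factor levels before tabulating
counts_in_order <- table(factor(df$V1, levels = unique(df$V1)))

# Option 2: parse the strings as real dates (format is an assumption),
# so the table is sorted chronologically instead of alphabetically
parsed <- as.Date(df$V1, format = "%d/%m/%y")
counts_by_date <- table(parsed)
```

Option 1 leaves the labels exactly as they were typed; Option 2 relabels them in ISO form (for example 2016-03-28) and orders them from earliest to latest.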

akrun · Answer

If we need to count the total number of duplicates:

sum(table(df1$date)-1)
#[1] 5

Suppose we need the count of each date; one option would be to group by 'date' and get the number of rows. This can be done with data.table.

library(data.table)
# Convert df1 to a data.table by reference, then count rows (.N) per date
setDT(df1)[, .N, by = date]
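If data.table is not installed, a similar two-column result can be sketched in base R. This is only an illustration on the example data, assuming the df1 and date names used above; the column name N is chosen here to mirror the data.table output.

```r
# Example data under the assumed names df1 and date
df1 <- data.frame(
  date = c("1/4/16", "1/4/16", "31/3/16", "31/3/16", "30/3/16",
           "30/3/16", "29/3/16", "29/3/16", "28/3/16", "28/3/16"),
  stringsAsFactors = FALSE
)

# Total duplicates: each date's count minus one for its first occurrence
total_dups <- sum(table(df1$date) - 1)

# Per-date counts as a data frame with columns date and N
counts <- as.data.frame(table(date = df1$date),
                        responseName = "N",
                        stringsAsFactors = FALSE)
```

One difference to be aware of: this result is sorted by the date strings, whereas the data.table call keeps groups in order of first appearance.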

Tags:

r

Amos Ong
