
Count the number of duplicates in a column

Tags:

r

My objective is to count how many duplicates there are in a column.
I have a column of 3516 obs. of 1 variable,
all of them dates, with about 144 duplicates each, ranging from 1/4/16 to 7/3/16.
Example (showing only one duplicate per date for the sake of illustration):
1/4/16
1/4/16
31/3/16
31/3/16
30/3/16
30/3/16
29/3/16
29/3/16
28/3/16
28/3/16
So I used date = count(date),
where date is my data frame of dates.
But once I execute it, my date sequence is no longer in order.
I hope someone can help me solve this.

Amos Ong asked Apr 21 '16 06:04


2 Answers

If you want the total number of duplicates in your column, you can use duplicated, which flags every repeat of a value after its first occurrence:

sum(duplicated(df$V1))
#[1] 5

This assumes V1 is your column name.
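
For readers who want to reproduce that result, here is a minimal sketch. The data frame below is a hypothetical reconstruction of the ten-row example from the question (five dates, each appearing twice), and the names dup_flags, n_dups and n_unique are introduced only for illustration.

```r
# Hypothetical sample data mirroring the example in the question:
# five dates, each appearing twice
df <- data.frame(
  V1 = c("1/4/16", "1/4/16", "31/3/16", "31/3/16", "30/3/16",
         "30/3/16", "29/3/16", "29/3/16", "28/3/16", "28/3/16"),
  stringsAsFactors = FALSE
)

# duplicated() returns TRUE for every repeat after the first occurrence
dup_flags <- duplicated(df$V1)

# Number of repeated rows (first occurrences are not counted)
n_dups <- sum(dup_flags)

# Number of distinct dates, for comparison
n_unique <- length(unique(df$V1))
```

Note that duplicated does not count the first occurrence of each value, so n_dups plus n_unique equals the number of rows.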

EDIT

As per the update, if you want the count of each date, you can use the table function, which gives you exactly that:

table(df$V1)

#1/4/16 28/3/16 29/3/16 30/3/16 31/3/16 
#     2       2       2       2       2 
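
The output above is sorted alphabetically because the dates are stored as text, which may be related to the ordering problem raised in the question. The following is a sketch of two possible ways to control the order, not necessarily what the asker needs: it assumes the hypothetical ten-row sample data and assumes the dates are in day/month/two-digit-year format.

```r
# Hypothetical sample data mirroring the example in the question
df <- data.frame(
  V1 = c("1/4/16", "1/4/16", "31/3/16", "31/3/16", "30/3/16",
         "30/3/16", "29/3/16", "29/3/16", "28/3/16", "28/3/16"),
  stringsAsFactors = FALSE
)

# Option 1: keep the order in which each date first appears in the column
# by fixing the factor levels before tabulating
counts_as_seen <- table(factor(df$V1, levels = unique(df$V1)))

# Option 2: parse the strings as real dates so the counts sort chronologically
# (assumes day/month/two-digit-year, e.g. "31/3/16" is 31 March 2016)
parsed <- as.Date(df$V1, format = "%d/%m/%y")
counts_chrono <- table(parsed)
```

Option 1 leaves the labels as they were written; Option 2 relabels them in ISO format (for example 2016-03-28) and orders them from earliest to latest.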
Ronak Shah answered Nov 17 '22 17:11


If we need to count the total number of duplicates, we can subtract one (the first occurrence) from each date's count and sum the result:

sum(table(df1$date)-1)
#[1] 5

If instead we need the count of each date, one option is to group by 'date' and get the number of rows per group. This can be done with data.table:

library(data.table)
# Convert df1 to a data.table by reference, then count rows (.N) per date
setDT(df1)[, .N, by = date]
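
If installing data.table is not an option, a similar per-date count can be sketched in base R. This is an alternative to the data.table call, not the same code: df1 here is a hypothetical reconstruction of the ten-row example, and counts is a name introduced for illustration.

```r
# Hypothetical sample data mirroring the example in the question
df1 <- data.frame(
  date = c("1/4/16", "1/4/16", "31/3/16", "31/3/16", "30/3/16",
           "30/3/16", "29/3/16", "29/3/16", "28/3/16", "28/3/16"),
  stringsAsFactors = FALSE
)

# Tabulate the dates, then turn the table into a data frame with
# one row per date and the count in a column named N
counts <- as.data.frame(
  table(date = df1$date),
  responseName = "N",
  stringsAsFactors = FALSE
)
```

Unlike the grouped data.table result, which lists groups in order of first appearance, this table-based version sorts the dates alphabetically.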
akrun answered Nov 17 '22 17:11