The following is my data,
data
date number value
2016-05-05 1 5
2016-05-05 1 6
2016-05-06 2 7
2016-05-06 2 8
2016-05-07 3 9
2016-05-08 4 10
2016-05-09 5 11
When I use the following command,
data %>% group_by(date, number) %>% summarize(count = n())
I get the following,
date number count
2016-05-05 1 2
2016-05-06 2 2
2016-05-07 3 1
2016-05-08 4 1
2016-05-09 5 1
Now I want to remove the rows belonging to any (date, number) combination whose count is greater than 1. My output should look like the following,
data
date number value
2016-05-07 3 9
2016-05-08 4 10
2016-05-09 5 11
where the first four entries have been filtered out, since their combinations have a count greater than 1. Can anybody help me do this, or give me some ideas on how to approach it?
We can use filter after grouping by 'date' and 'number': check whether the number of rows in each group (n()) is equal to 1, and keep only those rows.
library(dplyr)
data %>%
  group_by(date, number) %>%
  filter(n() == 1)
# date number value
# <chr> <int> <int>
#1 2016-05-07 3 9
#2 2016-05-08 4 10
#3 2016-05-09 5 11
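For anyone who wants to run this end to end, here is a minimal sketch. The question does not show how data was created, so the data.frame() call below is my assumption (date as character, number and value as integers, matching the column types printed above). The trailing ungroup() is optional; I added it because filter() keeps the grouping, which can surprise later steps.

```r
library(dplyr)

# Assumed reconstruction of the question's data (not shown in the original post)
data <- data.frame(
  date   = c("2016-05-05", "2016-05-05", "2016-05-06", "2016-05-06",
             "2016-05-07", "2016-05-08", "2016-05-09"),
  number = c(1L, 1L, 2L, 2L, 3L, 4L, 5L),
  value  = 5:11,
  stringsAsFactors = FALSE
)

# Keep only rows whose (date, number) combination occurs exactly once
result <- data %>%
  group_by(date, number) %>%
  filter(n() == 1) %>%
  ungroup()  # drop the grouping so later verbs act on the whole table

print(result)
```

Changing the condition to n() > 1 would keep the duplicated combinations instead.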
Just to provide some alternatives using data.table
library(data.table)
setDT(data)[, if(.N == 1) .SD , .(date, number)]
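In case the data.table one-liner looks cryptic: .N is the number of rows in the current group and .SD is the group's subset of the data, so the if() returns the group's rows only when the group has a single row. Below is a small sketch of the same idea. The data construction is assumed (it is not in the question), and I use as.data.table() here instead of setDT() only so the original data frame is left unmodified.

```r
library(data.table)

# Assumed reconstruction of the question's data (not shown in the original post)
data <- data.frame(
  date   = c("2016-05-05", "2016-05-05", "2016-05-06", "2016-05-06",
             "2016-05-07", "2016-05-08", "2016-05-09"),
  number = c(1L, 1L, 2L, 2L, 3L, 4L, 5L),
  value  = 5:11,
  stringsAsFactors = FALSE
)

# Copy into a data.table; setDT(data) would convert in place instead
dt <- as.data.table(data)

# For each (date, number) group, return its rows only if the group has one row
result_dt <- dt[, if (.N == 1) .SD, by = .(date, number)]

print(result_dt)
```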
Or with base R
data[with(data, ave(number, number, date, FUN = length) ==1),]
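The base R version relies on ave() returning one value per row: with FUN = length, every row gets the size of its (number, date) group, and comparing that to 1 gives the logical index. A sketch that splits the step out so the intermediate vector is visible; the data construction and the name group_size are mine, not from the question.

```r
# Assumed reconstruction of the question's data (not shown in the original post)
data <- data.frame(
  date   = c("2016-05-05", "2016-05-05", "2016-05-06", "2016-05-06",
             "2016-05-07", "2016-05-08", "2016-05-09"),
  number = c(1L, 1L, 2L, 2L, 3L, 4L, 5L),
  value  = 5:11,
  stringsAsFactors = FALSE
)

# Size of each row's (number, date) group, repeated for every row in the group
group_size <- ave(data$number, data$number, data$date, FUN = length)
print(group_size)

# Keep the rows whose group contains exactly one row
result_base <- data[group_size == 1, ]
print(result_base)
```

Note that the first argument to ave() only supplies the vector that gets split; with FUN = length its values do not matter, only the grouping does.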