I have a data frame containing many columns, including Quarter and CustomerID. I want to identify the unique combinations of Quarter and CustomerID.
For example:
masterdf <- read.csv(text = "
Quarter, CustomerID, ProductID
2009 Q1, 1234, 1
2009 Q1, 1234, 2
2009 Q2, 1324, 3
2009 Q3, 1234, 4
2009 Q3, 1234, 5
2009 Q3, 8764, 6
2009 Q4, 5432, 7")
What I want is:
FilterQuarter UniqueCustomerID
2009 Q1 1234
2009 Q2 1324
2009 Q3 8764
2009 Q3 1234
2009 Q4 5432
How can I do this in R? I tried the unique function, but it does not work the way I want.
The long comments under the OP are getting hard to follow. You are looking for duplicated, as pointed out by @RomanLustrik. Use it to subset your original data.frame like this:
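# Keep the first row of each Quarter/CustomerID combination (all columns are retained)
masterdf[ ! duplicated( masterdf[ c("Quarter" , "CustomerID") ] ) , ]
#   Quarter CustomerID ProductID
#1 2009 Q1       1234         1
#3 2009 Q2       1324         3
#4 2009 Q3       1234         4
#6 2009 Q3       8764         6
#7 2009 Q4       5432         7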
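If only the two key columns are needed, unique also has a data frame method, so applying it to just those columns may be what the question was reaching for. The following is a minimal sketch on the sample data from the question; the name combos and the strip.white = TRUE argument (which trims the spaces after the commas) are choices made for this example, not part of the original post.

```r
# Sample data from the question; strip.white = TRUE trims padding around fields.
masterdf <- read.csv(text = "
Quarter, CustomerID, ProductID
2009 Q1, 1234, 1
2009 Q1, 1234, 2
2009 Q2, 1324, 3
2009 Q3, 1234, 4
2009 Q3, 1234, 5
2009 Q3, 8764, 6
2009 Q4, 5432, 7", strip.white = TRUE)

# Select only the key columns, then drop repeated rows.
# 'combos' is a hypothetical name used for illustration.
combos <- unique(masterdf[c("Quarter", "CustomerID")])
print(combos)
```

Unlike the duplicated subset, this drops ProductID and any other columns, so it suits cases where only the distinct pairs matter.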
Another simple way is to use SQL queries from R; see the code below.
This assumes masterdf is the name of the original data frame.
library(sqldf)
sqldf("select Quarter, CustomerID from masterdf group by 1,2")