
How to find unique field values from two columns in data frame

Tags: r, unique

I have a data frame with many columns, including Quarter and CustomerID. I want to identify the unique combinations of Quarter and CustomerID.

For example:

masterdf <- read.csv(text = "
    Quarter,  CustomerID, ProductID
    2009 Q1,    1234,     1
    2009 Q1,    1234,     2
    2009 Q2,    1324,     3
    2009 Q3,    1234,     4
    2009 Q3,    1234,     5
    2009 Q3,    8764,     6
    2009 Q4,    5432,     7")

What I want is:

FilterQuarter     UniqueCustomerID
2009 Q1           1234
2009 Q2           1324
2009 Q3           8764
2009 Q3           1234
2009 Q4           5432

How can I do this in R? I tried the unique function, but it does not give the result I want.

snehal asked Aug 22 '13 05:08


2 Answers

The long comments under the question are getting hard to follow. You are looking for duplicated(), as pointed out by @RomanLustrik. Use it to subset your original data.frame like this:

# Keep the first row of each Quarter/CustomerID pair and return only those two columns
keys <- c("Quarter", "CustomerID")
masterdf[!duplicated(masterdf[keys]), keys]
#  Quarter CustomerID
#1 2009 Q1       1234
#3 2009 Q2       1324
#4 2009 Q3       1234
#6 2009 Q3       8764
#7 2009 Q4       5432
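
The question mentions that unique did not behave as expected. One possible reason, and this is only a guess since the call is not shown, is that it was applied to the whole data frame, where ProductID makes every row distinct. A minimal sketch of applying unique() to just the two key columns follows; the data.frame() call is an assumed stand-in that mirrors the sample data in the question, and keys and pairs are illustrative names.

```r
# Sample data mirroring the question's masterdf (rebuilt here so the sketch is self-contained)
masterdf <- data.frame(
  Quarter    = c("2009 Q1", "2009 Q1", "2009 Q2", "2009 Q3",
                 "2009 Q3", "2009 Q3", "2009 Q4"),
  CustomerID = c(1234, 1234, 1324, 1234, 1234, 8764, 5432),
  ProductID  = 1:7
)

# Columns that define a "combination"
keys <- c("Quarter", "CustomerID")

# unique() on the two-column subset keeps the first occurrence of each pair;
# ProductID is left out so it cannot make otherwise equal rows look different
pairs <- unique(masterdf[keys])
print(pairs)
```

For a data frame, unique() keeps the rows that duplicated() does not flag, so this should return the same rows as the duplicated() subset.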

Simon O'Hanlon answered Oct 19 '22 22:10


Another simple way is to run an SQL query from R with the sqldf package; see the code below. This assumes masterdf is the name of the original data frame.

library(sqldf)
sqldf("select Quarter, CustomerID from masterdf group by 1,2")

Ankur Raj answered Oct 19 '22 23:10