unique() for more than one variable

Tags:

unique

I have the following data frame in R:

> str(df)
'data.frame':   545227 obs. of  15 variables:
 $ ykod : int  93 93 93 93 93 93 93 93 93 93 ...
 $ yad  : Factor w/ 42 levels "BAKUGAN","BARBIE",..: 30 30 30 30 30 30 30 30 30 30 ...
 $ per  : Factor w/ 3 levels "2 AYLIK","3 AYLIK",..: 3 3 3 3 3 3 3 3 3 3 ...
 $ donem: int  201101 201101 201101 201101 201101 201101 201101 201101 201101 201101 ...
 $ sayi : int  201101 201101 201101 201101 201101 201101 201101 201101 201101 201101 ...
 $ mkod : int  4 5 9 11 12 18 20 22 25 26 ...
 $ mad  : Factor w/ 10464 levels "   Defne Market          ",..: 405 8075 9710 10145 9297 7973 2542 3892 2759 5769 ...
 $ mtip : Factor w/ 29 levels "Abone Bürosu                                      ",..: 2 20 20 2 2 2 2 2 2 2 ...
 $ kanal: Factor w/ 2 levels "OB","SS": 2 2 2 2 2 2 2 2 2 2 ...
 $ bkod : int  110565 110565 110565 110565 110565 110565 110565 110565 110565 110565 ...
 $ bad  : Factor w/ 212 levels "4. Levent","500 Evler",..: 167 167 167 167 167 167 167 167 167 167 ...
 $ bolge: Factor w/ 12 levels "Adana Şehiriçi",..: 7 7 7 7 7 7 7 7 7 7 ...
 $ sevk : int  2 3 3 3 2 2 2 6 2 2 ...
 $ iade : int  2 1 0 2 0 2 1 0 0 2 ...
 $ satis: int  0 2 3 1 2 0 1 6 2 0 ...

I want to list the unique values (like SQL's DISTINCT) across several selected variables. For example, unique(yad) gives me the 42 distinct names in that column, but I need the unique combinations of two columns, yad and per, taken together:

yad           per
---           ---
BARBIE        AYLIK
BAKUGAN       2 AYLIK
MICKEY MOUSE  2 AYLIK
TINKERBELL    3 AYLIK
...           ...

How can I achieve this?


asked Oct 17 '11 07:10

Mehper C. Palavuzlar

1 Answer

How about using unique() itself?

df <- data.frame(yad = c("BARBIE", "BARBIE", "BAKUGAN", "BAKUGAN"),
                 per = c("AYLIK",  "AYLIK",  "2 AYLIK", "2 AYLIK"),
                 hmm = 1:4)

df
#       yad     per hmm
# 1  BARBIE   AYLIK   1
# 2  BARBIE   AYLIK   2
# 3 BAKUGAN 2 AYLIK   3
# 4 BAKUGAN 2 AYLIK   4

unique(df[c("yad", "per")])
#       yad     per
# 1  BARBIE   AYLIK
# 3 BAKUGAN 2 AYLIK
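This works because unique() has a method for data frames that drops repeated rows, so subsetting to the columns of interest first gives the distinct combinations. As a rough sketch of two possible follow-ups (not part of the original answer), the same result can be obtained with duplicated(), and the result can be sorted and renumbered if the leftover row names (1, 3) are unwanted. The toy data is the same as above; the names combos and combos2 are only illustrative.

```r
# Toy data mirroring the example above (not the asker's real data frame)
df <- data.frame(yad = c("BARBIE", "BARBIE", "BAKUGAN", "BAKUGAN"),
                 per = c("AYLIK",  "AYLIK",  "2 AYLIK", "2 AYLIK"),
                 hmm = 1:4)

# Distinct (yad, per) pairs; row names are those of the first occurrences
combos <- unique(df[c("yad", "per")])

# Same rows selected explicitly: duplicated() flags rows already seen
combos2 <- df[!duplicated(df[c("yad", "per")]), c("yad", "per")]

# Optional tidy-up: sort by both columns and renumber the rows
combos <- combos[order(combos$yad, combos$per), ]
rownames(combos) <- NULL
combos
```

The duplicated() form can be handy when the remaining columns of the first matching row should be kept as well: drop the column selection after the comma to get them.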


answered Sep 19 '22 15:09

Josh O'Brien
