Does R `unique` always return values in same order?

Question

Stupid example:

df <- data.frame(group=rep(LETTERS, each=2), value=1:52)
res <- unlist(lapply(unique(df$group), function(x) mean(subset(df, group==x)$value)))
names(res) <- unique(df$group)

Will res always be the following?

   A    B    C    D    E    F    G    H    I    J    K    L    M    N    O    P 
 1.5  3.5  5.5  7.5  9.5 11.5 13.5 15.5 17.5 19.5 21.5 23.5 25.5 27.5 29.5 31.5 
   Q    R    S    T    U    V    W    X    Y    Z 
33.5 35.5 37.5 39.5 41.5 43.5 45.5 47.5 49.5 51.5

Or could it ever happen that the means calculated on line 2 don't match up with the names on line 3? I guess it depends on the underlying implementation of unique in base R, but I'm not sure where to find that.

Ben Bolker · Accepted Answer

According to ?unique:

‘unique’ returns a vector, data frame or array like ‘x’ but with duplicate elements/rows removed.

This fully determines the ordering: the result keeps the elements in the order in which each one first occurs. (I guess I don't see the wiggle room that @joran sees for a different ordering.) For example,

unique(c("B","B","A","C","C","C","B","A"))

will result in

[1] "B" "A" "C"

I believe unique(x) will in general be identical to (but more efficient than)

x[!duplicated(x)]
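A quick way to check this equivalence yourself, as a minimal sketch on a few ordinary atomic vectors (the variable names are just for illustration, and this says nothing about classes with their own unique methods):

```r
# Compare unique() with the !duplicated() idiom on a few ordinary vectors.
x_chr <- c("B", "B", "A", "C", "C", "C", "B", "A")
x_int <- c(3L, 1L, 3L, 2L, 1L)
x_dbl <- c(2.5, NA, 2.5, 1, NA)

# Each comparison checks both the values and their order.
same_chr <- identical(unique(x_chr), x_chr[!duplicated(x_chr)])
same_int <- identical(unique(x_int), x_int[!duplicated(x_int)])
same_dbl <- identical(unique(x_dbl), x_dbl[!duplicated(x_dbl)])

print(c(chr = same_chr, int = same_int, dbl = same_dbl))
```
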

If you want to look at the internal code, see here: the moving parts are something like

k = 0;
switch (TYPEOF(x)) {
case LGLSXP:
case INTSXP:
    for (i = 0; i < n; i++)
        if (LOGICAL(dup)[i] == 0)
            INTEGER(ans)[k++] = INTEGER(x)[i];
    break;

i.e., the implementation does exactly what I described: it goes through the vector sequentially and copies over the non-duplicated elements. Since the ordering isn't explicitly guaranteed in the documentation, it is theoretically possible that this implementation could change in the future, but that is vanishingly unlikely.

For what you're trying to do there are simpler R idioms:
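To make the loop concrete, here is a rough R-level transliteration of the same logic. unique_sketch is a hypothetical helper written only for illustration, and it assumes that dup in the C excerpt marks the repeated elements in the way duplicated() does; it is not how you would want to compute unique values in practice.

```r
# Hypothetical helper: an R-level sketch of the sequential copy loop.
# Assumes `dup` in the C excerpt flags repeats the way duplicated() does.
unique_sketch <- function(x) {
  dup <- duplicated(x)                  # TRUE for every repeat of an earlier element
  ans <- vector(typeof(x), sum(!dup))   # room for the non-duplicated elements
  k <- 0L
  for (i in seq_along(x)) {
    if (!dup[i]) {                      # first occurrence: copy it over, in input order
      k <- k + 1L
      ans[k] <- x[i]
    }
  }
  ans
}

x <- c(3L, 1L, 3L, 2L, 1L)
res_sketch <- unique_sketch(x)
print(res_sketch)
```

Because elements are copied in a single forward pass, the result can only come out in order of first occurrence.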

df <- data.frame(group=rep(LETTERS, each=2), value=1:52)
a1 <- aggregate(df$value,list(df$group),mean)

This returns a two-column data frame, so you can use

setNames(a1[,2],a1[,1])

to convert it to your format. Or

library(plyr)
unlist(daply(df,"group",summarise,val=mean(value)))
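If you would rather stay in base R and get a named vector directly, tapply or split plus vapply should also work. This is a sketch under the assumption that the result being named by group is what you are after; the names res_tapply and res_split are made up for the example.

```r
df <- data.frame(group = rep(LETTERS, each = 2), value = 1:52)

# tapply() computes the mean per group and names each result by its group.
res_tapply <- tapply(df$value, df$group, mean)

# Equivalent: split the values by group, then take the mean of each piece.
res_split <- vapply(split(df$value, df$group), mean, numeric(1))

print(head(res_split))
```

Note that both approaches order the groups by factor level (sorted), not by first appearance, which happens to be the same thing for this data. Either way the names are attached by the same call that computes the means, so they cannot get out of step.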

Andrew · Answer

unique returns a sorted vector when it is called on a RasterLayer object (from the raster package).

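library(raster)

# Build a 100 x 100 raster and fill it with random draws from 1:100
x <- 1:100
example <- raster(xmn = 0, xmx = 100, ymn = 0, ymx = 100, nrow = 100, ncol = 100)
example[] <- sample(x, 10000, replace = TRUE)

plot(example)

# First 100 cell values, in storage order
vals <- values(example)[x]
identical(vals, x)

# Unique values as returned by the RasterLayer method
uniques <- unique(example)
identical(uniques, x)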

The first 100 cell values will (very likely) not be identical to the ordered vector x, because they are stored in random order, but the unique values come back sorted and so match the ordered vector.

Otherwise, the previous answers are correct: for ordinary vectors, R returns the non-duplicated values in the order in which they first appeared.
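For an ordinary vector, if you want the sorted behaviour you have to ask for it explicitly. A minimal sketch with made-up values:

```r
x <- c(42, 7, 42, 19, 7)

# Order of first appearance
first_seen <- unique(x)

# Sorted order has to be requested explicitly for a plain vector
sorted_vals <- sort(unique(x))

print(first_seen)
print(sorted_vals)
```
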

Tags: r, unique

fanli
