
Does unique() preserve order?

Tags: r

Imagine we're using the following code:

set.seed(42)
v <- sample(1:10, 100, replace = TRUE)
v <- sort(v)
unique.v <- unique(v)

Can I be sure that unique.v is already sorted?

More generally, is it true that unique() returns a vector ordered by the first occurrence of each value?

The documentation does not state this, and looking at the source with

?unique
getAnywhere('unique.default')

is not much help.

Related questions: one, two.

tonytonov asked Nov 28 '13 08:11




1 Answer

Here's what I found. This guide leads us to names.c, where we see

{"unique",  do_duplicated,  1,  11, 4,  {PP_FUNCALL, PREC_FN,   0}},

After that we move to unique.c and find an entry

SEXP attribute_hidden do_duplicated(SEXP call, SEXP op, SEXP args, SEXP env)

Browsing the code, we stumble upon

dup = duplicated3(x, incomp, fL, nmax);

which is a call to

static SEXP duplicated3(SEXP x, SEXP incomp, Rboolean from_last, int nmax)

Finally, the main loop here is

for (i = 0; i < n; i++) {
//      if ((i+1) % NINTERRUPT == 0) R_CheckUserInterrupt();
        v[i] = isDuplicated(x, i, &data);
}

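This loop visits the elements of x in index order and flags each one that repeats an earlier value; unique() then keeps the unflagged elements in their original positions. So the answer to my question is yes: with the default fromLast = FALSE, unique() returns values in order of first occurrence, and a sorted input gives a sorted result.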
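A quick empirical check is consistent with this reading of the source. The sketch below uses a small made-up vector x (not from the question) whose first-occurrence order differs from its sorted order, and compares unique() against subsetting with duplicated(), which flags every element already seen earlier in the vector. It is an illustration only, not a proof.

```r
# Hypothetical example vector: first-occurrence order (3, 1, 2, 5)
# differs from sorted order (1, 2, 3, 5).
x <- c(3, 1, 3, 2, 1, 5, 2)

u <- unique(x)
print(u)  # 3 1 2 5

# Dropping the elements flagged by duplicated() should give the same
# vector, in the same order.
same <- identical(u, x[!duplicated(x)])
print(same)

# For a sorted input, the unique values should come out sorted too.
sorted_ok <- !is.unsorted(unique(sort(x)))
print(sorted_ok)
```

If the order of last occurrence is wanted instead, unique() takes a fromLast argument; the check above only covers the default.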

tonytonov answered Oct 08 '22 21:10
