
How to make a unique set of names from a vector of strings?

Tags: r

I have a vector of strings. Check out my vector, it's awesome:

> awesome
[1] "a" "b" "c" "d" "d" "e" "f" "f"

I'd like to make a new vector that is the same length as awesome but where, if necessary, the strings have been uniqueified. For example, a valid output of my desired function would be

> awesome.uniqueified
[1] "a" "b" "c" "d.1" "d.2" "e" "f.1" "f.2"

Is there an easy, R-thonic and beautiful way to do this? I should say my vector in real life (it's not called awesome) contains 25000ish microarray probeset identifiers.

I'm always nervous when I embark on writing little generic functions (which I'm sure I could do), as I'm sure some R guru has come across this problem in the past and nailed it with some incredible algorithm that doesn't even have to store more than half an element of the vector. I'm just not sure what they might have called it. Probably not uniqueify.

Mike Dewar asked Jun 01 '10 20:06




1 Answer

Try make.unique(); the very first example on its help page is already spot-on:

> make.unique(c("a", "a", "a"))
[1] "a"   "a.1" "a.2"

The help page lists Thomas Minka as author. Buy him a beer one day :)
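Applied to the vector from the question, make.unique() leaves the first occurrence of each string untouched and only suffixes the later ones, so the result is not quite the "d.1" "d.2" form asked for. If every member of a duplicated group should get a number, something along these lines might do. Note that uniqueify is a hypothetical helper written for illustration, not part of base R, and it assumes the input does not already contain strings such as "d.1" that could collide with the generated names:

```r
awesome <- c("a", "b", "c", "d", "d", "e", "f", "f")

# Base R: the first occurrence is kept as-is, later duplicates get .1, .2, ...
made_unique <- make.unique(awesome)
print(made_unique)
# [1] "a"   "b"   "c"   "d"   "d.1" "e"   "f"   "f.1"

# Hypothetical helper (not in base R): number every member of a duplicated
# group starting from 1, and leave strings that occur only once alone.
uniqueify <- function(x) {
  n   <- ave(seq_along(x), x, FUN = length)     # size of each string's group
  idx <- ave(seq_along(x), x, FUN = seq_along)  # position within the group
  ifelse(n > 1, paste(x, idx, sep = "."), x)
}

awesome.uniqueified <- uniqueify(awesome)
print(awesome.uniqueified)
# [1] "a"   "b"   "c"   "d.1" "d.2" "e"   "f.1" "f.2"
```

Both versions are vectorised, so a vector of 25000 or so identifiers should pose no problem.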

Dirk Eddelbuettel answered Oct 26 '22 04:10
