
What is an efficient way to map unique values of a vector to sequential integers?

Tags:

r

I have a data frame in R with a vector of non-sequential numbers (data$SiteID) that I would like to map to a vector of sequential integers (data$site), one for each unique value of data$SiteID. Within each site, I would like to map data$TrtID to 0 where data$TrtID == 'control', or to the next sequential integer for each of the other unique values of data$TrtID:

data <- data.frame(SiteID = c(1,1,1,9,'108','108','15', '15'), 
                   TrtID = c('N', 'control', 'N', 'control', 'P', 'control', 'N', 'P'))
  1. data$site should be c(1,1,1,2,3,3,4,4).
  2. data$trt should be c(1,0,1,0,1,0,0,1).
David LeBauer asked Jan 21 '23 10:01



1 Answer

Just treat them as factors:

as.numeric(factor(data$SiteID, levels = unique(data$SiteID)))
[1] 1 1 1 2 3 3 4 4

And for TrtID, since you want a 0-based value, subtract one:

as.numeric(factor(data$TrtID, levels = sort(unique(data$TrtID))))-1
[1] 1 0 1 0 2 0 1 2

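Notice that the levels arguments are different: the SiteID call keeps the values in order of first appearance, while the TrtID call sorts them first. Sorting is convenient here because 'control' comes alphabetically before 'N' and 'P' (in most locales; in the C locale, uppercase letters sort before lowercase ones, so 'control' would come last). If you want a non-standard ordering, you can explicitly specify the levels in the order you want them.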
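
As a rough sketch of the explicit-levels option, something like the following should work on the data from the question. The names trt_levels and trt_in_site are made up for illustration. The second part is only one possible reading of the "within each site" requirement: it uses ave() to restart the numbering in every site, so a site without a control row gets 1, 2, ... for its treatments, which is not the same as the 0, 1 listed for the last site in the question.

```r
# Same data as in the question; c() coerces SiteID to character.
data <- data.frame(SiteID = c(1, 1, 1, 9, '108', '108', '15', '15'),
                   TrtID  = c('N', 'control', 'N', 'control', 'P', 'control', 'N', 'P'))

# Explicit level order (hypothetical name trt_levels): 'control' is listed
# first so that it maps to 0 after subtracting one, regardless of locale.
trt_levels <- c('control', 'N', 'P')
data$trt <- as.integer(factor(data$TrtID, levels = trt_levels)) - 1L
data$trt
# [1] 1 0 1 0 2 0 1 2

# Assumed interpretation of "within each site": number the non-control
# treatments 1, 2, ... in order of first appearance inside each site.
# ave() is applied to row indices so the result stays an integer vector.
data$trt_in_site <- ave(seq_along(data$TrtID), data$SiteID, FUN = function(i) {
  x <- data$TrtID[i]
  others <- setdiff(unique(x), 'control')
  match(x, c('control', others)) - 1L
})
data$trt_in_site
# [1] 1 0 1 0 1 0 1 2
```

The first result matches the sorted-levels version above; the per-site version differs only where a treatment's global code is not its rank inside its own site.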

Greg answered Feb 08 '23 10:02
