I have a large data frame that has three identifiers. For example:
df <- data.frame(year=c(1999,1999,2000,2000,2000), country=c('K','K','M','M','S'),
site=c('di','se','di','di','di'))
This produces a data frame like this:
year country site
1999 K di
1999 K se
2000 M di
2000 M di
2000 S di
I want to add an additional column to the data frame and have a 'unique id' assigned by using the entries for 'year', 'country', and 'site'. It would look something like this:
year country site unique_id
1999 K di 1
1999 K se 2
2000 M di 3
2000 M di 3
2000 S di 4
Any suggestions on how to do this would be greatly appreciated. I'm thinking it could somehow be done using the plyr package?
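This should work quite nicely. It pastes the three columns into a single key per row, converts the keys to a factor, and uses as.numeric() to extract the integer codes that a factor stores for its levels. Note that the codes follow the sorted order of the keys, which here happens to match their order of appearance.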
df$unique_id <-
as.numeric(as.factor(with(df, paste(year, country, site, sep="_"))))
df
# year country site unique_id
# 1 1999 K di 1
# 2 1999 K se 2
# 3 2000 M di 3
# 4 2000 M di 3
# 5 2000 S di 4
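If the ids should follow the order in which each combination first appears rather than the sorted order of the factor levels, one possible variation (a sketch in base R, not part of the answer above) is to use match() against the distinct keys. The variable name key is just an illustrative choice.

```r
# Same example data as in the question.
df <- data.frame(year = c(1999, 1999, 2000, 2000, 2000),
                 country = c('K', 'K', 'M', 'M', 'S'),
                 site = c('di', 'se', 'di', 'di', 'di'))

# Factor codes follow sorted levels, not order of appearance:
# the levels here are "a", "b", so "b" maps to 2 and "a" to 1.
codes <- as.integer(factor(c("b", "a", "b")))

# Build one key string per row from the three identifier columns.
key <- with(df, paste(year, country, site, sep = "_"))

# match() gives the position of each key among the distinct keys,
# so ids are numbered by first appearance in the data frame.
df$unique_id <- match(key, unique(key))
df
```

For this data both approaches give the same ids (1, 2, 3, 3, 4); they differ only when the rows are not already in sorted key order.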