Create unique random group id in R [duplicate]

Tags: dataframe, r, dplyr

I am trying to create a unique, randomly assigned (without replacement) group id without using a for loop. This is as far as I got:

library(datasets)
library(dplyr)

data(iris)

iris <- iris %>%
  group_by(Species) %>%
  mutate(id = cur_group_id())

This gives me a group id for each level of iris$Species. However, I would like the group id to be randomly assigned from c(1, 2, 3), as opposed to assigned based on the order of the dataset.

Any help would be much appreciated! I am sure there is a way to do this with dplyr, but I am stumped...

ManyGiraffes asked Jul 29 '20 22:07


2 Answers

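Maybe you can play a trick inside group_by: rebuild Species as a factor whose levels are in a random order (via sample), so that cur_group_id() numbers the groups in that shuffled order, e.g.,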

iris <- iris %>%
  # shuffle the level order so the group ids follow a random permutation
  group_by(factor(Species, levels = sample(levels(Species)))) %>%
  mutate(id = cur_group_id())
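
The same shuffling idea can also be sketched without grouping on a computed expression, which otherwise leaves an extra column named after that expression in the result. This is only a variant under stated assumptions, not part of the original answer: the names shuffled and iris_ids are made up for illustration, and set.seed() is there only so the sketch is reproducible.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed, only to make the sketch reproducible

# Shuffle the species levels once; a level's position in the shuffle is its id.
shuffled <- sample(levels(iris$Species))

iris_ids <- iris %>%
  mutate(id = match(Species, shuffled))  # look up each row's species in the shuffle

# One row per species, showing which id it received
iris_ids %>% distinct(Species, id)
```

Because sample() without replacement returns a permutation of the levels, each species gets a different id from 1 to 3.
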
ThomasIsCoding answered Sep 21 '22 10:09


Here's a sample answer that draws one random number per group and then ranks those numbers, so the ranks become the random ids.

library(datasets)
library(dplyr)

data(iris)

df <- iris %>% 
  group_by(Species) %>%
  mutate(id = runif(1,0,1)) %>% 
  ungroup() %>% 
  mutate(id = dense_rank(id))

df %>% sample_n(10)
#> # A tibble: 10 x 6
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width Species       id
#>           <dbl>       <dbl>        <dbl>       <dbl> <fct>      <int>
#>  1          4.4         3            1.3         0.2 setosa         3
#>  2          6.5         3            5.5         1.8 virginica      2
#>  3          6.3         2.7          4.9         1.8 virginica      2
#>  4          5           3.6          1.4         0.2 setosa         3
#>  5          6.3         2.3          4.4         1.3 versicolor     1
#>  6          7.9         3.8          6.4         2   virginica      2
#>  7          5.4         3.9          1.7         0.4 setosa         3
#>  8          5.7         4.4          1.5         0.4 setosa         3
#>  9          6.4         2.8          5.6         2.2 virginica      2
#> 10          5.2         3.4          1.4         0.2 setosa         3

Created on 2020-07-29 by the reprex package (v0.3.0)
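
Since a 10-row sample may not show every species, a quick check along these lines can confirm that the ranking approach gives each species its own id. This is a hedged sketch that re-runs the pipeline above; the name mapping is hypothetical and not from the original answer.

```r
library(dplyr)

# Re-run the rank-based assignment from the answer above
df <- iris %>%
  group_by(Species) %>%
  mutate(id = runif(1, 0, 1)) %>%  # one random draw per group, recycled to all its rows
  ungroup() %>%
  mutate(id = dense_rank(id))      # turn the three draws into ranks 1, 2, 3

# Count rows per (Species, id) pair: expect 3 rows, each with n == 50
mapping <- df %>% count(Species, id)
mapping
```

If a species were split across ids, or two species shared an id, the counts would no longer be three rows of 50.
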

Ryan John answered Sep 20 '22 10:09