
Using R, Randomly Assigning Students Into Groups Of 4

Tags:

r

sample

I'm still learning R and have been given the task of sorting a long list of students into groups of four based on another variable. I have loaded the data into R as a data frame. How do I sample entire rows without replacement, one from each of the 4 levels of a variable, and have R output the result to a spreadsheet?

So far I have been tinkering with a for loop and the sample function, but I'm quickly getting in over my head. Any suggestions? Here is a sample of what I'm attempting to do. Given:

Last.Name <- c("Picard","Troi","Riker","La Forge", "Yar", "Crusher", "Crusher", "Data")
First.Name <- c("Jean-Luc", "Deanna", "William", "Geordi", "Tasha", "Beverly", "Wesley", "Data")
Email <- c("[email protected]","[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]", "[email protected]")
Section <- c(1,1,2,2,3,3,4,4)

df <- data.frame(Last.Name,First.Name,Email,Section)

I want to randomly select a Star Trek character from each section and end up with 2 groups of 4. I would want the entire row's worth of information to make it over to a new data frame containing all groups with their corresponding group number.

JForsythe asked Jan 15 '15 00:01



3 Answers

I'd use the wonderful package 'dplyr':

library(dplyr)

random_4 <- df %>% group_by(Section) %>% slice(sample(c(1,2),1))

random_4
Source: local data frame [4 x 4]
Groups: Section

  Last.Name First.Name   Email Section
1      Troi     Deanna [email protected]       1
2  La Forge     Geordi [email protected]       2
3   Crusher    Beverly [email protected]       3
4      Data       Data [email protected]       4

# Re-running the assignment above draws a different set:
random_4
Source: local data frame [4 x 4]
Groups: Section

  Last.Name First.Name   Email Section
1    Picard   Jean-Luc [email protected]       1
2     Riker    William [email protected]       2
3   Crusher    Beverly [email protected]       3
4      Data       Data [email protected]       4

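The %>% operator is a pipe and can be read as 'and then'.

The code reads as:

Take df, and then group it by Section, and then within each section pick one row by position (slice), where the position is randomly 1 or 2. Voila. Note that sample(c(1,2),1) assumes exactly two students per section, and the result is a single group of four.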
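The snippet above draws one group of four. One way to extend the same idea so that every student lands in a group is sketched below. It assumes the data frame from the question and that every section has the same number of students; the column name Group is made up for illustration.

```r
library(dplyr)

# Data from the question (Email column omitted here for brevity)
df <- data.frame(
  Last.Name  = c("Picard", "Troi", "Riker", "La Forge",
                 "Yar", "Crusher", "Crusher", "Data"),
  First.Name = c("Jean-Luc", "Deanna", "William", "Geordi",
                 "Tasha", "Beverly", "Wesley", "Data"),
  Section    = c(1, 1, 2, 2, 3, 3, 4, 4)
)

grouped <- df %>%
  group_by(Section) %>%
  # Within each section, hand out 1..(section size) in random order
  mutate(Group = sample(n())) %>%
  ungroup() %>%
  # Put the members of each group next to each other
  arrange(Group, Section)

grouped
```

Because the group numbers are shuffled independently inside each section, every group ends up with exactly one student per section.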

col. slade answered Sep 18 '22 16:09



I suppose you have 8 students: First.Name <- c("Jean-Luc", "Deanna", "William", "Geordi", "Tasha", "Beverly", "Wesley", "Data").

If you wish to randomly assign a section number to the 8 students, and assuming you would like each section to have 2 students, then you can either permute Section <- c(1, 1, 2, 2, 3, 3, 4, 4) or permute the list of the students.

First approach, permute the sections:

> assigned_section <- print(sample(Section))
[1] 1 4 3 2 2 3 4 1

Then the following data frame gives the assignments:

assigned_students <- data.frame(First.Name, assigned_section)

Second approach, permute the students:

> assigned_students <- print(sample(First.Name))
[1] "Data"     "Geordi"   "Tasha"    "William"  "Deanna"   "Beverly"  "Jean-Luc" "Wesley"  

Then, the following data frame gives the assignments:

assigned_students <- data.frame(assigned_students, Section)
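
Both permutations change on every run. If the assignment needs to be repeatable, a seed can be fixed before sampling. The sketch below does this for the first approach and then checks that each section still receives two students; the seed value 42 is arbitrary.

```r
First.Name <- c("Jean-Luc", "Deanna", "William", "Geordi",
                "Tasha", "Beverly", "Wesley", "Data")
Section <- c(1, 1, 2, 2, 3, 3, 4, 4)

set.seed(42)  # arbitrary seed, only so the same draw can be reproduced
assigned_section <- sample(Section)  # shuffle the section labels
assigned_students <- data.frame(First.Name, assigned_section)

# Count students per section to confirm the assignment is balanced
table(assigned_students$assigned_section)
```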
Alex answered Sep 18 '22 16:09



Alex, thank you. Your answer wasn't exactly what I was looking for, but it inspired the one that worked for me. I had been thinking about the process in a far too complicated way. Instead of having R select rows and put them into a new data frame, I decided to have R assign a random group number to each student within a section and then sort the data frame by that number.

First, I broke up the data frame into sections:

df1 <- subset(df, Section == 1)
df2 <- subset(df, Section == 2)
df3 <- subset(df, Section == 3)
df4 <- subset(df, Section == 4)

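Then I generated the group numbers 1 through 4 in random order. The vector needs one entry per student in the section, so for the two-per-section example above it would be sample(1:2) instead.

Groupnumber <- sample(1:4, 4, replace = FALSE)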

Next, I told R to bind the columns:

Assigned1 <- cbind(df1,Groupnumber)

I then re-ran the group number generator followed by cbind for each remaining section (df2 into Assigned2, and so on), so that each section got its own random ordering of group numbers.

Finally, I row-bound the pieces back together:

Final_List <- rbind(Assigned1, Assigned2, Assigned3, Assigned4)
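
The subset, cbind and rbind steps can also be collapsed into one call, and the result written to a file that a spreadsheet program can open. What follows is a minimal base R sketch, not the code that was actually run; it assumes the data frame from the question, and the Group column name and output path are placeholders.

```r
# Data from the question (Email column omitted here for brevity)
df <- data.frame(
  Last.Name  = c("Picard", "Troi", "Riker", "La Forge",
                 "Yar", "Crusher", "Crusher", "Data"),
  First.Name = c("Jean-Luc", "Deanna", "William", "Geordi",
                 "Tasha", "Beverly", "Wesley", "Data"),
  Section    = c(1, 1, 2, 2, 3, 3, 4, 4)
)

# For each section, hand out the numbers 1..k in random order,
# where k is the number of students in that section
df$Group <- ave(seq_len(nrow(df)), df$Section,
                FUN = function(i) sample(length(i)))

# Sort so that the members of each group sit together
Final_List <- df[order(df$Group, df$Section), ]

# Write a CSV file; placeholder path, point it wherever the file should go
out_file <- file.path(tempdir(), "student_groups.csv")
write.csv(Final_List, out_file, row.names = FALSE)
```

This gives the same kind of result as the manual steps: each section's students receive distinct group numbers, so every group contains one student from each section.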

Thank you to everyone who looked this over. I am new to data science, R, and Stack Overflow, but as I learn more I hope to return the favor.

JForsythe answered Sep 21 '22 16:09



