Use R to Randomly Assign Participants to Treatments on a Daily Basis

The Problem:

I am attempting to use R to generate a random study design in which half of the participants are randomly assigned to "Treatment 1" and the other half to "Treatment 2". Half of the subjects are male and half are female, and I want an equal number of males and females exposed to each treatment. So half of the males and half of the females should be assigned to "Treatment 1", and the remaining half of each to "Treatment 2".

There are two complications to this design: (1) this is a yearlong study and the assignment of participants to treatments must occur on a daily basis; and (2) each participant must be exposed to "Treatment 1" a minimum of 10 times in a 28-day period.

Is it even possible to automate this in R? I assume so, but as a beginner R programmer I have not been able to find the solution on my own. I have been struggling for days to figure out how to do this, and I have looked through many similar-sounding posts on this site that I could not successfully apply here. I am hoping someone out there knows some tricks that could help me get unstuck. Any advice would be greatly appreciated!

What I Have Tried:

Specific Information

# There are 16 participants
p <- c("P01", "P02", "P03", "P04", "P05", "P06", "P07", "P08", "P09", "P10", "P11", "P12", "P13", "P14", "P15", "P16")

# Half are male and half are female
g <- c(rep("M", 8), rep("F", 8))

# I make a dataframe but this may not be necessary
df <- cbind.data.frame(p,g)

# There are 365 days in one year
d <- seq(1,365,1)

...unfortunately, I am not sure how to proceed from here.

Ideal Outcome:

I am envisioning something like this table as the outcome. I do not have enough reputation points to embed images yet, so here is the link, sorry!

Basically there is a column for each participant and a row for each day. Associated with each day is an assignment to either Treatment 1 (T1) or Treatment 2 (T2), with 4 of the 8 males and 4 of the 8 females being assigned to T1 and the remainder to T2. These treatments are reassigned every day for 1 year. Not depicted in this chart is the need for each participant to be exposed to T1 at least 10 times in a 28-day period. The table does not have to look like that if something else makes more sense!

asked May 30 '20 22:05 by Wu Wei



1 Answer

Consider splitting the data frame by day and gender with by(), then drawing 100 random treatment samples per group with replicate() and keeping the first draw in which the two treatments are balanced:

Data

df <- merge(data.frame(participant = p, gender = g), 
            data.frame(days = seq(1,365)), 
            by=NULL)

Solution

df_list <- by(df, list(df$gender, df$days), function(sub){
  t <- replicate(100, {                                        # RUN 100 REPETITIONS OF EXPRESSION
    s <- sample(c("T1", "T2"), size=nrow(sub), replace=TRUE)   # SAMPLE "T1" AND "T2" BY SIZE OF SUBSET
    s[ sum(s == "T1") == sum(s == "T2") ]                      # KEEP DRAW ONLY IF TREATMENTS ARE EQUAL
  }, simplify=FALSE)                                           # ALWAYS RETURN A LIST, NEVER A MATRIX

  t <- Filter(length, t)[[1]]             # SELECT FIRST OF SEVERAL NON-EMPTY RETURNS
  transform(sub, treatment = t)           # ASSIGN RESULT TO NEW COLUMN
})

# BIND DATA FRAMES AND RESET ROW.NAMES
final_df <- data.frame(do.call(rbind.data.frame, df_list), row.names=NULL)
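As an aside, the balanced draw can also be obtained without rejection sampling: shuffle a vector that already holds exactly half "T1" and half "T2". The following is only a minimal sketch of that swapped-in technique, not the method used above. The helper name balanced_draw, the object alt_df, and the seed are hypothetical, and the data are rebuilt from the question so the block runs on its own.

```r
# Rebuild the participant-by-day grid from the question and the Data step
p <- sprintf("P%02d", 1:16)
g <- rep(c("M", "F"), each = 8)
df <- merge(data.frame(participant = p, gender = g, stringsAsFactors = FALSE),
            data.frame(days = seq_len(365)),
            by = NULL)

set.seed(2020)  # hypothetical seed, only so the sketch is reproducible

# Hypothetical helper: return a shuffled vector with exactly half "T1", half "T2"
balanced_draw <- function(x) {
  n <- length(x)
  stopifnot(n %% 2 == 0)                  # group size must be even
  sample(rep(c("T1", "T2"), each = n / 2))
}

alt_df <- df
# ave() applies balanced_draw() separately within every gender-by-day group
alt_df$treatment <- ave(alt_df$participant, alt_df$gender, alt_df$days,
                        FUN = balanced_draw)
```

Every permutation of a half-and-half vector is equally likely, so this should give the same distribution of balanced assignments as keeping the first balanced draw, just with a single sample() call per group.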

Output

Day 1

head(final_df, 16)

#    participant gender days treatment
# 1          P09      F    1        T1
# 2          P10      F    1        T2
# 3          P11      F    1        T2
# 4          P12      F    1        T1
# 5          P13      F    1        T2
# 6          P14      F    1        T2
# 7          P15      F    1        T1
# 8          P16      F    1        T1
# 9          P01      M    1        T1
# 10         P02      M    1        T1
# 11         P03      M    1        T2
# 12         P04      M    1        T2
# 13         P05      M    1        T2
# 14         P06      M    1        T1
# 15         P07      M    1        T1
# 16         P08      M    1        T2

Day 365

tail(final_df, 16)

#      participant gender days treatment
# 5825         P09      F  365        T2
# 5826         P10      F  365        T2
# 5827         P11      F  365        T1
# 5828         P12      F  365        T2
# 5829         P13      F  365        T1
# 5830         P14      F  365        T2
# 5831         P15      F  365        T1
# 5832         P16      F  365        T1
# 5833         P01      M  365        T1
# 5834         P02      M  365        T2
# 5835         P03      M  365        T1
# 5836         P04      M  365        T2
# 5837         P05      M  365        T2
# 5838         P06      M  365        T2
# 5839         P07      M  365        T1
# 5840         P08      M  365        T1
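Note that each daily draw above is independent, so nothing in the solution enforces the question's requirement of at least 10 exposures to T1 in a 28-day period. A rough way to check both the daily balance and the rolling exposure is sketched below. The helper names daily_t1 and min_rolling_t1 are hypothetical, the check assumes the 28-day period is meant as a rolling window, and the small demo data frame is a deterministic stand-in for final_df.

```r
# Hypothetical helper: number of T1 assignments per gender (rows) and day (columns)
daily_t1 <- function(d) {
  with(d, tapply(treatment == "T1", list(gender, days), sum))
}

# Hypothetical helper: for each participant, the smallest number of T1 days
# found in any stretch of `window` consecutive days
min_rolling_t1 <- function(d, window = 28) {
  sapply(split(d, d$participant), function(sub) {
    is_t1 <- sub$treatment[order(sub$days)] == "T1"
    stopifnot(length(is_t1) >= window)
    cs <- cumsum(is_t1)
    # T1 count for days (k - window + 1)..k equals cs[k] - cs[k - window]
    window_sums <- cs[window:length(cs)] - c(0, cs[seq_len(length(cs) - window)])
    min(window_sums)
  })
}

# Stand-in data: 4 participants, 56 days, treatments alternating day by day
demo <- merge(data.frame(participant = c("P01", "P02", "P09", "P10"),
                         gender = c("M", "M", "F", "F"),
                         stringsAsFactors = FALSE),
              data.frame(days = 1:56),
              by = NULL)
odd_person <- demo$participant %in% c("P01", "P09")
demo$treatment <- ifelse((demo$days %% 2 == 1) == odd_person, "T1", "T2")

all(daily_t1(demo) == 1)   # one T1 per gender per day in the stand-in data
min_rolling_t1(demo)       # 14 for every participant in the stand-in data
```

Applied to the real result, min_rolling_t1(final_df) returns one value per participant, and any value below 10 points to at least one 28-day window that misses the requirement.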

Ideally, for analytical purposes you should keep the data in long format (i.e., tidy data). If you need wide format, consider reshape() with a helper column and some cleanup:

# HELPER OBJECTS
final_df$participant_gender <- with(final_df, paste0(participant, gender))
new_names <- paste0(p, g)

# RESHAPE WIDE
wide_df <- reshape(final_df, v.names = "treatment", timevar = "participant_gender", 
                   idvar="days", drop = c("gender", "participant"), 
                   new.row.names = 1:365, direction = "wide")

# RENAME AND RE-ORDER COLUMNS
names(wide_df) <- sub("^treatment\\.", "", names(wide_df))
wide_df <- wide_df[c("days", new_names)]

head(wide_df)
#   days P01M P02M P03M P04M P05M P06M P07M P08M P09F P10F P11F P12F P13F P14F P15F P16F
# 1    1   T1   T1   T2   T2   T2   T1   T1   T2   T1   T2   T2   T1   T2   T2   T1   T1
# 2    2   T1   T1   T2   T1   T2   T1   T2   T2   T1   T2   T2   T1   T2   T2   T1   T1
# 3    3   T1   T1   T2   T1   T1   T2   T2   T2   T1   T2   T2   T2   T1   T2   T1   T1
# 4    4   T1   T1   T1   T2   T2   T2   T1   T2   T2   T1   T1   T2   T2   T1   T1   T2
# 5    5   T1   T1   T2   T1   T2   T2   T1   T2   T1   T1   T2   T1   T2   T2   T1   T2
# 6    6   T2   T1   T1   T1   T2   T2   T1   T2   T2   T2   T2   T1   T2   T1   T1   T1
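The question also asks that each participant receive T1 at least 10 times in a 28-day period, which independent daily draws do not guarantee. One possible way to build that in, sketched here under assumptions and not part of the solution above, is to randomize in two-day pairs: on the first day a random half of each gender gets T1, and on the second day everyone swaps. Every participant then gets exactly one T1 per complete pair, and any 28 consecutive days contain at least 13 complete pairs. The object name design and the seed are hypothetical, and the trade-off is that every second day is fully determined by the day before it.

```r
set.seed(1)  # hypothetical seed, only so the sketch is reproducible

p <- sprintf("P%02d", 1:16)
g <- rep(c("M", "F"), each = 8)
n_days <- 365

# Wide layout: one row per day, one column per participant
design <- matrix(NA_character_, nrow = n_days, ncol = length(p),
                 dimnames = list(NULL, p))

for (day in seq(1, n_days, by = 2)) {
  for (sex in unique(g)) {
    cols <- which(g == sex)
    # First day of the pair: a random half of this gender gets T1
    t1_today <- sample(cols, length(cols) / 2)
    design[day, cols] <- "T2"
    design[day, t1_today] <- "T1"
    # Second day of the pair (if it exists): everyone swaps treatments
    if (day < n_days) {
      design[day + 1, cols] <- "T1"
      design[day + 1, t1_today] <- "T2"
    }
  }
}

head(data.frame(days = seq_len(n_days), design))
```

Whether this loss of day-to-day independence is acceptable depends on the study, so treat it as a starting point to discuss with whoever owns the design.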
answered Oct 06 '22 09:10 by Parfait
