How to fill up rows with first and last day into a sequence of days?

Tags:

dataframe

r

If I have this table:

a <- as.data.frame(matrix(c("1", "1", "1", "1", "2", "2", "A first day", "A last day", "B first day", "B last day", "A first day", "A last day", 3, 5, 10, 14, 2, 5), ncol = 3))
colnames(a) = c("Patient", "Treatment", "Day")

Which looks like this:

Patient Treatment Day
1 A first day 3
1 A last day 5
1 B first day 10
1 B last day 14
2 A first day 2
2 A last day 5

How can I transform this so that each day from the first to the last gets its own row, so that the table looks like this?

Patient Treatment Day
1 A 3
1 A 4
1 A 5
1 B 10
1 B 11
1 B 12
1 B 13
1 B 14
2 A 2
2 A 3
2 A 4
2 A 5

Thank you!

Wandering_geek asked Oct 11 '25 12:10



2 Answers

Each patient/treatment pair has to form one group, so first strip the " first day" and " last day" suffixes from the Treatment column and convert Day to integer. Then group_by the Patient and Treatment columns and use reframe (available in dplyr 1.1.0 and later) to generate the sequence of days from the min to the max of Day in each group.

library(tidyverse)

a %>% 
  mutate(Treatment = sub(" first day| last day", "", Treatment),
         Day = as.integer(Day)) %>% 
  group_by(Patient, Treatment) %>% 
  reframe(Day = min(Day):max(Day))

# A tibble: 12 × 3
   Patient Treatment   Day
   <chr>   <chr>     <int>
 1 1       A             3
 2 1       A             4
 3 1       A             5
 4 1       B            10
 5 1       B            11
 6 1       B            12
 7 1       B            13
 8 1       B            14
 9 2       A             2
10 2       A             3
11 2       A             4
12 2       A             5
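
One detail worth spelling out: the as.integer() call matters because the matrix() call in the question leaves Day as character, and character values compare as text rather than as numbers. A small base R sketch of the difference (the day_chr values are made up for illustration, they are not from the question):

```r
# Hypothetical day values stored as text, the way matrix() would leave them.
day_chr <- c("2", "10")

# Text comparison goes character by character, so "10" sorts before "2".
min_chr <- min(day_chr)
max_chr <- max(day_chr)

# After conversion the comparison is numeric and the range is the expected one.
day_int <- as.integer(day_chr)
day_seq <- min(day_int):max(day_int)

print(min_chr)
print(max_chr)
print(day_seq)
```

With the values in the question the text comparison happens to give the same result as the numeric one, but converting first avoids surprises when the two days have a different number of digits.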
benson23 answered Oct 14 '25 07:10



First use sub to get rid of the extra text in Treatment. Then use by to split the data frame by the ID variables, which are columns 1:2 (Patient and Treatment) in your case. Within each group, cbind the first row of the data frame, minus the Day variable (-3), to the day sequence, which `:` produces inside a do.call. Finally, rbind everything. I added some extra columns to demonstrate that they are preserved.

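# Keep only the treatment label, dropping " first day" / " last day".
a$Treatment <- sub('\\s.*', '', a$Treatment)
# Per Patient/Treatment group: first row without Day, bound to the day sequence.
# Convert to integer before sorting so that character days are ordered numerically.
by(a, a[1:2], \(x) cbind(x[1, -3], Day=do.call(`:`, as.list(sort(as.integer(x[[3]])))), row.names=NULL)) |> 
  do.call(what=rbind)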
#    Patient Treatment Sex   X Day
# 1        1         A   m 0.5   3
# 2        1         A   m 0.5   4
# 3        1         A   m 0.5   5
# 4        2         A   f 0.4   2
# 5        2         A   f 0.4   3
# 6        2         A   f 0.4   4
# 7        2         A   f 0.4   5
# 8        1         B   f 0.3  10
# 9        1         B   f 0.3  11
# 10       1         B   f 0.3  12
# 11       1         B   f 0.3  13
# 12       1         B   f 0.3  14

Data:

a <- structure(list(Patient = c(1L, 1L, 1L, 1L, 2L, 2L), Treatment = c("A first day", 
"A last day", "B first day", "B last day", "A first day", "A last day"
), Day = c(3L, 5L, 10L, 14L, 2L, 5L), Sex = c("m", "m", "f", 
"f", "f", "f"), X = c(0.5, 0.5, 0.3, 0.3, 0.4, 0.4)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5", "6"))
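
To see what the do.call(`:`, ...) step does on its own, here is a minimal sketch for a single group. The days vector is made up for illustration and stands in for the Day values of one Patient/Treatment group:

```r
# Hypothetical Day values of one group, deliberately out of order.
days <- c(14L, 10L)

# sort() gives c(10L, 14L) and as.list() turns that into list(10L, 14L),
# so do.call() evaluates `:`(10L, 14L), which is the same as 10:14.
expanded <- do.call(`:`, as.list(sort(days)))

print(expanded)
```

Sorting first means the order of the "first day" and "last day" rows in the input does not matter, as long as each group has exactly two rows.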
jay.sf answered Oct 14 '25 07:10



