Count the number of transitions

Question

I have a data set like the one below. Each patient has 3 visits and can transition between 3 states from visit to visit.

ID <- c(1,1,1,2,2,2,3,3,3)
Visit <- c(1,2,3,1,2,3,1,2,3)
State <- c(2,1,1,3,2,1,2,3,1)

I want to make a data frame that counts the number of transitions between states from visit 1 to visit 2. For Visit 1 to Visit 2, the rows of the matrix should represent the state at visit 1 and the columns the state at visit 2, with entries on the diagonal counting participants who did not transition.

Zé Loff · Accepted Answer

Although there is no harm in using other packages, this can easily be done using only table in base R (plus a minor step if the data is incomplete).

Preliminary steps

You probably have your data in a data.frame, so we'll build one from your sample data. I'll also make slight adjustments to the variables (IDs as letters, visits as "V1", "V2", etc.), for readability.

ddff <- data.frame(
  ID = rep(c("A", "B", "C"), each = 3),
  Visit = rep(c("V1", "V2", "V3"), 3),
  State = paste0("S", c(2, 1, 1, 3, 2, 1, 2, 3, 1)))

Scenario 1: complete dataset

If the dataset is complete, or if the missing values are explicit (i.e. if there is an explicit entry for each visit of each patient, even if the State is an NA), then a simple table is sufficient. We just need to turn State into a factor first, to make sure states with zero counts aren't dropped, and we need to order the data.frame by ID and Visit, because table pairs the two vectors by position:

ddff$State <- factor(ddff$State)
ddff <- ddff[order(ddff$ID, ddff$Visit), ]

table(ddff$State[ddff$Visit == "V1"],
      ddff$State[ddff$Visit == "V2"],
      dnn = c("V1", "V2"))

    V2
V1   S1 S2 S3
  S1  0  0  0
  S2  1  0  1
  S3  0  1  0
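To see why the conversion to a factor matters, here is a small sketch comparing the table built from plain character vectors with the one built from factors. The object names ddff_chr, chr, st and fct are only illustrative. With characters, table only knows about the values actually present in each argument, so the S1 row disappears for Visit 1.

```r
# Same sample data as above, but keeping State as a character vector
ddff_chr <- data.frame(
  ID = rep(c("A", "B", "C"), each = 3),
  Visit = rep(c("V1", "V2", "V3"), 3),
  State = paste0("S", c(2, 1, 1, 3, 2, 1, 2, 3, 1)),
  stringsAsFactors = FALSE)

# Character version: categories are inferred separately for each argument
chr <- table(ddff_chr$State[ddff_chr$Visit == "V1"],
             ddff_chr$State[ddff_chr$Visit == "V2"],
             dnn = c("V1", "V2"))

# Factor version: both arguments share the full set of levels
st <- factor(ddff_chr$State)
fct <- table(st[ddff_chr$Visit == "V1"],
             st[ddff_chr$Visit == "V2"],
             dnn = c("V1", "V2"))

dim(chr)  # 2 rows (S2, S3) by 3 columns
dim(fct)  # 3 by 3, including an all-zero S1 row
```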

There will be non-zero values on the diagonal if any patients don't change state. E.g. for Visit 2 to Visit 3:

table(ddff$State[ddff$Visit == "V2"],
      ddff$State[ddff$Visit == "V3"],
      dnn = c("V2", "V3"))

    V3
V2   S1 S2 S3
  S1  1  0  0
  S2  1  0  0
  S3  1  0  0
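The table calls in this scenario pair the two State vectors by position, which is why the data.frame has to be sorted first. As an alternative sketch (not the approach used in this answer), the pairing can be made explicit by merging the two visits on ID. The function name transition_table and its arguments are hypothetical.

```r
# Hypothetical helper: cross-tabulate State at visit `from` against visit `to`,
# matching patients by ID rather than by row position.
transition_table <- function(df, from, to) {
  df$State <- factor(df$State)  # keep all states, even those with zero counts
  a <- df[df$Visit == from, c("ID", "State")]
  b <- df[df$Visit == to, c("ID", "State")]
  # Inner join: patients lacking either visit are left out
  m <- merge(a, b, by = "ID", suffixes = c(".from", ".to"))
  table(m$State.from, m$State.to, dnn = c(from, to))
}

# Sample data in a deliberately shuffled row order
ddff <- data.frame(
  ID = rep(c("A", "B", "C"), each = 3),
  Visit = rep(c("V1", "V2", "V3"), 3),
  State = paste0("S", c(2, 1, 1, 3, 2, 1, 2, 3, 1)))
shuffled <- ddff[c(9, 2, 7, 4, 1, 6, 3, 8, 5), ]

transition_table(shuffled, "V1", "V2")
```

Because the join is on ID, the result does not depend on how the rows happen to be ordered, at the cost of an extra merge per pair of visits.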

But if you really don't want them, you can easily assign zeros to the diagonal:

tt <- table(ddff$State[ddff$Visit == "V2"],
            ddff$State[ddff$Visit == "V3"],
            dnn = c("V2", "V3"))
diag(tt) <- 0
tt

    V3
V2   S1 S2 S3
  S1  0  0  0
  S2  1  0  0
  S3  1  0  0
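Since the question asks for a data frame, the table can be converted once it has been built. A brief sketch: as.data.frame gives one row per pair of states (long format), while as.data.frame.matrix keeps the square layout. The object names long and wide are only illustrative.

```r
ddff <- data.frame(
  ID = rep(c("A", "B", "C"), each = 3),
  Visit = rep(c("V1", "V2", "V3"), 3),
  State = factor(paste0("S", c(2, 1, 1, 3, 2, 1, 2, 3, 1))))

tt <- table(ddff$State[ddff$Visit == "V1"],
            ddff$State[ddff$Visit == "V2"],
            dnn = c("V1", "V2"))

# Long format: columns V1, V2 and Freq, one row per combination of states
long <- as.data.frame(tt)

# Wide format: row names are the V1 states, columns are the V2 states
wide <- as.data.frame.matrix(tt)
```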

Scenario 2: implicit missing data

If there are missing values in the dataset, i.e. if there is not a row for each visit of each patient, the same approach can be used, but we need to fill in the missing data points by joining the data.frame with a combination of all possible IDs and visits.

First we'll drop V2 for patient B, to create an incomplete data.frame:

ddff2 <- ddff[-5, ]
ddff2

  ID Visit State
1  A    V1    S2
2  A    V2    S1
3  A    V3    S1
4  B    V1    S3
6  B    V3    S1
7  C    V1    S2
8  C    V2    S3
9  C    V3    S1

Then we use expand.grid to create a data.frame with all possible combinations of ID and Visit, and then use merge to cross it with our data set. This will turn the implicit missing values into explicit missing values:

ddff2 <- merge(
  ddff2,
  expand.grid(ID = unique(ddff2$ID), Visit = unique(ddff2$Visit)),
  all.y = TRUE)
ddff2

  ID Visit State
1  A    V1    S2
2  A    V2    S1
3  A    V3    S1
4  B    V1    S3
5  B    V2  <NA>
6  B    V3    S1
7  C    V1    S2
8  C    V2    S3
9  C    V3    S1

We can now use the same approach as earlier:

table(ddff2$State[ddff2$Visit == "V1"],
      ddff2$State[ddff2$Visit == "V2"],
      dnn = c("V1", "V2"))

    V2
V1   S1 S2 S3
  S1  0  0  0
  S2  1  0  1
  S3  0  0  0
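Note that table leaves out pairs in which either state is NA, which is why patient B no longer contributes to the counts. If those cases should be counted as well, one option is the useNA argument of table. A minimal sketch, rebuilding the merged data by hand so the block runs on its own (the name tt_na is only illustrative):

```r
# Data as it looks after the merge step: patient B's V2 state is an explicit NA
ddff2 <- data.frame(
  ID = rep(c("A", "B", "C"), each = 3),
  Visit = rep(c("V1", "V2", "V3"), 3),
  State = factor(c("S2", "S1", "S1", "S3", NA, "S1", "S2", "S3", "S1")))

# useNA = "ifany" adds an <NA> category where missing values occur
tt_na <- table(ddff2$State[ddff2$Visit == "V1"],
               ddff2$State[ddff2$Visit == "V2"],
               dnn = c("V1", "V2"),
               useNA = "ifany")
tt_na
```

Patient B then shows up in the S3 row under the NA column instead of being dropped.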

Tags: r

Jenn0804
