I am trying to renumber groups of people.
In the data, 'FamID' identifies a family and 'PtID' identifies an individual patient within that family. The 'Twin' column indicates whether the patients are identical twins (coded as 1), non-identical twins (coded as 2) or not twins (coded as 0).
FamID PtID Twin
F1 F11 1
F1 F12 1
F2 F21 2
F2 F22 2
F3 F31 1
F3 F32 1
F4 F41 2
F5 F51 1
F5 F52 1
F5 F53 0
F6 F61 1
F6 F62 1
F7 F71 2
F7 F72 2
So for example, 'FamID' F1 has two family members, PtID F11 and F12, who are identical twins (Twin = 1).
I want to create a column (NewCol) that has a coding based on the Twin column and the FamID column.
The first set of identical twins (Twin = 1) would get 1 in the new column, the second set of identical twins (from a different family) would get 3, the next set would get the next odd number, and so on.
The non-identical twins (Twin = 2) would be numbered in even numbers, with the first family of non-identical twins getting 2, the next 4, and so on.
Any non-twins (Twin = 0) would remain 0.
Desired output:
FamID PtID Twin NewCol
F1 F11 1 1
F1 F12 1 1
F2 F21 2 2
F2 F22 2 2
F3 F31 1 3
F3 F32 1 3
F4 F41 2 4
F5 F51 1 5
F5 F52 1 5
F5 F53 0 0
F6 F61 1 7
F6 F62 1 7
F7 F71 2 6
F7 F72 2 6
Data
FamID <- c(rep("F1", 2), rep("F2", 2), rep("F3", 2), "F4", rep("F5", 3), rep("F6", 2), rep("F7", 2))
PtID <- c("F11", "F12", "F21", "F22", "F31", "F32", "F41", "F51", "F52", "F53", "F61", "F62", "F71", "F72")
Twin <- c(1, 1, 2, 2, 1, 1, 2, 1, 1, 0, 1, 1, 2, 2)
sample <- data.frame(FamID, PtID, Twin)
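Here's a solution using the data.table package:
library(data.table)
dt <- data.table(sample)
# Non-twins keep 0
dt[Twin == 0, NewCol := 0L]
# The filter in i is applied before grouping, so .GRP counts only families of that twin type
dt[Twin == 1, NewCol := .GRP * 2L - 1L, by = FamID]  # odd numbers: 1, 3, 5, ...
dt[Twin == 2, NewCol := .GRP * 2L, by = FamID]       # even numbers: 2, 4, 6, ...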
The result is
# FamID PtID Twin NewCol
# 1: F1 F11 1 1
# 2: F1 F12 1 1
# 3: F2 F21 2 2
# 4: F2 F22 2 2
# 5: F3 F31 1 3
# 6: F3 F32 1 3
# 7: F4 F41 2 4
# 8: F5 F51 1 5
# 9: F5 F52 1 5
# 10: F5 F53 0 0
# 11: F6 F61 1 7
# 12: F6 F62 1 7
# 13: F7 F71 2 6
# 14: F7 F72 2 6
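To see why this gives consecutive numbers per twin type, it may help to look at `.GRP` on its own. The sketch below rebuilds the sample data (without PtID) and prints the group counter for each twin type; the names `grp_identical` and `grp_fraternal` are only illustrative and not part of the answer above.

```r
library(data.table)

# Same FamID/Twin values as the sample data in the question
dt <- data.table(
  FamID = c("F1", "F1", "F2", "F2", "F3", "F3", "F4",
            "F5", "F5", "F5", "F6", "F6", "F7", "F7"),
  Twin  = c(1, 1, 2, 2, 1, 1, 2, 1, 1, 0, 1, 1, 2, 2)
)

# Rows are filtered by i first, then grouped by FamID in order of first
# appearance; .GRP is the running index of the current group.
grp_identical <- dt[Twin == 1, .(grp = .GRP), by = FamID]  # illustrative name
grp_fraternal <- dt[Twin == 2, .(grp = .GRP), by = FamID]  # illustrative name

print(grp_identical)
print(grp_fraternal)
```

Because `by` (unlike `keyby`) keeps groups in order of first appearance, the numbering follows the row order of the data.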
Data.tables have several benefits (concise syntax, efficiency in many operations), and because a data.table is also a data.frame, it works with most functions that expect one. If you still need a plain data.frame, you can convert back using
df <- as.data.frame(dt)
Using factors and data.table:
library(data.table)
DT.Sample <- data.table(sample)
DT.Sample[ , NewCol := 0]
DT.Sample[Twin==1 , NewCol:= 2*as.numeric(factor(FamID))-1]
DT.Sample[Twin==2 , NewCol:= 2*as.numeric(factor(FamID))]
FamID PtID Twin NewCol
1: F1 F11 1 1
2: F1 F12 1 1
3: F2 F21 2 2
4: F2 F22 2 2
5: F3 F31 1 3
6: F3 F32 1 3
7: F4 F41 2 4
8: F5 F51 1 5
9: F5 F52 1 5
10: F5 F53 0 0
11: F6 F61 1 7
12: F6 F62 1 7
13: F7 F71 2 6
14: F7 F72 2 6
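One caveat: `factor()` sorts its levels, so this numbering follows the sorted FamID values rather than the order in which families appear (for character IDs, "F10" sorts before "F2"). That makes no difference for the sample data, which is already sorted. If order of first appearance is what you want, a possible variant replaces the factor with `match(FamID, unique(FamID))`. This is a minimal sketch: the helper `number_twins` and the `demo` data are hypothetical and only there to show the difference.

```r
library(data.table)

# Hypothetical helper: number twin families in order of first appearance
number_twins <- function(dt) {
  dt <- copy(dt)  # work on a copy so the caller's table is not modified by reference
  dt[, NewCol := 0]
  # Within each filtered subset, match() gives the position of each FamID
  # among the unique FamIDs of that subset, in order of appearance
  dt[Twin == 1, NewCol := 2 * match(FamID, unique(FamID)) - 1]
  dt[Twin == 2, NewCol := 2 * match(FamID, unique(FamID))]
  dt[]
}

# Made-up data where sorted order differs from order of appearance
demo <- data.table(
  FamID = c("F2", "F2", "F10", "F10", "F1", "F1"),
  Twin  = c(1, 1, 1, 1, 2, 2)
)
res <- number_twins(demo)
print(res)
```

With the factor-based version, family F10 would be numbered before F2 here, because "F10" comes first when the levels are sorted.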