Suppose I have dataset which looks like
ID Name
1 JAY
1
1 JAY
2 LAY
2 LAY
2
3 NA
3 KAY
3
I want to fill the rows with missing values(empty or with NA) based on the observation already available in the group. So the resultant data frame will look like
ID Name
1 JAY
1 JAY
1 JAY
2 LAY
2 LAY
2 LAY
3 KAY
3 KAY
3 KAY
I tried using na.locf but it didnt work for non numeric value.
DF1 = setDT(DF)[, N := na.locf(na.locf(Name(NA_real_^!Name),na.rm=FALSE), fromLast=TRUE, na.rm=FALSE), ID][is.na(N), N := 0]
One option is after grouping by 'ID', subset the 'Name' that are not NA and not a blank (nzchar(Name)), get the last observation (tail(...)) and assign (:=) it to 'Name'.
setDT(DF)[, Name := tail(Name[!is.na(Name) & nzchar(Name)], 1), by = ID]
DF
# ID Name
#1: 1 JAY
#2: 1 JAY
#3: 1 JAY
#4: 2 LAY
#5: 2 LAY
#6: 2 LAY
#7: 3 KAY
#8: 3 KAY
#9: 3 KAY
If the 'Name' column is factor change the nzchar(Name) to nzchar(as.character(Name))
Or specify the logical vector in the 'i' and assign (:=) the last observation (Name[.N]) to 'Name' after grouping by 'ID'
setDT(DF)[!is.na(Name) & nzchar(Name), Name := Name[.N], ID]
Note: For the second solution to work, the 'Name' should be character class.
Solution in base R (using split and do.call(bind, ...). Assume d contains your dataframe:
tmp <- lapply(split(d, d$ID), function(x) {
# Explanation:
# decreasing = TRUE so that empty strings are at the end
# na.last = NA so that NA's are omitted
x$Name <- sort(x$Name, decreasing = TRUE, na.last = NA)[1];
return(x);
})
d.new <- do.call(rbind, tmp);
print(d.new);
ID Name
1.1 1 JAY
1.2 1 JAY
1.3 1 JAY
2.4 2 LAY
2.5 2 LAY
2.6 2 LAY
3.7 3 KAY
3.8 3 KAY
3.9 3 KAY
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With