Suppose I have this data frame:
  times vals
1     1    2
2     3    4
3     7    6
set up with
foo <- data.frame(times=c(1,3,7), vals=c(2,4,6))
and I want this one:
  times vals
1     1    2
2     2    2
3     3    4
4     4    4
5     5    4
6     6    4
7     7    6
That is, I want to fill in all the times from 1 to 7, and fill in the vals from the latest time that is not greater than the given time.
I have some code that does it using dplyr, but it is ugly. Any suggestions for something better?
library(dplyr)
foo <- merge(foo, data.frame(times=1:max(foo$times)), all.y=TRUE)
foo2 <- merge(foo, foo, by=c(), suffixes=c('', '.1'))
foo2 <- foo2 %>% filter(is.na(vals) & !is.na(vals.1) & times.1 <= times) %>%
    group_by(times) %>% arrange(-times.1) %>% mutate(rn = row_number()) %>%
    filter(rn == 1) %>%
    mutate(vals = vals.1,
           rn = NULL,
           vals.1 = NULL,
           times.1 = NULL)
foo <- merge(foo, foo2, by=c('times'), all.x=TRUE, suffixes=c('', '.2'))
foo <- mutate(foo,
              vals = ifelse(is.na(vals), vals.2, vals),
              vals.2 = NULL)
How to fill NA values with previous values in an R data frame column? The na.locf function of the zoo package carries the last non-NA value forward by default; with fromLast = TRUE it fills each NA from the next non-NA value instead.
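A minimal sketch of the na.locf approach applied to the data in the question, assuming the zoo package is installed. The name full is made up for illustration, and na.rm = FALSE is passed so that the column keeps its length even if it were to start with an NA:

```r
library(zoo)

foo <- data.frame(times = c(1, 3, 7), vals = c(2, 4, 6))

# Expand to every time from 1 to max(times); the new rows get NA in vals
full <- merge(foo, data.frame(times = 1:max(foo$times)), all.y = TRUE)

# Carry the last observed value forward over the NAs
full$vals <- na.locf(full$vals, na.rm = FALSE)

print(full)
```

This relies on merge returning the rows sorted by times, which is its default behaviour (sort = TRUE).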
This is a standard rolling join problem:
library(data.table)
setDT(foo)[.(1:7), on = 'times', roll = TRUE]
#    times vals
# 1:     1    2
# 2:     2    2
# 3:     3    4
# 4:     4    4
# 5:     5    4
# 6:     6    4
# 7:     7    6
The above works in the development version (1.9.7+), which is smarter about matching columns during joins. In 1.9.6 you still need to name the column in the inner table:
setDT(foo)[.(times = 1:7), on = 'times', roll = TRUE]
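For comparison, the same "latest time not greater than the given time" lookup can be sketched in base R with findInterval. This is an illustrative alternative rather than part of the data.table solution; it assumes foo$times is sorted in increasing order, and the names all_times, idx and filled are made up for the example:

```r
foo <- data.frame(times = c(1, 3, 7), vals = c(2, 4, 6))

# Every time from 1 up to the largest observed time
all_times <- seq_len(max(foo$times))

# For each requested time, the index of the last row of foo with times <= it
# (findInterval requires foo$times to be sorted in increasing order)
idx <- findInterval(all_times, foo$times)

# A time earlier than the first observation has index 0; make it NA
idx[idx == 0] <- NA

filled <- data.frame(times = all_times, vals = foo$vals[idx])
print(filled)
```

Like the rolling join, this does a single sorted lookup instead of a cross join followed by filtering.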