I created a data frame with random values
n <- 50
df <- data.frame(id = seq (1:n),
age = sample(c(20:90), n, rep = TRUE),
sex = sample(c("m", "f"), n, rep = TRUE, prob = c(0.55, 0.45))
)
and would like to introduce a few NA
values to simulate real world data. I am trying to use apply
but cannot get there. The line
apply(subset(df,select=-id), 2, function(x) {x[sample(c(1:n),floor(n/10))]})
will retrieve random values alright, but
apply(subset(df,select=-id), 2, function(x) {x[sample(c(1:n),floor(n/10))]<-NA})
will not set them to NA
. Have tried with
and within
, too.
Brute force works:
for (i in (1:floor(n/10))) {
df[sample(c(1:n), 1), sample(c(2:ncol(df)), 1)] <- NA
}
But I'd prefer to use the apply
family.
Return x
within your function:
> df <- apply (df, 2, function(x) {x[sample( c(1:n), floor(n/10))] <- NA; x} )
> tail(df)
id age sex
[45,] "45" "41" NA
[46,] "46" NA "f"
[47,] "47" "38" "f"
[48,] "48" "32" "f"
[49,] "49" "53" NA
[50,] "50" "74" "f"
Using dplyr
1 you could arrive at the desired solution using the following, compact, syntax:
set.seed(123)
library("tidyverse")
n <- 50
df <- data.frame(
id = seq (1:n),
age = sample(c(20:90), n, replace = TRUE),
sex = sample(c("m", "f"), n, replace = TRUE, prob = c(0.55, 0.45))
)
mutate(.data = as_tibble(df),
across(
.cols = all_of(c("age", "sex")),
.fns = ~ ifelse(row_number(.x) %in% sample(1:n(), size = (10 * n(
) / 100)), NA, .x)
))
Approximatly 10% of values is replaced with NA per column. This follows from sample(1:n(), size = (10 * n() / 100))
count(.Last.value, sex)
# A tibble: 3 x 2
# sex n
# <chr> <int>
# 1 f 21
# 2 m 24
# 3 NA 5
# A tibble: 50 x 3
# id age sex
# <int> <int> <chr>
# 1 1 50 m
# 2 2 70 m
1 I'm loading tidyverse
as replace_na
is available via tidyr
.
Apply returns an array, thereby converting all columns to the same type. You could use this instead:
df[,-1] <- do.call(cbind.data.frame,
lapply(df[,-1], function(x) {
x[sample(c(1:n),floor(n/10))]<-NA
x
})
)
Or use a for
loop:
for (i in seq_along(df[,-1])+1) {
is.na(df[sample(seq_len(n), floor(n/10)),i]) <- TRUE
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With