In R, almost every is.* function I can think of has a corresponding as.*. There is an is.na but no as.na. Why not, and how would you implement one if such a function makes sense?
I have a vector x that can be logical, character, integer, numeric or complex, and I want to convert it to a vector of the same class and length, but filled with the appropriate missing value: NA, NA_character_, NA_integer_, NA_real_, or NA_complex_.
My current version:
as.na <- function(x) {x[] <- NA; x}
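As a quick sanity check of that version, here is a minimal sketch that runs it over one small vector of each type mentioned above. The names inputs and results are only illustrative, and the sample values are made up for the test.

```r
as.na <- function(x) {x[] <- NA; x}

# One small example vector per type mentioned in the question
inputs <- list(
  logical   = c(TRUE, FALSE),
  character = c("a", "b"),
  integer   = 1:2,
  numeric   = c(1.5, 2.5),
  complex   = c(1+2i, 3+4i)
)

results <- lapply(inputs, as.na)

# The storage type of each result should match the type of its input
vapply(results, typeof, character(1))

# Every element of every result should be missing
vapply(results, function(r) all(is.na(r)), logical(1))
```

Because x[] <- NA assigns into the existing vector, the logical NA is coerced to the type of x rather than the other way round.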
Why not use is.na<- as directed in ?is.na?
> l <- list(integer(10), numeric(10), character(10), logical(10), complex(10))
> str(lapply(l, function(x) {is.na(x) <- seq_along(x); x}))
List of 5
$ : int [1:10] NA NA NA NA NA NA NA NA NA NA
$ : num [1:10] NA NA NA NA NA NA NA NA NA NA
$ : chr [1:10] NA NA NA NA ...
$ : logi [1:10] NA NA NA NA NA NA ...
$ : cplx [1:10] NA NA NA ...
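For a single vector, the same idea can be sketched as below. The named vector x is just an illustrative input; the point is that the replacement form of is.na marks the given positions as missing while leaving the type and the names alone.

```r
x <- c(a = 1.5, b = 2.5, c = 3.5)

# Mark every position of x as missing
is.na(x) <- seq_along(x)

x          # still a named double vector, now all NA
typeof(x)
names(x)
```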
This seems to be consistently faster than your function:
as.na <- function(x) {
    rep(c(x[0], NA), length(x))
}
(Thanks to Joshua Ulrich for pointing out that my earlier version didn't preserve class attributes.)
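To see why this works, it may help to break the one-liner into its steps. This is only an illustrative walk-through on a character vector; empty and one_na are made-up names.

```r
x <- c("a", "b", "c")

# Step 1: a zero-length slice keeps the type of x
empty <- x[0]            # character(0)

# Step 2: combining with NA coerces the logical NA to that type
one_na <- c(empty, NA)   # a single NA_character_

# Step 3: repeat it once per element of x
rep(one_na, length(x))
```

The same three steps apply to the other atomic types, since c() always promotes the plain logical NA to the type of the zero-length slice.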
Here, for the record, are some relative timings:
library(rbenchmark)
## The functions
flodel <- function(x) {x[] <- NA; x}
joshU <- function(x) {is.na(x) <- seq_along(x); x}
joshO <- function(x) rep(c(x[0], NA), length(x))
## Some vectors to test them on
int <- 1:1e6
char <- rep(letters[1:10], 1e5)
bool <- rep(c(TRUE, FALSE), 5e5)
benchmark(replications=100, order="relative",
    flodel_bool = flodel(bool),
    flodel_int  = flodel(int),
    flodel_char = flodel(char),
    joshU_bool  = joshU(bool),
    joshU_int   = joshU(int),
    joshU_char  = joshU(char),
    joshO_bool  = joshO(bool),
    joshO_int   = joshO(int),
    joshO_char  = joshO(char))[1:6]
#          test replications elapsed relative user.self sys.self
# 7  joshO_bool          100    0.46    1.000      0.33     0.14
# 8   joshO_int          100    0.49    1.065      0.31     0.18
# 9  joshO_char          100    1.13    2.457      0.97     0.16
# 1 flodel_bool          100    2.31    5.022      2.01     0.30
# 2  flodel_int          100    2.31    5.022      2.00     0.31
# 3 flodel_char          100    2.64    5.739      2.36     0.28
# 4  joshU_bool          100    3.78    8.217      3.13     0.66
# 5   joshU_int          100    3.95    8.587      3.30     0.64
# 6  joshU_char          100    4.22    9.174      3.70     0.51
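If rbenchmark is not installed, a rough comparison can be sketched with base R's system.time() instead. This is not the method used for the table above: the helper time_it, the smaller vector, and the replication count of 20 are all assumptions chosen to keep the run short, and the numbers will vary by machine and R version.

```r
# The three candidates from this thread
flodel <- function(x) {x[] <- NA; x}
joshU  <- function(x) {is.na(x) <- seq_along(x); x}
joshO  <- function(x) rep(c(x[0], NA), length(x))

x <- 1:1e5  # smaller than the vectors above, to keep the check quick

# All three should agree on a plain integer vector
stopifnot(identical(flodel(x), joshU(x)), identical(flodel(x), joshO(x)))

# Elapsed seconds for 20 calls of a function (time_it is an illustrative helper)
time_it <- function(f) system.time(for (i in 1:20) f(x))[["elapsed"]]

timings <- sapply(list(flodel = flodel, joshU = joshU, joshO = joshO), time_it)
sort(timings)
```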