How to get switch() to handle NA?

Tags: r, sapply

Okay, I have to recode a data frame (df) because I want its factors as integers:

library(dplyr)

load(url('http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/crash2.rda'))

df <- crash2 %>% select(source, sex)

df$source <- sapply(df$source, switch, "telephone" = 1, "telephone entered manually" = 2, "electronic CRF by email" = 3, "paper CRF enteredd in electronic CRF" = 4, "electronic CRF" = 5, NA)

This works as intended, but there are NAs in the next variable (sex) and things get complicated:

df$sex <- sapply(df$sex, switch, "male" = 1, "female" = 2, NA)

This returns a list with the NAs switched to oblivion. Calling unlist() on it returns a vector that is too short for the df:

length(unlist(sapply(df$sex, switch, "male" = 1, "female" = 2, NA)))

The length should be 20207, but it is 20206.

What I want is a vector matching the df by returning NAs as NAs.

Besides a working solution, I would be extra thankful for an explanation of where I went wrong and how the code actually works.

Edit: Thank you for all your answers. As is so often the case, there is an even more efficient solution that I should have noticed myself (well, I did notice it myself, but obviously too late):

>str(df$sex)
Factor w/ 2 levels "male","female": 1 2 1 1 2 1 1 1 1 1 ...

So I can just use as.numeric() to get what I want.
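
As a minimal sketch of that shortcut, the snippet below uses a small hypothetical factor (`sex` here is a stand-in, not the crash2 column) to show that as.numeric() returns the integer code of each level and keeps missing values as NA. Note that the codes follow the order of levels(), so this only gives the intended numbers when the levels are already in the desired order.

```r
# Hypothetical stand-in for df$sex: a factor with one missing value
sex <- factor(c("male", "female", NA, "male"), levels = c("male", "female"))

# The position of each level in levels() is its integer code
levels(sex)

# as.numeric() on a factor returns those codes; NA stays NA,
# so the result has the same length as the input
codes <- as.numeric(sex)
codes
length(codes) == length(sex)
```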

Markus asked Jan 25 '23 17:01


1 Answer

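You may add an alternative named `NA` (in backticks). For a character vector, switch() matches missing values to it, so they are returned as NA: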

x
# [1] "a" "e" "a" "a" NA  "d" "b" "b" NA  "d"
unname(sapply(x, switch, "a"=1, "b"=2, "c"=3, "d"=4, "e"=5, `NA`=NA))
# [1]  1  5  1  1 NA  4  2  2 NA  4

Data:

x <- c("a", "e", "a", "a", NA, "d", "b", "b", NA, "d")
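
The example above uses a character vector, whereas df$sex in the question is a factor. A rough sketch of why that matters, using a small hypothetical factor in place of the crash2 column: switch() treats a factor as its integer code and picks an alternative by position, and an NA code selects nothing, so switch() returns NULL, sapply() falls back to a list, and unlist() drops the NULL. Converting with as.character() first lets the `NA` alternative match by name.

```r
# Hypothetical stand-in for df$sex: a factor with one missing value
sex <- factor(c("male", "female", NA, "male"), levels = c("male", "female"))

# A numeric EXPR selects an alternative by position and ignores the names;
# an NA code selects nothing, so switch() returns NULL (invisibly)
na_result <- switch(NA_integer_, "male" = 1, "female" = 2, NA)
is.null(na_result)

# Converting to character first lets the `NA` alternative match by name,
# so the result is a vector as long as the input
recoded <- unname(sapply(as.character(sex), switch,
                         "male" = 1, "female" = 2, `NA` = NA))
recoded
length(recoded) == length(sex)
```

Applied to the question's data, this would presumably be sapply(as.character(df$sex), ...) with the same alternatives.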
jay.sf answered Jan 30 '23 02:01