Okay, I have to recode a df, because I want factors as integers:
library(dplyr)
load(url('http://biostat.mc.vanderbilt.edu/wiki/pub/Main/DataSets/crash2.rda'))
df <- crash2 %>% select(source, sex)
df$source <- sapply(df$source, switch, "telephone" = 1, "telephone entered manually" = 2, "electronic CRF by email" = 3, "paper CRF enteredd in electronic CRF" = 4, "electronic CRF" = 5, NA)
This works as intended, but there are NAs in the next variable (sex) and things get complicated:
df$sex <- sapply(df$sex, switch, "male" = 1, "female" = 2, NA)
returns a list with NAs switched to oblivion. Using unlist() on it returns a vector that is too short for the df:
length(unlist(sapply(df$sex, switch, "male" = 1, "female" = 2, NA)))
should be 20207, but is 20206.
What I want is a vector matching the df by returning NAs as NAs.
Besides a working solution I would be extra thankful for an explanation where I went wrong and how the code actually works.
Edit: Thank you for all your answers. As is so often the case, there is an even more efficient solution I should have noticed myself (well, I did notice it myself, but too late, obviously):
> str(df$sex)
Factor w/ 2 levels "male","female": 1 2 1 1 2 1 1 1 1 1 ...
So I can just use as.numeric() to get what I want.
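A minimal sketch of why this works, using a made-up toy factor `sex` rather than the crash2 data: a factor stores integer codes in level order, and as.integer() (or as.numeric()) returns those codes while leaving missing values as NA.

```r
# Toy factor standing in for df$sex (hypothetical data, same levels as in the str() output)
sex <- factor(c("male", "female", NA, "male"), levels = c("male", "female"))

# The integer codes follow the order of levels(sex); NA stays NA
sex_int <- as.integer(sex)
print(sex_int)

# The length is preserved, so the result fits back into the data frame
stopifnot(length(sex_int) == length(sex))
```

The codes depend on the level order, so this gives male = 1 and female = 2 only because the levels are stored in that order.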
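You may use `NA` (in backticks) as the name of one of the alternatives, so that a missing value is matched explicitly: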
x
# [1] "a" "e" "a" "a" NA "d" "b" "b" NA "d"
unname(sapply(x, switch, "a"=1, "b"=2, "c"=3, "d"=4, "e"=5, `NA`=NA))
# [1] 1 5 1 1 NA 4 2 2 NA 4
Data:
x <- c("a", "e", "a", "a", NA, "d", "b", "b", NA, "d")
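As for where the original call went wrong, here is a sketch under the assumption that df$sex is a factor (as the str() output in the question suggests); the toy vector `sex` is made up for illustration. When switch() is given something other than a character string, it selects an alternative by position, so the trailing NA is just the third alternative rather than a default. A missing factor value selects nothing, switch() returns NULL, and unlist() drops NULL elements. Converting to character first and naming the `NA` alternative keeps the length intact.

```r
# Toy factor standing in for df$sex (hypothetical data)
sex <- factor(c("male", "female", NA, "male"), levels = c("male", "female"))

# Factor input: switch() picks alternatives by integer code, and NA selects nothing (NULL).
# suppressWarnings() is used because R may warn about factor input to switch().
res_factor <- suppressWarnings(sapply(sex, switch, "male" = 1, "female" = 2, NA))
is.list(res_factor)          # stays a list, because one element is NULL
length(unlist(res_factor))   # one shorter than length(sex): the NULL is dropped

# Character input with a named `NA` alternative, as in the answer above
res_char <- unname(sapply(as.character(sex), switch,
                          "male" = 1, "female" = 2, `NA` = NA))
length(res_char) == length(sex)
```

With character input the lookup happens by name, which is also why the order of the alternatives no longer matters.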