
How to convert all numeric columns stored as character to numeric in a dataframe? [duplicate]

Tags: r, lapply, sapply

I have a dataframe with hundreds of columns, some of which are stored as character even though they contain only numeric values. I need to convert every column whose values are all numbers to numeric (there may also be NAs in the data).

Example dataframe:

df <- data.frame(id = c("R1","R2","R3","R4","R5"), name=c("A","B","C","D","E"), age=c("24", "NA", "55", "19", "40"), sbp=c(174, 125, 180, NA, 130), dbp=c("106", "67", "109", "NA", "87"))

> print(df, row.names = F)
 id name age sbp dbp
 R1    A  24 174 106
 R2    B  NA 125  67
 R3    C  55 180 109
 R4    D  19  NA  NA
 R5    E  40 130  87

These columns should be numeric.
> df$age
[1] "24" "NA" "55" "19" "40"
> df$dbp
[1] "106" "67"  "109" "NA"  "87" 

I applied as.numeric(), but it also converted the genuinely character variables (id, name, etc.) to numeric, which turned all of their values into NA.

> sapply(df,as.numeric)
     id name age sbp dbp
[1,] NA   NA  24 174 106
[2,] NA   NA  NA 125  67
[3,] NA   NA  55 180 109
[4,] NA   NA  19  NA  NA
[5,] NA   NA  40 130  87

> lapply(df,as.numeric)
$id
[1] NA NA NA NA NA

$name
[1] NA NA NA NA NA

$age
[1] 24 NA 55 19 40

$sbp
[1] 174 125 180  NA 130

$dbp
[1] 106  67 109  NA  87

What I need is to skip the genuinely character columns (id, name, etc.) while looping through the dataframe. Any help is much appreciated!

Mumtaj Ali asked Dec 29 '25 04:12



1 Answer

Try type.convert():

df2 <- type.convert(df, as.is = TRUE)

Result:

#> df2
  id name age sbp dbp
1 R1    A  24 174 106
2 R2    B  NA 125  67
3 R3    C  55 180 109
4 R4    D  19  NA  NA
5 R5    E  40 130  87

## check column classes
#> sapply(df2, class)
         id        name         age         sbp         dbp 
"character" "character"   "integer"   "integer"   "integer" 

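Note that the as.is argument controls whether character columns are converted to factors: with as.is = FALSE, the first two columns would have been converted to factors. The "NA" strings become real missing values because type.convert() treats "NA" as missing by default (na.strings = "NA").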
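To make the effect of as.is concrete, here is a minimal sketch that compares both settings on the question's example data. It also includes a column-by-column alternative for anyone who wants explicit control over the loop. The helper convert_if_numeric() is a hypothetical name introduced here for illustration, not part of base R, and it assumes that the string "NA" should count as missing.

```r
# Example data from the question; stringsAsFactors = FALSE is spelled out
# so the character columns stay character on older R versions too.
df <- data.frame(
  id   = c("R1", "R2", "R3", "R4", "R5"),
  name = c("A", "B", "C", "D", "E"),
  age  = c("24", "NA", "55", "19", "40"),
  sbp  = c(174, 125, 180, NA, 130),
  dbp  = c("106", "67", "109", "NA", "87"),
  stringsAsFactors = FALSE
)

# as.is = TRUE: columns that are not numeric remain character.
sapply(type.convert(df, as.is = TRUE), class)

# as.is = FALSE: columns that are not numeric become factors instead.
sapply(type.convert(df, as.is = FALSE), class)

# Hypothetical helper (an assumption, not a base R function): convert a
# character column only if every non-missing value parses as a number.
convert_if_numeric <- function(x) {
  if (!is.character(x)) return(x)            # leave non-character columns alone
  x[x %in% "NA"] <- NA_character_            # assumption: "NA" strings mean missing
  parsed <- suppressWarnings(as.numeric(x))  # values that fail to parse become NA
  # Convert only when parsing introduced no new NAs.
  if (all(is.na(parsed) == is.na(x))) parsed else x
}

df3 <- df
df3[] <- lapply(df3, convert_if_numeric)     # df3[] keeps the data.frame structure
sapply(df3, class)
```

Unlike type.convert(), this helper always produces double ("numeric") columns rather than integer ones, because it relies on as.numeric().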

zephryl answered Dec 31 '25 17:12
