
How to convert factor to numeric in R without NAs introduced by coercion warning message

Tags:

r

I have a data frame with a column of class factor. When I convert it to numeric, I get the warning message shown below. This is the R code I wrote to convert the factor to numeric:

class(usedcars$Price)
[1] "factor"

e <- paste(usedcars$Price)
e <- as.numeric(paste(usedcars$Price))
Warning message:
NAs introduced by coercion 

All the values are converted to NA, although the class is numeric. Could anyone help me get rid of this NA warning message when converting a factor to numeric in R?

sam asked Oct 01 '13 12:10


4 Answers

This happens when you use as.numeric on values that are not valid numbers.

My guess is that your numbers have "," in them (for example 1,285). First clean the values with db <- gsub(",","",db) and then run as.numeric(db).
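
As a minimal sketch of that suggestion, assuming the commas are thousands separators (the price vector below is made-up data, not the asker's usedcars$Price):

```r
# Hypothetical data: prices stored as a factor with thousands separators
price <- factor(c("1,285", "12,500", "980"))

# gsub() coerces the factor to character, so the result is a character vector
clean <- gsub(",", "", price, fixed = TRUE)

# With the commas gone, every string is a valid number
price_num <- as.numeric(clean)
price_num
```

If the commas are decimal marks instead, replacing them with "." rather than "" would be the equivalent step.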

Sandler answered Oct 17 '22 08:10


You could try retype from the hablar package. If the problem is that the numbers use commas instead of dots as decimal marks, it replaces the commas with dots. Example:

library(hablar)
library(dplyr)

df <- tibble(a = as.factor(c("1,56", "5,87")))

df %>% retype()

gives you:

# A tibble: 2 x 1
      a
  <dbl>
1  1.56
2  5.87

davsjob answered Oct 17 '22 08:10


I know this was asked a long time ago but since it doesn't have an accepted answer I would like to add this:

e <- as.numeric(as.factor(usedcars$Price))

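When paste is used, the price is first converted to character and then to numeric, and that second step fails whenever the strings are not clean numbers. Be aware, though, that as.numeric applied directly to a factor returns the integer level codes, not the values shown in the labels, so the line above avoids the warning but does not recover the original prices.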
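
To compare what each conversion returns, here is a small sketch on a made-up factor (a hypothetical stand-in for usedcars$Price). Calling as.numeric directly on a factor gives the internal level codes, while going through as.character or levels gives the values in the labels:

```r
# Hypothetical factor standing in for usedcars$Price
price <- factor(c("4500", "12000", "799"))

# Direct conversion returns the internal integer codes of the levels
codes <- as.numeric(price)

# Converting the labels first returns the actual values
values <- as.numeric(as.character(price))

# Equivalent: convert the levels once, then index by the factor
values2 <- as.numeric(levels(price))[price]

codes
values
```

The codes come out without any warning, which is why this mistake is easy to miss.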

Leocode answered Oct 17 '22 07:10


I'll try to replicate your problem:

set.seed(1)
a <- factor(sample(1:100, 10))
> a
 [1] 27 37 57 89 20 86 97 62 58 6 
Levels: 6 20 27 37 57 58 62 86 89 97

The approach from alexwhan's comment actually works fine:

> as.numeric(as.character(a))
 [1] 27 37 57 89 20 86 97 62 58  6

Even if your data has leading or trailing spaces that would need trimming, it still works:

> paste( " ", a, " ")
 [1] "  27  " "  37  " "  57  " "  89  " "  20  " "  86  " "  97  " "  62  " "  58  " "  6  " 
> as.numeric(paste( " ", a, " "))
 [1] 27 37 57 89 20 86 97 62 58  6

So the only explanation is that you have some (unexpected) character in all your numbers:

> as.numeric(paste(a, "a"))
 [1] NA NA NA NA NA NA NA NA NA NA
Warning message:
NAs introduced by coercion 

If you can't see any stray character, it may be an invisible one. The following happened to me:

> paste( intToUtf8(160), a, intToUtf8(160))
 [1] "  27  " "  37  " "  57  " "  89  " "  20  " "  86  " "  97  " "  62  " "  58  " "  6  " 
> as.numeric(paste( intToUtf8(160), a, intToUtf8(160)))
 [1] NA NA NA NA NA NA NA NA NA NA

intToUtf8(32) is the usual space typed from the keyboard (as in the lines above), but code point 160 is a non-breaking space: it looks the same but is a different character, which as.numeric (and also trim from gdata) doesn't recognise as whitespace, so the result is NA.
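
If that is what is in your data, one possible fix is to delete the character explicitly before converting. A minimal sketch, assuming the only offending character is code point 160 (the factor x is made-up data):

```r
# Non-breaking space, code point 160 (U+00A0)
nbsp <- intToUtf8(160)

# Hypothetical factor whose labels are wrapped in non-breaking spaces
x <- factor(paste0(nbsp, c("27", "6"), nbsp))

# Remove every non-breaking space from the labels, then convert
clean <- gsub(nbsp, "", as.character(x), fixed = TRUE)
nums <- as.numeric(clean)
nums
```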

Michele answered Oct 17 '22 08:10