
How to avoid "Warning message: NAs introduced by coercion" in as.numeric() [duplicate]

Tags:

r

I have a question about how to avoid NA values when using the as.numeric() function in R. As shown below, cumulative_viewers is a character variable whose values are numbers, and I want to convert it to numeric with as.numeric(), but the conversion does not work properly. The problem seems to be that as.numeric() returns NA whenever the value has four or more digits, even though the value is a number. For example, as.numeric() works fine on '999' or '997', but for values with four or more digits, such as '1000', '1001', or '999999', it returns NA instead of the actual numeric value.

Could anyone please help me solve this problem? I spent a day on it but have not found an answer yet.

paste(data_without_duplicates$cumulative_viewers)

    [1] "12,983,336" "12,323,294" "11,375,954" "10,917,221" "10,667,700"
    [6] "10,292,386" "9,350,192"  "9,135,520"  "9,001,309"  "8,653,415" 
    [11] "7,784,755"  "7,508,976"  "7,362,790"  "6,959,047"  "6,706,543" 
    .....
    [1426] "1,026"      "1,024"      "1,023"      "1,020"      "1,017"     
    [1431] "1,016"      "1,013"      "1,011"      "1,001"      "1,000"     
    [1436] "1,000"      "999"        "997"        "994"        "990"       
    [1441] "989"        "988"        "984"        "982"        "979"       
    [1446] "974"        "972"        "971"        "966"        "961"       


as.numeric(data_without_duplicates$cumulative_viewers)

    [1]  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
    [18]  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
    [35]  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
    .......
    [1395]  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
    [1412]  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA
    [1429]  NA  NA  NA  NA  NA  NA  NA  NA 999 997 994 990 989 988 984 982 979
    [1446] 974 972 971 966 961 959 958 957 950 946 941 930 929 911 911 910 910
    [1463] 910 907 907 902 898 897 895 892 890 890 889 885 885 883 872 871 868
Jacob Green asked Aug 31 '14 15:08



People also ask

How can we avoid NAs introduced by coercion?

One approach is to use the suppressWarnings() function to silence the warning message. You may not always need the non-numeric values converted to numbers. In that case, wrap the as.numeric() call in suppressWarnings() to suppress the message “NAs introduced by coercion”. Note that this only hides the warning; the affected values are still NA.
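
As a minimal sketch of that approach (the vector `x` is made-up illustration data, not the asker's column), the call below runs without a warning, yet the entries that are not plain numbers still come back as NA:

```r
# Made-up input: one plain number, one with a thousands separator, one word.
x <- c("999", "1,000", "abc")

# as.numeric(x) alone would warn "NAs introduced by coercion".
# suppressWarnings() hides that warning but does not change the result.
y <- suppressWarnings(as.numeric(x))
print(y)

# The unconvertible entries are still NA.
print(is.na(y))
```

So suppressWarnings() is only appropriate when NA is the result you actually want for those entries.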

What is the meaning of NAs introduced by coercion?

As you can see, the warning message “NAs introduced by coercion” is returned and some output values are NA (i.e. missing data or not available data). The reason for this is that some of the character strings are not properly formatted numbers and hence cannot be converted to the numeric class.
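
To see which strings are the improperly formatted ones, one option is to compare the input with the converted result. This is only a sketch; `x`, `converted`, and `bad` are illustrative names, and the values are taken from the question's output:

```r
# A few values from the question's cumulative_viewers column.
x <- c("12,983,336", "1,000", "999", "997")

# Convert quietly, then pick out entries that became NA
# although they were not NA in the input.
converted <- suppressWarnings(as.numeric(x))
bad <- x[is.na(converted) & !is.na(x)]
print(bad)
```

Inspecting `bad` shows what the failing strings have in common, which here is the comma used as a thousands separator.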

What does coercion mean in R?

When you call a function with an argument of the wrong type, R will try to coerce values to a different type so that the function will work. There are two types of coercion that occur automatically in R: coercion with formal objects and coercion with built-in types.

How do I convert character to numeric in R?

To convert character to numeric in R, use the as.numeric() function. as.numeric() is a built-in R function that creates or coerces objects of type “numeric”.


2 Answers

It's not really an issue with the number of digits, just the fact that your numbers with four or more digits have commas in them:

N1 <- c("1000", "1,000", "10000", "10,000")
as.numeric(N1)
## [1]  1000    NA 10000    NA
## Warning message:
## NAs introduced by coercion

N2 <- gsub(",", "", N1)
as.numeric(N2)
## [1]  1000  1000 10000 10000
nrussell answered Sep 19 '22 08:09



It looks to me as if the commas in your data are the issue. There are probably dozens of ways of dealing with this.

Here's one:

x <- c("12,983,336", "12,323,294", "11,375,954", "10,917,221", "10,667,700", 
       "10,292,386", "9,350,192", "9,135,520", "9,001,309", "8,653,415", 
       "7,784,755", "7,508,976", "7,362,790", "6,959,047", "6,706,543", 
       "1,026", "1,024", "1,023", "1,020", "1,017", "1,016", "1,013", 
       "1,011", "1,001", "1,000", "1,000", "999", "997", "994", "990", 
       "989", "988", "984", "982", "979", "974", "972", "971", "966", 
       "961")

as.numeric(gsub(",","",x,fixed=TRUE))
jalapic answered Sep 20 '22 08:09
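
Both answers come down to removing the commas before converting. A sketch of how that might be applied to the data frame column from the question follows; `to_number` is a hypothetical helper name, not something from either answer, and the data frame here is a small stand-in built from a few of the question's values:

```r
# Hypothetical helper: strip thousands separators, then convert.
# fixed = TRUE treats "," as a literal string rather than a regex.
to_number <- function(x) {
  as.numeric(gsub(",", "", x, fixed = TRUE))
}

# Small stand-in for data_without_duplicates from the question.
data_without_duplicates <- data.frame(
  cumulative_viewers = c("12,983,336", "1,000", "999"),
  stringsAsFactors = FALSE
)

# Overwrite the character column with its numeric version.
data_without_duplicates$cumulative_viewers <-
  to_number(data_without_duplicates$cumulative_viewers)

print(data_without_duplicates$cumulative_viewers)
```

If the readr package is available, its parse_number() function is another way to handle grouping marks, at the cost of an extra dependency.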
