I have a question on how to avoid NA when using the as.numeric function in R.
As you can see below, I have a character variable, cumulative_viewers, whose values are all numbers. I wanted to convert it to numeric with as.numeric, but it did not work properly.
The problem seems to be that as.numeric returns NA whenever a value has four or more digits, even though the values are numeric. For example, as.numeric works well with values such as '999' or '997', but for values with four or more digits, such as '1000', '1001', or '999999', it returns NA instead of the real numeric value.
Could anyone please help me solve this problem? I spent a day on it but have not found an answer yet.
paste(data_without_duplicates$cumulative_viewers)
[1] "12,983,336" "12,323,294" "11,375,954" "10,917,221" "10,667,700"
[6] "10,292,386" "9,350,192" "9,135,520" "9,001,309" "8,653,415"
[11] "7,784,755" "7,508,976" "7,362,790" "6,959,047" "6,706,543"
.....
[1426] "1,026" "1,024" "1,023" "1,020" "1,017"
[1431] "1,016" "1,013" "1,011" "1,001" "1,000"
[1436] "1,000" "999" "997" "994" "990"
[1441] "989" "988" "984" "982" "979"
[1446] "974" "972" "971" "966" "961"
as.numeric(data_without_duplicates$cumulative_viewers)
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[18] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[35] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
.......
[1395] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[1412] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[1429] NA NA NA NA NA NA NA NA 999 997 994 990 989 988 984 982 979
[1446] 974 972 971 966 961 959 958 957 950 946 941 930 929 911 911 910 910
[1463] 910 907 907 902 898 897 895 892 890 890 889 885 885 883 872 871 868
Approach 2: Using the suppressWarnings() function to disable a warning message. You may not always wish to convert non-number values to numbers. In this scenario, just wrap suppressWarnings() around the as.numeric() call to silence the warning message “NAs introduced by coercion”. Note that this only hides the warning; values that cannot be parsed are still returned as NA.
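A minimal sketch of that approach, using a made-up vector x rather than the data from the question. It is only meant to show that the warning goes away while the NA values stay:

```r
# Hypothetical example vector; "1,000" and "abc" cannot be parsed as numbers.
x <- c("999", "1,000", "abc")

# Same result as as.numeric(x), but the coercion warning is not shown.
y <- suppressWarnings(as.numeric(x))

# y holds 999 followed by two NA values.
print(y)
```

This is probably only appropriate when the NA values are expected and handled afterwards.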
As you can see, the warning message “NAs introduced by coercion” is returned and some output values are NA (i.e. missing data or not available data). The reason for this is that some of the character strings are not properly formatted numbers and hence cannot be converted to the numeric class.
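One way to find out which strings are not properly formatted is to convert once and look at the entries that turned into NA. The sketch below uses a small made-up vector x, and the names converted and bad are just for illustration:

```r
# Hypothetical example vector (not the original data).
x <- c("1,001", "1,000", "999", "997")

# Convert once, silencing the expected warning.
converted <- suppressWarnings(as.numeric(x))

# Keep the inputs that were not NA to begin with but failed to convert.
bad <- x[is.na(converted) & !is.na(x)]

# bad contains the two comma-formatted strings.
print(bad)
```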
When you call a function with an argument of the wrong type, R will try to coerce values to a different type so that the function will work. There are two types of coercion that occur automatically in R: coercion with formal objects and coercion with built-in types.
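A few small illustrations of automatic coercion between built-in types, as a rough sketch. The variable names are made up for the example:

```r
# Mixing types in c() coerces everything to the most general type (character).
mixed <- c(1, "a", TRUE)

# Logical values are treated as 0/1 by arithmetic functions such as sum().
total <- sum(c(TRUE, FALSE, TRUE))

# Arithmetic does not coerce character to numeric: "42" + 1 is an error,
# so the conversion has to be requested explicitly.
n <- as.numeric("42") + 1

print(mixed)
print(total)
print(n)
```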
To convert character to numeric in R, use the as.numeric() function. as.numeric() is a built-in R function that creates or coerces objects of type “numeric”.
It's not really an issue with the number of digits, just the fact that your numbers with four or more digits have commas in them:
N1 <- c("1000", "1,000", "10000", "10,000")
as.numeric(N1)
[1] 1000 NA 10000 NA
Warning message:
NAs introduced by coercion
N2 <- gsub(",", "", N1)
as.numeric(N2)
[1] 1000 1000 10000 10000
It looks to me as if the commas in your data are the issue. There are probably dozens of ways of dealing with this; here's one:
x <- c("12,983,336", "12,323,294", "11,375,954", "10,917,221", "10,667,700",
"10,292,386", "9,350,192", "9,135,520", "9,001,309", "8,653,415",
"7,784,755", "7,508,976", "7,362,790", "6,959,047", "6,706,543",
"1,026", "1,024", "1,023", "1,020", "1,017", "1,016", "1,013",
"1,011", "1,001", "1,000", "1,000", "999", "997", "994", "990",
"989", "988", "984", "982", "979", "974", "972", "971", "966",
"961")
as.numeric(gsub(",","",x,fixed=TRUE))
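Applied to the column from the question, the same idea might look like the sketch below. The three-row data frame is a made-up stand-in for data_without_duplicates, which is not available here:

```r
# Hypothetical stand-in for the asker's data frame.
data_without_duplicates <- data.frame(
  cumulative_viewers = c("12,983,336", "1,000", "999"),
  stringsAsFactors = FALSE
)

# Strip the thousands separators, then convert the column in place.
data_without_duplicates$cumulative_viewers <- as.numeric(
  gsub(",", "", data_without_duplicates$cumulative_viewers, fixed = TRUE)
)

# The column is now numeric.
str(data_without_duplicates)
```

fixed = TRUE treats the comma as a literal string rather than a regular expression, which is all that is needed here.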