I would like to understand why sum/min/max functions in R interpret a character string as TRUE when supplied to na.rm, while mean()
does not.
My uneducated guess is that as.logical("xyz")
returns NA, which is then passed to na.rm and, for some strange reason, treated as TRUE by sum/min/max but not by mean().
I would expect sum(c(NA, 4, 5), na.rm = "xyz")
to fail with the "argument is not interpretable as logical" error, the same error that mean() returns. I don't understand why that isn't the case.
As far as mean
is concerned, it is quite straightforward. As @Rich Scriven mentions, if you type mean.default
in the console you will see this section of code:
if (na.rm)
    x <- x[!is.na(x)]
The if statement is what raises the error when na.rm cannot be interpreted as TRUE or FALSE.
mean(1:10, na.rm = "abc") #gives
Error in if (na.rm) x <- x[!is.na(x)] : argument is not interpretable as logical
which is similar to doing
if ("abc") "Hello"
Error in if ("abc") "Hello" : argument is not interpretable as logical
Now regarding sum
, min
, max
and other primitive functions, which are implemented in C. The source code of these functions is here. There is a parameter Rboolean narm
passed into the C function.
C treats boolean values differently from R: any non-zero value counts as true.
#include <stdio.h>
#include <stdbool.h>

int main(void)
{
    /* A string literal is a non-null pointer, so it converts to true. */
    bool a = "abc";

    if (a)
        printf("Hello World");
    else
        printf("Not Hello World");
    return 0;
}
If you run the above C
code it will print "Hello World". Run the demo here. If you assign a string to a boolean type in C, it is treated as TRUE,
because the string is a non-null pointer and any non-zero value converts to true.
The same holds for numbers, so
sum(1:10, na.rm = 12)
works as well.
PS - I am no expert in C and know only a little bit of R. Finding all these insights took a lot of time. Let me know if I have misinterpreted something or provided any false information.