
typeof returns integer for something that is clearly a factor

Tags:

r

Create a variable:

a_variable <- c("a","b","c")

Check type:

typeof(a_variable)

I want a factor - change to factor:

a_variable <- as.factor(a_variable)

Check type:

typeof(a_variable)

Says that it's integer!? As an R newb, this is confusing. I just told R to make a factor not an integer.

Test to see if it somehow magically did create an integer:

a_variable * 1

Hmm... I get an error message saying "*" isn't meaningful for factors. This seems weird to me since R just told me it was an integer!?

Clearly it's me who is confused, can someone more enlightened help make sense of this madness for me?

Rocinante asked Feb 28 '16 22:02




1 Answer

If what you wanted was to know what class attribute was held by a vector then use class. If you wanted to test whether a vector was a factor then use is.factor.
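
As a quick illustration of the difference, here is a minimal sketch that reuses the variable from the question; the comments show what base R is expected to return.

```r
a_variable <- as.factor(c("a", "b", "c"))

typeof(a_variable)      # "integer": the underlying storage type
class(a_variable)       # "factor": the class attribute
is.factor(a_variable)   # TRUE
is.integer(a_variable)  # FALSE: is.integer deliberately excludes factors
```

Note that is.integer returns FALSE even though typeof says "integer", which is consistent with arithmetic such as a_variable * 1 being rejected for factors.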

The value returned by typeof being integer for factors is a language feature that confused me as well in my early days of R programming. The typeof function gives information at a "lower" level of abstraction. Factor variables (and also Dates) are stored as numbers: integer codes in the case of factors. Learn to use class or str rather than typeof (or mode); they give more useful information. You can look at the full "structure" of a factor variable with dput:

dput(factor(rep(letters[1:5], 2)))
# structure(c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L),
#           .Label = c("a", "b", "c", "d", "e"), class = "factor")

The character values that are usually thought of as the factor's values are actually stored in an attribute (which is what levels() returns; it appears as .Label in the dput output), while the "main" part of the variable is a set of integer indices pointing to the various levels. That is why mode returns "numeric" and typeof returns "integer". For this reason one usually needs as.character to coerce a factor to what most people think of as its values, namely their character representations.
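
The split between integer codes and level labels can be inspected directly. The following is a small sketch with made-up example vectors (f and nums are illustrative names, not from the question):

```r
f <- factor(c("b", "a", "c", "a"))

levels(f)        # "a" "b" "c": the labels, sorted by default
as.integer(f)    # 2 1 3 1: the codes, i.e. positions in levels(f)
as.character(f)  # "b" "a" "c" "a": the labels looked up through the codes
mode(f)          # "numeric"

# Pitfall when the labels look like numbers: converting the factor
# directly returns the codes, not the labels.
nums <- factor(c("10", "20", "10"))
as.numeric(nums)                # 1 2 1
as.numeric(as.character(nums))  # 10 20 10
```

Going through as.character first is the usual way to recover the original values when a factor's labels are numeric strings.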

IRTFM answered Sep 21 '22 17:09