
Extract the positions of a factor's values in its levels

Tags:

r

r-factor

I'm returning to R after some time, and the following has me stumped:

I'd like to build a list of the positions that a factor's values have in its list of levels. Example:

> data = c("a", "b", "a","a","c")
> fdata = factor(data)
> fdata
[1] a b a a c
Levels: a b c
> fdata$lvl_idx <- ????

Such that:

> fdata$lvl_idx
[1] 1 2 1 1 3

Appreciate any hints or tips.

Hedgehog asked Dec 17 '13 01:12



1 Answer

If you convert a factor to integer, you get the position in the levels:

as.integer(fdata)
## [1] 1 2 1 1 3
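
As a rough sketch of how this could be applied to the question: `$` is meant for lists and data frames rather than for a factor itself, so one option is to keep the factor and its level positions side by side in a data frame. The names `df`, `value` and `lvl_idx` are illustrative assumptions, not something the question prescribes.

```r
data <- c("a", "b", "a", "a", "c")
fdata <- factor(data)

# Position of each value within levels(fdata)
lvl_idx <- as.integer(fdata)

# Hypothetical container: the factor and its level positions as columns
df <- data.frame(value = fdata, lvl_idx = lvl_idx)

# Sanity check: indexing the levels by position recovers the original strings
identical(levels(fdata)[lvl_idx], data)
```

Here `df$lvl_idx` holds the positions, and the last line should evaluate to `TRUE` because each position points back at the label it came from.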

In certain situations, this is counter-intuitive:

f <- factor(2:4)
f
## [1] 2 3 4
## Levels: 2 3 4
as.integer(f)
## [1] 1 2 3

The same thing happens when a factor is silently coerced to integer, for example when it is used as a vector index:

LETTERS[2:4]
## [1] "B" "C" "D"
LETTERS[f]
## [1] "A" "B" "C"

Converting to character before converting to integer gives the expected values. See ?factor for details.
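
A short sketch of that conversion, reusing the `f` from above. The last form is the one suggested in `?factor`; it converts each level once and then indexes by the factor.

```r
f <- factor(2:4)

# Level positions, not the numbers the labels represent
as.integer(f)                  # 1 2 3

# Go through character first to recover the labelled values
as.integer(as.character(f))    # 2 3 4

# Form suggested in ?factor: convert the levels, then index by the factor
as.numeric(levels(f))[f]       # 2 3 4
```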

Matthew Lundberg answered Sep 21 '22 21:09