
List all factor levels of a data.frame

With str(data) I get only the head of the levels (1-2 values):

fac1: Factor w/ 2  levels ... :
fac2: Factor w/ 5  levels ... :
fac3: Factor w/ 20 levels ... :
val: num ...

With dplyr::glimpse(data) I get more values, but no information about the number or values of the factor levels. Is there an automatic way to get the level information for all factor variables in a data.frame? A short form of

levels(data$fac1)
levels(data$fac2)
levels(data$fac3)

or, more precisely, an elegant version of something like

for (n in names(data))
  if (is.factor(data[[n]])) {
    print(n)
    print(levels(data[[n]]))
  }

thx Christof

ckluss asked Dec 28 '14 12:12



2 Answers

Here are some options. We can loop through the columns of 'data' with sapply and get the levels of each column (assuming that all the columns are of class factor; for a non-factor column, levels returns NULL).

sapply(data, levels) 
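If 'data' also contains non-factor columns (such as val in the question), it may be cleaner to keep only the factor columns first. The following is a minimal sketch in base R, assuming a made-up example data frame; the column names and values are hypothetical and only loosely mirror the question.

```r
# Hypothetical example data (not from the original question)
data <- data.frame(
  fac1 = factor(c("a", "b", "a")),
  fac2 = factor(c("x", "y", "z")),
  val  = c(1.5, 2.5, 3.5)
)

# Keep only the factor columns, then collect the levels of each one
factor_levels <- lapply(Filter(is.factor, data), levels)
print(factor_levels)

# Number of levels per factor column
level_counts <- vapply(Filter(is.factor, data), nlevels, integer(1))
print(level_counts)
```

Using lapply rather than sapply keeps the result a named list even when all factors happen to have the same number of levels, in which case sapply would simplify the result to a matrix.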

Or, if we need to pipe (%>%) it, this can be done as

library(dplyr)
data %>%
  sapply(levels)

Another option is summarise_each from dplyr, where we specify levels within funs. Note that summarise_each and funs are deprecated in later dplyr releases.

data %>%
  summarise_each(funs(list(levels(.))))

akrun answered Sep 20 '22 13:09
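With a more recent dplyr (1.0.0 or later is assumed here), a similar result can probably be obtained with summarise and across. This is a sketch, not part of the original answer, and the example data frame is hypothetical.

```r
library(dplyr)

# Hypothetical example data (not from the original question)
data <- data.frame(
  fac1 = factor(c("a", "b", "a")),
  val  = c(1.5, 2.5, 3.5)
)

# One row, with one list-column per factor column holding its levels
res <- data %>%
  summarise(across(where(is.factor), ~ list(levels(.x))))

# Extract the levels of fac1 from the list-column
print(res$fac1[[1]])
```

Selecting with where(is.factor) skips the numeric column, so the result has no empty entries for non-factor variables.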


If your problem is specifically to output a list of all levels for a factor, then I have found a simple solution using:

unique(df$x)

For instance, for the infamous iris dataset:

unique(iris$Species)
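One caveat worth checking: unique() returns the values that actually occur in the data, in order of first appearance, while levels() returns every defined level, including unused ones. A small sketch with a hypothetical factor illustrates the difference.

```r
# Hypothetical factor with one unused level ("c")
f <- factor(c("b", "a", "b"), levels = c("a", "b", "c"))

# Values that occur in the data, in order of first appearance
observed <- as.character(unique(f))
print(observed)

# All defined levels, whether or not they occur
all_levels <- levels(f)
print(all_levels)

# Levels that remain after dropping unused ones
used_levels <- levels(droplevels(f))
print(used_levels)
```

For iris$Species both approaches list the same three species, because every level occurs in the data.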

Djamil Lakhdar-Hamina answered Sep 20 '22 13:09