Difference between ordered and unordered factor variables in R

Tags:

r

I have been trying to find the difference between ordered and unordered factor variables in R. In particular, this line in the documentation of ?factor is confusing me:

Ordered factors differ from factors only in their class, but methods and
the model-fitting functions treat the two classes quite differently.

The closest I have come to finding the answer is from the answers of these three questions:

  1. Factors ordered vs. levels
  2. Is there an advantage to ordering a categorical variable?
  3. factor() command in R is for categorical variables with hierarchy level only?

In an answer to the first question above, @joran said that "A detailed summary of the statistical differences is probably way beyond the scope of a StackOverflow answer."

I'm not looking for a detailed summary here, but can anyone give a small, simple example demonstrating how ordered and unordered factors differ when used in methods and model-fitting functions?

StrikeR asked Sep 30 '14 11:09


2 Answers

Ordered factors use orthogonal polynomial contrasts by default. The .L and .Q suffixes on the resulting coefficient names stand for the linear and quadratic terms. Unordered factors use "treatment" contrasts by default (although, strictly speaking, these are not true contrasts).
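A minimal sketch of those defaults, using made-up data (the vector x and the level names are illustrative assumptions, not taken from the question). It builds the same values as an unordered and an ordered factor and inspects the contrast matrix R would use for each:

```r
# Illustrative data; the values and level names are assumptions.
lv <- c("low", "medium", "high")
x  <- c("low", "medium", "high", "medium", "low", "high")

u <- factor(x, levels = lv)                  # unordered factor
o <- factor(x, levels = lv, ordered = TRUE)  # ordered factor

# The defaults come from options("contrasts"):
# contr.treatment for unordered, contr.poly for ordered factors.
options("contrasts")

# Treatment coding: one indicator column per non-reference level.
contrasts(u)

# Polynomial coding: columns .L (linear) and .Q (quadratic).
contrasts(o)
```

With three levels, contrasts(o) has the two columns .L and .Q; a factor with more levels would add .C (cubic) and higher-order columns.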

For further reading:

http://r.789695.n4.nabble.com/Models-with-ordered-and-unordered-factors-td4072225.html
http://www.stat.berkeley.edu/~s133/factors.html

Suchit kumar answered Sep 18 '22 22:09


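The most immediately apparent difference is in printing: an ordered factor prints its levels in the console with "<" between them (for example, Levels: low < medium < high). The order of the levels also determines the order of labels in ggplot2 plots, although that applies to the levels of unordered factors too.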
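A small sketch of how the two kinds of factor behave in the console, assuming a made-up vector of sizes (all names here are illustrative):

```r
# Illustrative data; the values and level names are assumptions.
sizes <- c("small", "large", "medium", "small")
lv    <- c("small", "medium", "large")

f_unordered <- factor(sizes, levels = lv)
f_ordered   <- factor(sizes, levels = lv, ordered = TRUE)

print(f_unordered)  # Levels: small medium large
print(f_ordered)    # Levels: small < medium < large

class(f_unordered)  # "factor"
class(f_ordered)    # "ordered" "factor"

# Comparisons and max() are only meaningful for ordered factors.
f_ordered < "large"                      # TRUE FALSE TRUE TRUE
max(f_ordered)                           # large
suppressWarnings(f_unordered < "large")  # NA NA NA NA (R also warns)
```

The extra "ordered" class is what lets methods such as the comparison operators and max() dispatch differently for the two kinds of factor.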

In terms of modelling, the contrasts generated for ordered and unordered factors when fitting linear models are different. If you are looking for some simple examples that describe the material, I would suggest you look at http://www.ats.ucla.edu/stat/r/library/contrast_coding.htm. Two sections of that article give examples of the two schemes:

  1. Dummy Coding - Unordered R factors
  4. Orthogonal Polynomial Coding - Ordered R factors

To summarise, dummy coding compares each level to a reference level when fitting models (suitable for variables such as gender or race), whereas polynomial coding tests for trends across the levels (suitable for a variable such as income or education).
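As a rough illustration of that summary, the sketch below fits the same linear model twice on made-up data, once with an unordered and once with an ordered version of the predictor. The data frame d, its columns, and the numbers are all hypothetical:

```r
# Hypothetical data: income by education level, four observations per level.
edu_levels <- c("primary", "secondary", "tertiary")
edu    <- rep(edu_levels, each = 4)
income <- c(20, 22, 21, 23, 30, 31, 29, 32, 45, 44, 46, 47)

d <- data.frame(
  income = income,
  edu_u  = factor(edu, levels = edu_levels),                 # unordered
  edu_o  = factor(edu, levels = edu_levels, ordered = TRUE)  # ordered
)

fit_u <- lm(income ~ edu_u, data = d)  # dummy (treatment) coding
fit_o <- lm(income ~ edu_o, data = d)  # polynomial coding

# Dummy coding: the intercept is the mean of the reference level
# ("primary"); the other coefficients are differences from it.
coef(fit_u)  # names: (Intercept), edu_usecondary, edu_utertiary

# Polynomial coding: linear and quadratic trend terms.
coef(fit_o)  # names: (Intercept), edu_o.L, edu_o.Q

# Both are the same model in a different parameterisation,
# so the fitted values agree.
all.equal(fitted(fit_u), fitted(fit_o))  # TRUE
```

Only the interpretation of the coefficients changes between the two fits; the overall fit is the same.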

The examples at the above link are in R, so they should illustrate the answer to your question well.

Ankur Kanoria answered Sep 17 '22 22:09