I have been trying to find the difference between ordered and unordered factor variables in R. In particular, this line in the documentation of ?factor confuses me:
Ordered factors differ from factors only in their class, but methods and
the model-fitting functions treat the two classes quite differently.
The closest I have come to an answer is in the answers to these three questions:
In an answer to the first question above, @joran said that "A detailed summary of the statistical differences is probably way beyond the scope of a StackOverflow answer."
I'm not looking for a detailed summary here, but can anyone give a small and simple example demonstrating how ordered and unordered factors differ when used in methods and model-fitting functions?
Ordered factors use orthogonal polynomial contrasts by default. The L and Q in the coefficient names stand for the linear and quadratic terms. Unordered factors use "treatment" contrasts (although, strictly speaking, these are not contrasts).
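As a minimal sketch of the point above, the two coding schemes can be inspected directly with contrasts(). The variable names and levels here are hypothetical, and the output assumes R's default setting of options("contrasts") has not been changed.

```r
# Hypothetical three-level variable, used only for illustration
sizes <- c("small", "medium", "large")
f_unordered <- factor(sizes, levels = sizes)
f_ordered   <- factor(sizes, levels = sizes, ordered = TRUE)

# Which contrast function is used for each kind of factor
# (by default: unordered = "contr.treatment", ordered = "contr.poly")
getOption("contrasts")

# Treatment (dummy) coding: one 0/1 column per non-reference level
contrasts(f_unordered)

# Orthogonal polynomial coding: columns named .L (linear) and .Q (quadratic)
contrasts(f_ordered)
```

With three levels there are two contrast columns in each case; only the coding, and therefore the interpretation of the coefficients, differs.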
R – Level Ordering of Factors: Factors are data objects used to categorize data and store it as levels. They can store strings as well as integers, and they are useful for columns that have a limited number of unique values. Factors in R can be created using the factor() function.
type will be converted from a character to a factor. The main difference is that factors have predefined levels, so their value can only be one of those levels or NA, whereas character values can be anything.
Check if a Factor is an Ordered Factor in R Programming – is.ordered() Function. The is.ordered() function in the R programming language is used to check whether the passed factor is an ordered factor.
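A short sketch of that check, using made-up levels, also shows the class difference that the ?factor documentation refers to:

```r
# Hypothetical data: the same values stored as an unordered and an ordered factor
lv <- c("low", "mid", "high")
f <- factor(c("low", "high", "mid"), levels = lv)
o <- factor(c("low", "high", "mid"), levels = lv, ordered = TRUE)

is.ordered(f)  # FALSE
is.ordered(o)  # TRUE

class(f)       # "factor"
class(o)       # "ordered" "factor"

# Comparison operators are meaningful only for the ordered factor
o[1] < o[2]    # TRUE, because "low" comes before "high" in the levels
```

The same comparison on the unordered factor f returns NA with a warning, since its levels carry no order.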
For more background, see: http://r.789695.n4.nabble.com/Models-with-ordered-and-unordered-factors-td4072225.html and http://www.stat.berkeley.edu/~s133/factors.html
The most readily apparent difference is "pretty printing": ordered factors print in the console with their levels separated by "<", and the order of the levels determines the order of labels in ggplot2 plots.
In terms of modelling, the contrasts generated for them when fitting linear models are different. If you are looking for some simple examples that describe the material, I would suggest you look at http://www.ats.ucla.edu/stat/r/library/contrast_coding.htm. Two sections of this article give examples of the two schemes: "1. Dummy Coding" (unordered R factors) and "4. Orthogonal Polynomial Coding" (ordered R factors).
To summarise, dummy coding compares each level to a reference level when fitting models (e.g. gender or race), whereas polynomial coding tests for trends across the levels (for a variable such as income or education).
The examples in the above link are in R, so would serve to illustrate your query well.
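For a self-contained illustration of the same idea, here is a minimal sketch with simulated data. The variable names (income, edu_unordered, edu_ordered) and the numbers are invented for this example, and default contrast options are assumed.

```r
set.seed(1)

# Simulated data: income rises with education level (hypothetical values)
edu_levels <- c("primary", "secondary", "tertiary")
edu <- rep(edu_levels, each = 10)
d <- data.frame(
  income        = rep(c(20, 30, 40), each = 10) + rnorm(30),
  edu_unordered = factor(edu, levels = edu_levels),
  edu_ordered   = factor(edu, levels = edu_levels, ordered = TRUE)
)

fit_u <- lm(income ~ edu_unordered, data = d)  # treatment (dummy) coding
fit_o <- lm(income ~ edu_ordered,   data = d)  # polynomial coding

# Unordered: each coefficient is a difference from the reference level "primary"
names(coef(fit_u))
# "(Intercept)" "edu_unorderedsecondary" "edu_unorderedtertiary"

# Ordered: coefficients are the linear (.L) and quadratic (.Q) trend terms
names(coef(fit_o))
# "(Intercept)" "edu_ordered.L" "edu_ordered.Q"

# Both parameterisations describe the same model, so fitted values agree
all.equal(fitted(fit_u), fitted(fit_o))
```

The two fits differ only in how the coefficients are parameterised and interpreted; the predictions are identical.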