
Sorting a boxplot based on median value

Tags:

r

boxplot

I'd like to use R to make a series of boxplots which are sorted by median value. Suppose then I execute:

boxplot(cost ~ type) 

This would give me some boxplots where cost is shown on the y-axis and the type category is visible on the x-axis:

    -----     -----
      |         |
     [ ]        |
      |        [ ]
      |         |
    -----     -----
      A         B

However, what I'd like is the boxplot figures sorted from highest to lowest median value. My suspicion is that what I need to do is change the labels of the type (A or B) to numerically indicate which is the lowest and highest median value, but I wonder if there is a more clever way to solve the problem.

Asked by speciousfool, Sep 22 '10 02:09



1 Answer

Check out ?reorder. The example there seems to be what you want, but sorted in the opposite order. I changed count to -count in the first line below so that the boxes are sorted from highest to lowest median.

    bymedian <- with(InsectSprays, reorder(spray, -count, median))
    boxplot(count ~ bymedian, data = InsectSprays,
            xlab = "Type of spray", ylab = "Insect count",
            main = "InsectSprays data", varwidth = TRUE,
            col = "lightgray")
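The same idea can be applied to the cost ~ type setup from the question. The following is only a sketch: the data frame df, its simulated values, and the column name type_by_median are made up for illustration and do not come from the original post.

```r
# Hypothetical data standing in for the question's cost and type variables.
set.seed(1)
df <- data.frame(
  type = factor(rep(c("A", "B", "C"), each = 20)),
  cost = c(rnorm(20, mean = 10), rnorm(20, mean = 30), rnorm(20, mean = 20))
)

# reorder() sorts the factor levels by the per-group value of median(-cost)
# in ascending order, which is the same as median(cost) in descending order.
df$type_by_median <- with(df, reorder(type, -cost, median))

# Inspect the new level order; it should run from highest to lowest median.
print(levels(df$type_by_median))

# Per-group medians, listed in the reordered level order.
meds <- tapply(df$cost, df$type_by_median, median)
print(meds)

# The boxes are drawn in level order, so no relabeling of A, B, C is needed.
boxplot(cost ~ type_by_median, data = df,
        xlab = "Type", ylab = "Cost")
```

Because reorder() only changes the order of the factor levels, the original labels stay on the x-axis. Dropping the minus sign would sort from lowest to highest median instead.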
Answered by Joshua Ulrich, Sep 19 '22 14:09