How do I turn the numeric output of boxplot (with plot=FALSE) into something usable?

Tags:

r

boxplot

I'm successfully using the boxplot function to generate... boxplots. Now I need to generate tables containing the stats that boxplot calculates in order to create plots.

I do this by using the plot=FALSE option.

The problem is that this produces data in a rather bizarre format that I simply can't do anything with. Here's an example:

structure(list(stats = structure(c(178.998262143545, 182.227431564442, 
202.108456373209, 220.375358994654, 221.990406228232, 216.59986775699, 
217.054997032148, 228.509462713206, 267.070720949859, 284.832378859975, 
189.864120937198, 201.876421960518, 219.525439081472, 234.260088973545, 
279.343359793024, 209.472617639903, 209.526516071858, 214.785213079737, 
230.027361556731, 240.0647114578, 202.057148813419, 207.375619207685, 
220.093663781351, 226.246698737471, 240.343646265795), .Dim = c(5L, 
5L)), n = c(4, 6, 8, 4, 8), conf = structure(c(171.971593703341, 
232.245319043076, 196.247705331772, 260.771220094641, 201.435457751239, 
237.615420411705, 198.589545146688, 230.980881012787, 209.552007821332, 
230.635319741371), .Dim = c(2L, 5L)), out = numeric(0), group = numeric(0), 
names = c("U", "UM", "M", "LM", "L")), .Names = c("stats", "n", "conf", "out", "group", 
"names"))

What I want is a table for each of the stats -- min, max, median and the quartiles -- and their values for each group (the ones in "names").

Could somebody give me a hand with this? I'm very much an R beginner.

Thanks in advance!

Gil Williams asked Jan 13 '12 01:01


1 Answer

boxplot returns a structure in R called a list.

A list is more-or-less a data container where you can refer to elements by name. If you do A <- boxplot(...), you can access the names with A$names, the conf with A$conf, etc.
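As a minimal sketch of that list access: the data below is made up purely so there is something to pass to boxplot, and the names values and groups are hypothetical.

```r
# Hypothetical data, only to have something to call boxplot() on.
values <- c(1, 2, 3, 4, 5, 10, 12, 14, 16, 18)
groups <- factor(rep(c("a", "b"), each = 5))

A <- boxplot(values ~ groups, plot = FALSE)

class(A)    # the returned object is a plain list
names(A)    # component names: stats, n, conf, out, group, names
A$names     # the group labels
A[["n"]]    # same as A$n: number of observations per group
str(A)      # compact overview of every component
```

str(A) is usually the quickest way to see what a returned list holds before picking out individual components.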

So, looking at the boxplot helpfile ?boxplot under Value: (which tells you what boxplot returns), we see that it returns a list with the following components:

   stats: a matrix, each column contains the extreme of the lower
          whisker, the lower hinge, the median, the upper hinge and the
          extreme of the upper whisker for one group/plot.  If all the
          inputs have the same class attribute, so will this component.
       n: a vector with the number of observations in each group.    
    conf: a matrix where each column contains the lower and upper
          extremes of the notch.    
     out: the values of any data points which lie beyond the extremes
          of the whiskers.    
   group: a vector of the same length as ‘out’ whose elements indicate
          to which group the outlier belongs.    
   names: a vector of names for the groups.

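So the table you want is A$stats: each column belongs to one group and contains the lower whisker extreme, lower hinge, median, upper hinge, and upper whisker extreme. When a group has no outliers (as in your example, where out is empty), the whisker extremes are simply the min and max, and the hinges are close to the lower and upper quartiles.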
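One caveat worth seeing in action: per the help text quoted above, the first and last rows of stats are whisker extremes, so they only equal the min and max when a group has no outliers. The sketch below uses made-up data (the names values and groups are hypothetical) in which group "b" contains one extreme value.

```r
# Hypothetical data: group "b" contains one extreme value (100).
values <- c(1, 2, 3, 4, 5, 10, 11, 12, 13, 100)
groups <- factor(rep(c("a", "b"), each = 5))

A <- boxplot(values ~ groups, plot = FALSE)

A$stats[, 1]   # group "a" has no outliers: same values as fivenum(1:5)
A$stats[5, 2]  # group "b": the upper whisker stops below the maximum of 100

# Pair each outlier with the name of the group it belongs to.
outliers <- data.frame(group = A$names[A$group], value = A$out)
outliers
```

If you need the true min and max regardless of outliers, you would have to combine stats with out (or compute them from the raw data) rather than read rows 1 and 5 directly.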

You could do:

A <- boxplot(...)
mytable <- A$stats
colnames(mytable) <- A$names
rownames(mytable) <- c('min', 'lower quartile', 'median', 'upper quartile', 'max')
mytable

which returns (for mytable):

                      U       UM        M       LM        L
min            178.9983 216.5999 189.8641 209.4726 202.0571
lower quartile 182.2274 217.0550 201.8764 209.5265 207.3756
median         202.1085 228.5095 219.5254 214.7852 220.0937
upper quartile 220.3754 267.0707 234.2601 230.0274 226.2467
max            221.9904 284.8324 279.3434 240.0647 240.3436

Then you can refer to it like mytable['min','U'].
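If you want one table per statistic, or a data frame you can export, the labelled matrix can be sliced or transposed. This is only a sketch on made-up data; values, groups, by_group and the CSV file name are all hypothetical.

```r
# Hypothetical data with no outliers in either group.
values <- c(1, 2, 3, 4, 5, 10, 12, 14, 16, 18)
groups <- factor(rep(c("a", "b"), each = 5))
A <- boxplot(values ~ groups, plot = FALSE)

# Label the stats matrix as in the answer above.
mytable <- A$stats
colnames(mytable) <- A$names
rownames(mytable) <- c("min", "lower quartile", "median", "upper quartile", "max")

mytable["median", ]   # one statistic across all groups (a named vector)
mytable[, "b"]        # all five statistics for a single group

# One row per group, one column per statistic.
by_group <- as.data.frame(t(mytable))
by_group$n <- A$n     # keep the group sizes alongside the statistics
by_group

# Optionally write it out; the file name is just an example.
# write.csv(by_group, "boxplot_stats.csv")
```

Transposing first means each group becomes a row, which tends to be the more convenient shape for merging with other per-group data or for plotting.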

mathematical.coffee answered Oct 23 '22 10:10