
2 factor histogram analysis

Tags:

r

ggplot2

I've searched for a long time and haven't been able to find an answer to this problem.

Here is the problem: I have a data frame with the following variables: flow rate 1 (CH_SONAR), flow rate 2 (CH_SONAR_2T), density (CH_DENSITY), and the percent difference between the two flow rates (per_diff). I've created a 5-level factor for flow rate (cut from CH_SONAR_2T, as in the code below) and another 5-level factor for density.

f.factor <- cut(p.pipeline$CH_SONAR_2T, 5, labels = c('Very Low','Low', 'Medium', 'High', 'Very High'))

d.factor <- cut(p.pipeline$CH_DENSITY, 5, labels = c('Water', 'Very Sparce', 'Sparce', 'Dense', 'Very Dense'))

I've plotted histograms of per_diff with ggplot2, using each factor in turn as the fill variable:

qplot(per_diff, data = p.pipeline, geom = "histogram", binwidth = 1, xlim = c(-5, 15), fill = f.factor)

qplot(per_diff, data = p.pipeline, geom = "histogram", binwidth = 1, xlim = c(-5, 15), fill = d.factor)

Now I would like to create a histogram with ggplot2 that lets me see the relationship between flow rate and density (Water and Very Low, Very Sparce and Low, Sparce and Low, etc., for all 25 possible combinations). I've tried creating new factors, binding d.factor and f.factor to the data frame, and binding the two factors together, but none of these attempts worked. Do you have any idea how to approach this?

I tried to include the histograms I produced, but I don't think I have enough reputation to do so.

Thanks for all your help!

asked Apr 02 '13 20:04 by americo



1 Answer

You can use fill = interaction(f.factor, d.factor). Combinations that don't appear in the legend, such as 'Low.Very Sparce', are ones for which no observation belongs to both categories.
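
To see what interaction() produces before plotting, here is a minimal sketch in base R. The data frame `demo` and its values are made up for illustration (they are not the asker's p.pipeline); only the column names and factor labels are taken from the question.

```r
# Synthetic stand-in for p.pipeline; the values are invented for illustration.
set.seed(1)
flow <- runif(200, 0, 10)
demo <- data.frame(
  CH_SONAR_2T = flow,
  # Density loosely tied to flow so that some pairings never occur.
  CH_DENSITY  = 1 + flow / 10 + rnorm(200, sd = 0.05)
)

f.factor <- cut(demo$CH_SONAR_2T, 5,
                labels = c('Very Low', 'Low', 'Medium', 'High', 'Very High'))
d.factor <- cut(demo$CH_DENSITY, 5,
                labels = c('Water', 'Very Sparce', 'Sparce', 'Dense', 'Very Dense'))

# interaction() builds a single factor whose levels are all 5 x 5 pairings,
# named "<f.factor level>.<d.factor level>".
combo <- interaction(f.factor, d.factor)
print(nlevels(combo))

# Cross-tabulate to see which pairings actually occur; zero cells are the
# combinations that will be absent from the legend.
counts <- table(f.factor, d.factor)
print(counts)

# List the levels that have no observations.
empty <- levels(combo)[as.vector(table(combo) == 0)]
print(empty)
```

The cross-tabulation is a quick way to confirm that a combination missing from the legend is really empty in the data rather than dropped by mistake.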


If you want the colors of adjacent levels to stand out more, one option is to generate the colors with rainbow(), then swap every other color with its opposite on the color wheel.

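# One muted color (reduced saturation and value) per level of the combined factor.
col <- rainbow(nlevels(interaction(f.factor, d.factor)), v=.75, s=.5)
# Odd positions keep their own color; even positions take a color from
# roughly half the wheel away.
col.index <- ifelse(seq_along(col) %% 2, 
                    seq_along(col), 
                    (seq(ceiling(length(col)/2), length.out=length(col)) %% length(col)) + 1)
mixed <- col[col.index]
# Use the reordered palette as the manual fill scale.
qplot(per_diff, data = p.pipeline, 
      geom = "histogram", binwidth = 1, xlim = c(-5, 15), 
      fill = interaction(f.factor, d.factor)) + scale_fill_manual(values=mixed)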
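
A variant worth considering, offered as a sketch and not as the method above: working the indexing above through for 25 levels, some colors get picked twice and others are skipped. The version below interleaves the two halves of the wheel so each color is used exactly once, and writes the plot with ggplot() since qplot() is deprecated in recent ggplot2 releases (3.4.0 onward, as far as I know). The data are synthetic, and the names `combo`, `idx`, `odd` and `even` are introduced here for illustration.

```r
# Synthetic stand-in for p.pipeline; all values are invented for illustration.
set.seed(42)
flow <- runif(300, 0, 10)
p.pipeline <- data.frame(
  CH_SONAR_2T = flow,
  CH_DENSITY  = 1 + flow / 10 + rnorm(300, sd = 0.05),
  per_diff    = rnorm(300, mean = 5, sd = 3)
)
f.factor <- cut(p.pipeline$CH_SONAR_2T, 5,
                labels = c('Very Low', 'Low', 'Medium', 'High', 'Very High'))
d.factor <- cut(p.pipeline$CH_DENSITY, 5,
                labels = c('Water', 'Very Sparce', 'Sparce', 'Dense', 'Very Dense'))
combo <- interaction(f.factor, d.factor)

n   <- nlevels(combo)
col <- rainbow(n, v = .75, s = .5)

# Interleave the two halves of the wheel: odd positions walk through the first
# half, even positions through the second half, so every color is used once
# and neighbouring levels sit about half a wheel apart.
odd  <- seq(1, n, by = 2)
even <- seq(2, n, by = 2)
idx  <- integer(n)
idx[odd]  <- seq_along(odd)
idx[even] <- length(odd) + seq_along(even)

# Naming the colors by level keeps each combination's color fixed even when
# some combinations are absent from the data.
mixed <- setNames(col[idx], levels(combo))

# Plot only if ggplot2 is installed; values outside xlim are dropped with a warning.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(p.pipeline, aes(x = per_diff, fill = combo)) +
    geom_histogram(binwidth = 1) +
    xlim(-5, 15) +
    scale_fill_manual(values = mixed)
  print(p)
}
```

Because the palette is named, the legend still shows only the combinations present in the data, but a given combination keeps the same color across plots.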


answered Oct 09 '22 02:10 by Matthew Plourde
