Generating Stacked bar plots

Question

I have a data frame with three columns:

$x -- at http://pastebin.com/SGrRUJcA
$y -- at http://pastebin.com/fhn7A1rj
$z -- at http://pastebin.com/VmVvdHEE

that I wish to use to generate a stacked barplot. All of these columns hold integer data. The stacked barplot should have the levels along the x-axis and the data for each level along the y-axis. The stacks should then correspond to each of $x, $y and $z.

UPDATE: I now have the following:

counted <- data.frame(table(myDf$x),variable='x')
counted <- rbind(counted,data.frame(table(myDf$y),variable='y'))
counted <- rbind(counted,data.frame(table(myDf$z),variable='z'))
counted <- counted[counted$Var1!=0,]  # to get rid of 0th level??

stackedBp <- ggplot(counted,aes(x=Var1,y=Freq,fill=variable))
stackedBp <-  stackedBp+geom_bar(stat='identity')+scale_x_discrete('Levels')+scale_y_continuous('Frequency')
stackedBp

which generates:

[Image: stacked bar plot]

Two issues remain:

the x-axis labeling is not correct. For some reason, it goes: 46, 47, 53, 54, 38, 40.... How can I order it naturally?
I also wish to remove the 0th label.

I've tried using +scale_x_discrete(breaks = 0:50, labels = 1:50) but this doesn't work.

NB. axis labeling issue: Dataframe column appears incorrectly sorted

Justin · Accepted Answer

I'm not completely sure what you want to see, but ?barplot says the first argument, height, must be a vector or matrix. So to fix your initial error:

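# Example data; z = 1:10 is recycled to fill all 100 rows
myDf <- data.frame(x=sample(1:10,100,replace=TRUE),y=sample(11:20,100,replace=TRUE),z=1:10)
# barplot() needs a vector or matrix, so convert the data frame first
barplot(as.matrix(myDf))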

If you provide a reproducible example and a more specific description of your desired output you can get a better answer.

Or if I were to guess wildly (and use ggplot)...

library(ggplot2)

myDf <- data.frame(x=sample(1:10,100,replace=TRUE),y=sample(11:20,100,replace=TRUE),z=1:10)
# Count the occurrences of each value per column, then stack the results in long format
myDf.counted <- data.frame(table(myDf$x),variable='x')
myDf.counted <- rbind(myDf.counted,data.frame(table(myDf$y),variable='y'))
myDf.counted <- rbind(myDf.counted,data.frame(table(myDf$z),variable='z'))

ggplot(myDf.counted,aes(x=Var1,y=Freq,fill=variable))+geom_bar(stat='identity')
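
Regarding the axis ordering raised in the question's update: one likely cause is that Var1 is a factor, and rbind() appends levels that are new to it at the end rather than sorting them. A minimal sketch of one possible fix follows, assuming made-up sample data and a helper column named level (both are illustrative and not from the original post). It converts Var1 back to integers, drops the 0 level, and rebuilds the factor with numerically sorted levels:

```r
library(ggplot2)

set.seed(1)
# Illustrative data only; the real columns are in the pastebin links
myDf <- data.frame(x = sample(1:10, 100, replace = TRUE),
                   y = sample(0:20, 100, replace = TRUE),
                   z = sample(5:15, 100, replace = TRUE))

counted <- rbind(data.frame(table(myDf$x), variable = "x"),
                 data.frame(table(myDf$y), variable = "y"),
                 data.frame(table(myDf$z), variable = "z"))

# Var1 is a factor; go through character to recover the numeric values
counted$level <- as.integer(as.character(counted$Var1))

# Drop the 0 level, then rebuild the factor with numerically sorted levels
counted <- counted[counted$level != 0, ]
counted$Var1 <- factor(counted$level, levels = sort(unique(counted$level)))

p <- ggplot(counted, aes(x = Var1, y = Freq, fill = variable)) +
  geom_bar(stat = "identity") +
  scale_x_discrete("Levels") +
  scale_y_continuous("Frequency")
p
```

Because the bars follow the factor's level order, sorting the levels numerically should be enough to put the x-axis in natural order without setting breaks or labels by hand.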

IRTFM · Answer

I'm surprised that didn't blow up in your face. Cross-classifying the joint occurrence of three different vectors each of length 35204 would often consume many gigabytes of RAM (and would possibly create lots of useless 0's as you found). Maybe you wanted to examine instead the results of sapply(myDf, table)? This then creates three separate tables of counts.

It's a rather irregular result and would need further work to get it into a matrix form but you might want to consider using densityplot to display the comparative distributions which I think is your goal.

$x

   1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16 
 126  711 1059 2079 3070 2716 2745 3329 2916 2671 2349 2457 2055 1303  892  692 
  17   18   19   20   21   22   23   24   25   26   27   28   29   30   31   32 
 559  799  482  299  289  236  156  145  100   95  121  133   60   34   37   13 
  33   34   35   36   37   38   39   40   41   42   43   44   45   46   47   48 
  15   12   56   10    4    7    2   14   13   28   30   20   16   62   74   58 
  49   50 
  40   15 

$y

   0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15 
3069   32 1422 1376 1780 1556 1937 1844 1967 1699 1910 1924 1047  894  975  865 
  16   17   18   19   20   21   22   23   24   25   26   27   28   29   30   31 
 635 1002  710  908  979  848  678  908  696  491  417  412  499  411  421  217 
  32   33   34   35   36   37   39   42   46   47   53   54 
 265  182  121   47   38   11    2    2    1    1    1    4 

$z

   0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15 
  31  202  368  655  825 1246  900 1136 1098 1570 1613 1144 1107 1037 1239 1372 
  16   17   18   19   20   21   22   23   24   25   26   27   28   29   30   31 
1306 1085  843  867  813 1057 1213 1020 1210  939  725  644  617  602  739  584 
  32   33   34   35   36   37   38   39   40   41   42   43 
 650  733  756  681  684  657  544  416  220   48    7    1
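
As a rough sketch of the "further work" needed to get these counts into matrix form: if every column is tabulated against the same set of levels, the three tables have equal length and sapply() can simplify them into a matrix that base barplot() accepts. The sample data and the names lvls and counts below are assumptions for illustration only:

```r
set.seed(42)
# Illustrative data only; the columns deliberately cover different ranges
myDf <- data.frame(x = sample(1:10, 200, replace = TRUE),
                   y = sample(0:12, 200, replace = TRUE),
                   z = sample(3:8, 200, replace = TRUE))

# Use one common set of levels so every table has the same length
lvls <- sort(unique(unlist(myDf)))
counts <- sapply(myDf, function(v) table(factor(v, levels = lvls)))

# counts has one row per level and one column per variable.
# barplot() stacks the rows of its height matrix within each bar,
# so transpose to get one bar per level with x, y and z stacked.
barplot(t(counts), legend.text = TRUE, xlab = "Levels", ylab = "Frequency")
```

Levels that never occur in a column simply get a count of 0, which keeps the matrix rectangular.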

The density plot is really simple to create in lattice:

library(lattice)
densityplot(~ x + y + z, data = myDf)
