Reordering factor gives different results, depending on which packages are loaded

I wanted to create a barplot in which the bars were ordered by height rather than alphabetically by category. This worked fine when the only package I loaded was ggplot2. However, when I loaded a few more packages and ran the same code that created, sorted, and plotted my data frame, the bars had reverted to being sorted alphabetically again.

I checked the data frame each time using str() and it turned out that the attributes of the data frame were now different, even though I'd run the same code each time.

My code and output are listed below. Can anyone explain the differing behavior? Why does loading a few apparently unrelated packages (unrelated in the sense that none of the functions I'm using seem to be masked by the newly loaded packages) change the result of running the transform() function?

Case 1: Just ggplot2 loaded

library(ggplot2)

group = c("C","F","D","B","A","E")
num = c(12,11,7,7,2,1)
data = data.frame(group,num)
data1 = transform(data, group=reorder(group,-num))

> str(data1)
'data.frame':   6 obs. of  2 variables:
 $ group: Factor w/ 6 levels "C","F","B","D",..: 1 2 4 3 5 6
  ..- attr(*, "scores")= num [1:6(1d)] -2 -7 -12 -7 -1 -11
  .. ..- attr(*, "dimnames")=List of 1
  .. .. ..$ : chr  "A" "B" "C" "D" ...
 $ num  : num  12 11 7 7 2 1

Case 2: Load several more packages, then run the same code again

library(plyr)
library(xtable)
library(Hmisc)
library(gmodels)
library(reshape2)
library(vcd)
library(lattice)

group = c("C","F","D","B","A","E")
num = c(12,11,7,7,2,1)
data = data.frame(group,num)
data1 = transform(data, group=reorder(group,-num))

> str(data1)
'data.frame':   6 obs. of  2 variables:
 $ group: Factor w/ 6 levels "A","B","C","D",..: 3 6 4 2 1 5
 $ num  : num  12 11 7 7 2 1

UPDATE: sessionInfo()

Case 1: Ran sessionInfo() after loading ggplot2

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
  [1] C/en_US.UTF-8/C/C/C/C

attached base packages:
  [1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
  [1] ggplot2_0.9.1

loaded via a namespace (and not attached):
 [1] MASS_7.3-18        RColorBrewer_1.0-5 colorspace_1.1-1   dichromat_1.2-4    digest_0.5.2       grid_2.15.0
 [7] labeling_0.1       memoise_0.1        munsell_0.3        plyr_1.7.1         proto_0.3-9.2      reshape2_1.2.1
[13] scales_0.2.1       stringr_0.6        tools_2.15.0

Case 2: Ran sessionInfo() after loading the additional packages

> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
  [1] C/en_US.UTF-8/C/C/C/C

attached base packages:
  [1] grid      splines   stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] lattice_0.20-6   vcd_1.2-13       colorspace_1.1-1 MASS_7.3-18      reshape2_1.2.1   gmodels_2.15.2
[7] Hmisc_3.9-3      survival_2.36-14 xtable_1.7-0     plyr_1.7.1       ggplot2_0.9.1

loaded via a namespace (and not attached):
 [1] RColorBrewer_1.0-5 cluster_1.14.2     dichromat_1.2-4    digest_0.5.2       gdata_2.8.2        gtools_2.6.2
 [7] labeling_0.1       memoise_0.1        munsell_0.3        proto_0.3-9.2      scales_0.2.1       stringr_0.6
[13] tools_2.15.0

eipi10 asked Jun 07 '12 20:06


1 Answer

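This happens because:

  1. gmodels imports gdata, so loading gmodels also loads the gdata namespace (gdata_2.8.2 is listed under "loaded via a namespace" in your Case 2 sessionInfo()).
  2. gdata defines a reorder.factor method for the reorder generic. Because group is a factor, reorder(group, -num) then dispatches to that method instead of falling through to reorder.default.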
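The mechanism behind these two points is ordinary S3 dispatch, which can be sketched with a toy generic. The names describe, describe.default and describe.factor below are hypothetical stand-ins for reorder and its methods, not anything from stats or gdata; the point is only that adding a class-specific method changes the result of the same call without replacing any existing function.

```r
# Toy generic standing in for reorder() (hypothetical names throughout)
describe <- function(x, ...) UseMethod("describe")

# Fallback method, analogous to reorder.default
describe.default <- function(x, ...) "default method"

f <- factor(c("a", "b"))

# No factor-specific method exists yet, so dispatch falls back to the default
before <- describe(f)

# Defining a factor method (which is what gdata does for reorder)
describe.factor <- function(x, ...) "factor method"

# The identical call now reaches the new, more specific method
after <- describe(f)

print(c(before = before, after = after))
```

Nothing was overwritten here: describe and describe.default are untouched, yet describe(f) returns a different value once the factor method exists.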

Start a clean session. Then:

methods("reorder")
[1] reorder.default*    reorder.dendrogram*

Now load gdata (or load gmodels, which has the same effect):

library(gdata)
methods("reorder")
[1] reorder.default*    reorder.dendrogram* reorder.factor 

Notice that reorder itself is not masked: base R has no reorder.factor, so gdata's method is added alongside the existing ones rather than replacing any of them.

Recreate the problem, but this time call each package's method explicitly:
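If you want to check this from inside a session rather than by reading the methods() listing, something along these lines should work. It is a minimal sketch that relies on utils::getS3method with optional = TRUE, which returns NULL when no such method is registered; the variable name m is just illustrative.

```r
# Ask whether a factor-specific method for reorder() is currently available.
# optional = TRUE makes getS3method() return NULL instead of raising an error.
m <- utils::getS3method("reorder", "factor", optional = TRUE)

if (is.null(m)) {
  # No reorder.factor: factors fall through to reorder.default
  message("no reorder.factor registered; reorder.default will be used")
} else {
  # Report which namespace or environment supplied the method
  message("reorder.factor comes from: ", environmentName(environment(m)))
}
```

Running this before and after loading gmodels or gdata is one way to confirm which package introduced the method.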

group = c("C","F","D","B","A","E")
num = c(12,11,7,7,2,1)
data = data.frame(group,num)

The base R version (using reorder.default):

str(transform(data, group=stats:::reorder.default(group,-num)))
'data.frame':   6 obs. of  2 variables:
 $ group: Factor w/ 6 levels "C","F","B","D",..: 1 2 4 3 5 6
  ..- attr(*, "scores")= num [1:6(1d)] -2 -7 -12 -7 -1 -11
  .. ..- attr(*, "dimnames")=List of 1
  .. .. ..$ : chr  "A" "B" "C" "D" ...
 $ num  : num  12 11 7 7 2 1

The gdata version (using reorder.factor):

str(transform(data, group=gdata:::reorder.factor(group,-num)))
'data.frame':   6 obs. of  2 variables:
 $ group: Factor w/ 6 levels "A","B","C","D",..: 3 6 4 2 1 5
 $ num  : num  12 11 7 7 2 1
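If the goal is just bars ordered by height regardless of which packages are loaded, one option is to avoid reorder's dispatch altogether, either by calling stats:::reorder.default directly as above or by setting the factor levels yourself. The following is a sketch of the second approach using only base R; it assumes each group appears exactly once in the data, and note that tied values (here D and B, both 7) keep their row order, so the level order can differ from what reorder.default gives for ties.

```r
group <- c("C", "F", "D", "B", "A", "E")
num <- c(12, 11, 7, 7, 2, 1)

# stringsAsFactors = TRUE mirrors the factor column shown in the str() output
data <- data.frame(group, num, stringsAsFactors = TRUE)

# Levels in order of decreasing num; assumes group values are unique,
# because factor() does not accept duplicated levels
ordered_levels <- as.character(data$group)[order(-data$num)]
data$group <- factor(data$group, levels = ordered_levels)

print(levels(data$group))
```

Because this never calls reorder, a reorder.factor method contributed by another package cannot change the outcome.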

Andrie answered Nov 16 '22 01:11