
Some plots not rendering in Rstudio, knitr, Rmarkdown

I am using: Ubuntu 12.04 64-bit, R 3.0.2, RStudio 0.98.312, knitr 1.5, markdown 0.6.3, mgcv 1.7-27

I have an Rmarkdown document with multiple code chunks. In the middle of one chunk there are some bits of code where I fit a GAM, summarise the fit and plot the fit. The problem is that the first plot renders into the output file but the second plot does not. Here is a sanitised code fragment from the chunk:

fit <- gam(y ~ s(x), data=j0, subset= !is.na(x))
summary(fit) # look at non-missing only
plot(fit)

fit <- gam(y ~ s(sqrt(x)), data=j0, subset= !is.na(x))
summary(fit)
plot(fit)

mean(y[is.na(x)]) - mean(y[!is.na(x)])

Everything renders as expected except that the output goes straight from echoing the second plot statement to echoing the following calculation on means. The result of the means calculation is rendered correctly.

If I comment out another plot call 7 lines later in the chunk, then the missing plot is rendered correctly.

Does anyone have any suggestions as to what is happening here?

UPDATE BELOW

Summary - Several lines after the call for Plot 2 there is some R code that generates an execution error (variable not found) and several lines after that there is a call for Plot 3. If the code error is fixed then Plot 2 is rendered. If the code error is unfixed and the call to Plot 3 is commented out, then Plot 2 is rendered. The problem depends on the same variable 'fit' being used to store the results of the different fits. If I assign each fit to a different variable Plot 2 renders OK.

I don't understand how changes made after multiple lines of successfully executed code can (apparently retrospectively) prevent Plot 2 from rendering.

Reproducible example:

Some text.

```{r setup}
require(mgcv)

mkdata <- function(n=100) {
  x <- rnorm(n) + 5
  y <- x + 0.3 * rnorm(n)
  x[sample(ceiling(n/2), ceiling(n/10))] <- NA
  x <- x^2
  data.frame(x, y)  
} 
```

Example 1
=========

Plot 2 fails to render. (Using the same fit object for each fit.)

```{r example_1}
j0 <- mkdata()
attach(j0)
mx <- min(x, na.rm=TRUE)

fit <- gam(y ~ s(x), data=j0, subset= !is.na(x))
summary(fit)
plot(fit) # plot 1

fit <- gam(y ~ s(sqrt(x)), data=j0, subset= !is.na(x))
summary(fit)
plot(fit) #plot 2

mean(y[is.na(x)]) - mean(y[!is.na(x)]) # means calculation

# recode the missing values
j0$x.na <- is.na(x)
j0$x.c <- ifelse(x.na, mx, x) # ERROR in recode
detach()

attach(j0)
fit <- gam(y ~ s(sqrt(x.c)) + x.na, data=j0) # doesn't run because of error in recode
summary(fit) # this is actually fit 2
plot(fit) # plot 3 (this is actually fit 2)
detach()
```

Example 2
=========

Use separate fit objects for each fit. Plot 2 renders OK.

```{r example_2}
j0 <- mkdata()
attach(j0)
mx <- min(x, na.rm=TRUE)

fit1 <- gam(y ~ s(x), data=j0, subset= !is.na(x))
summary(fit1)
plot(fit1) # plot 1

fit2 <- gam(y ~ s(sqrt(x)), data=j0, subset= !is.na(x))
summary(fit2)
plot(fit2) #plot 2

mean(y[is.na(x)]) - mean(y[!is.na(x)]) # means calculation

# recode the missing values
j0$x.na <- is.na(x)
j0$x.c <- ifelse(x.na, mx, x) # ERROR in recode
detach()

attach(j0)
fit3 <- gam(y ~ s(sqrt(x.c)) + x.na, data=j0) # doesn't run because of error in recode
summary(fit3)
plot(fit3) # plot 3
detach()
```

Example 3
=========

Revert to using the same fit object for each fit. Plot 2 renders because plot 3 is commented out.

```{r example_3}
j0 <- mkdata()
attach(j0)
mx <- min(x, na.rm=TRUE)

fit <- gam(y ~ s(x), data=j0, subset= !is.na(x))
summary(fit)
plot(fit) # plot 1

fit <- gam(y ~ s(sqrt(x)), data=j0, subset= !is.na(x))
summary(fit)
plot(fit) #plot 2

mean(y[is.na(x)]) - mean(y[!is.na(x)]) # means calculation

# recode the missing values
j0$x.na <- is.na(x)
j0$x.c <- ifelse(x.na, mx, x) # ERROR in recode
detach()

attach(j0)
fit <- gam(y ~ s(sqrt(x.c)) + x.na, data=j0)
summary(fit) # this is actually fit 2
# plot(fit) # plot 3 (this is actually fit 2)
detach()
```

Example 4
=========

Plot 2 renders because later recode error is fixed.

```{r example_4}
j0 <- mkdata()
attach(j0)
mx <- min(x, na.rm=TRUE)

fit <- gam(y ~ s(x), data=j0, subset= !is.na(x))
summary(fit)
plot(fit) # plot 1

fit <- gam(y ~ s(sqrt(x)), data=j0, subset= !is.na(x))
summary(fit)
plot(fit) #plot 2

mean(y[is.na(x)]) - mean(y[!is.na(x)]) # means calculation

# recode the missing values
j0$x.na <- is.na(x)
j0$x.c <- ifelse(j0$x.na, mx, x) # error in recode fixed
detach()

attach(j0)
fit <- gam(y ~ s(sqrt(x.c)) + x.na, data=j0)
summary(fit)
plot(fit) # plot 3
detach()
```

The log file:

> require(knitr); knit('reproduce.Rmd', encoding='UTF-8');
Loading required package: knitr


processing file: reproduce.Rmd
  |......                                                           |   9%
  ordinary text without R code

  |............                                                     |  18%
label: setup
  |..................                                               |  27%
  ordinary text without R code

  |........................                                         |  36%
label: example_1
  |..............................                                   |  45%
  ordinary text without R code

  |...................................                              |  55%
label: example_2
  |.........................................                        |  64%
  ordinary text without R code

  |...............................................                  |  73%
label: example_3
  |.....................................................            |  82%
  ordinary text without R code

  |...........................................................      |  91%
label: example_4
  |.................................................................| 100%
  ordinary text without R code


output file: reproduce.md

[1] "reproduce.md"
Ross Gayler, asked Oct 10 '13 02:10


1 Answer

You are just yet another victim of attach(), despite the fact that people have been warning against the use of attach(). It is too easy to screw up with attach(). You did this after you attach(j0):

j0$x.na <- is.na(x)
j0$x.c <- ifelse(x.na, mx, x) # ERROR in recode

Of course, R cannot find the object x.na because it does not exist anywhere on the search path. Yes, it is in j0 now, but attach() copied the columns of j0 onto the search path at the moment you called it, so the new column will not be visible to R unless you detach j0 and re-attach it. In other words, attach() does not refresh itself automatically as you add more variables to j0. So the simple fix is:

j0$x.c <- ifelse(j0$x.na, mx, x)
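To see the stale-copy behaviour in isolation, here is a minimal sketch. The data frame `toy` and the flags `visible_before` and `visible_after` are hypothetical names used only for illustration, and the sketch assumes a fresh session in which no other object called `x.na` exists.

```r
# Hypothetical toy data frame standing in for j0
toy <- data.frame(x = c(1, NA, 3))

attach(toy)                        # copies the columns of toy onto the search path
toy$x.na <- is.na(x)               # adds a column to toy, not to the attached copy
visible_before <- exists("x.na")   # the attached copy does not know about x.na

detach("toy")                      # name the database explicitly
attach(toy)                        # re-attaching picks up the new column
visible_after <- exists("x.na")
detach("toy")
```

Under those assumptions `visible_before` should be FALSE and `visible_after` TRUE, which mirrors the "object not found" error in the recode line.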

I understand why you want to use attach(): you can avoid the awkward j0$ prefix everywhere. But you need to be very careful with it. Besides the problem I mentioned, the bare detach() call is also bad, because you did not specify which environment to detach. By default, the second one on the search path is detached, which is not necessarily the one you attached, e.g. you might have loaded other packages onto the search path in the meantime. Therefore you must be explicit: detach('j0').
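If the goal is only to avoid typing the j0$ prefix, one alternative (not what the question uses, just a sketch) is base R's with() and within(), which evaluate an expression inside a data frame without touching the search path. The data frame `toy` and its values below are made up for illustration.

```r
# Hypothetical toy data standing in for j0
toy <- data.frame(x = c(4, NA, 9), y = c(1, 2, 3))
mx <- min(toy$x, na.rm = TRUE)

# within() evaluates the block using toy's columns and returns a modified copy;
# variables created in the block (x.na, x.c) become new columns
toy <- within(toy, {
  x.na <- is.na(x)
  x.c  <- ifelse(x.na, mx, x)   # x.na is visible here because it was just created
})

# with() evaluates a single expression using toy's columns
diff_means <- with(toy, mean(y[x.na]) - mean(y[!x.na]))
```

Because nothing is attached, there is nothing to detach and no stale copy to worry about.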

Back to knitr: I can explain what is going on if you wish to know, but first of all, you have to make sure your code actually works before passing it to knitr. Once the error is eliminated, the odd phenomenon you observed will also go away.
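One way to check the code outside of knitr is to extract the chunks with purl() and run the resulting script with source(), which stops at the first error. This is only a sketch: the tiny Rmd file written below, and the names `rmd`, `script` and `result`, are hypothetical stand-ins for your real document.

```r
library(knitr)

# Write a tiny hypothetical Rmd file containing a deliberate error
rmd <- tempfile(fileext = ".Rmd")
writeLines(c("```{r}", "a <- 1", "b <- undefined_variable + a", "```"), rmd)

# purl() extracts the chunk code into a plain R script and returns its path
script <- purl(rmd, output = tempfile(fileext = ".R"), quiet = TRUE)

# source() halts at the first error; capture the message instead of aborting
result <- tryCatch(
  {
    source(script, local = new.env())
    "ok"
  },
  error = function(e) conditionMessage(e)
)
```

Here `result` holds the error message from the failing line, so the problem shows up before the document is ever knitted.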

Yihui Xie, answered Sep 30 '22 12:09