
RMarkdown: knitr::purl() on Python code chunk?

I want to export the Python code chunks in my RMarkdown document to an external file. knitr::purl() does this, but I can only get it to work on R code chunks. Does it not work for languages other than R?

For example, I would like to export the Python code from the document below into a my_script.py file.

---
title: "Untitled"
output: html_document
---

## Header

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod 
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, 
quis nostrud exercitation ullamco laboris nisi ut aliquip

```{python}
x = 10
y = 20

z = x + y
print(z)
```

asked Jan 27 '23 17:01 by Shantanu


1 Answer

Currently, purl() writes non-R code to the output file as comments, so we need to redefine the function that produces the output in order to override this.

Here is a simple script that (1) outputs Python code only and (2) strips the documentation. I took the function from the knitr source and hacked it:

library("knitr")

# New processing functions
process_tangle <- function (x) { 
    UseMethod("process_tangle", x)
}

process_tangle.block <- function (x) {
    params = opts_chunk$merge(x$params)

    # Suppress any code but python
    if (params$engine != 'python') {
        params$purl <- FALSE
    }
    if (isFALSE(params$purl)) 
        return("")
    label = params$label
    ev = params$eval
    code = if (!isFALSE(ev) && !is.null(params$child)) {
        cmds = lapply(sc_split(params$child), knit_child)
        one_string(unlist(cmds))
    }
    else knit_code$get(label)
    if (!isFALSE(ev) && length(code) && any(grepl("read_chunk\\(.+\\)", 
        code))) {
        eval(parse_only(unlist(stringr::str_extract_all(code, 
            "read_chunk\\(([^)]+)\\)"))))
    }
    code = knitr:::parse_chunk(code)
    if (isFALSE(ev)) 
        code = knitr:::comment_out(code, params$comment, newline = FALSE)
    # Output only the code, no documentation
    return(knitr:::one_string(code))
}

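# Reassign functions
# Evaluate the replacement inside the knitr namespace so that unexported
# helpers called without a prefix (sc_split, one_string, parse_only) resolve
environment(process_tangle.block) <- asNamespace("knitr")
assignInNamespace("process_tangle.block",
                  process_tangle.block,
                  ns="knitr")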

# Purl
purl("tmp.Rmd", output="tmp.py")

Here is my tmp.Rmd file. Note that it has an R chunk, which I do not want in the result:

---
title: "Untitled"
output: html_document
---

## Header

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod 
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, 
quis nostrud exercitation ullamco laboris nisi ut aliquip

```{python}
#!/usr/bin/env python
# A python script
```

```{python} 
x = 10
y = 20

z = x + y
print(z)
```

```{r}
y=5
y
```

With the script above saved as extract.R, running Rscript extract.R produces tmp.py:

#!/usr/bin/env python
# A python script

x = 10
y = 20

z = x + y
print(z)
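
If patching knitr internals feels too fragile, the same two steps (keep only the Python chunks, drop everything else) can be approximated without knitr at all. The following is only a rough sketch, not what purl() does internally: the names FENCE, CHUNK_RE, extract_python_chunks and purl_python are made up for illustration, the regular expression is a simplifying assumption that handles only unindented top-level fences, and chunk options such as purl=FALSE or eval=FALSE are ignored.

```python
import re

# Three backticks, built indirectly so this listing contains no literal fence.
FENCE = "`" * 3

# Assumption: a python chunk starts with a fence followed by {python ...} at
# the beginning of a line and ends at the next line holding only a fence.
CHUNK_RE = re.compile(
    r"^" + FENCE + r"\{python\b[^}]*\}[ \t]*\n(.*?)^" + FENCE + r"[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def extract_python_chunks(rmd_text):
    """Return the bodies of all python chunks, separated by one blank line."""
    bodies = [m.group(1).rstrip("\n") for m in CHUNK_RE.finditer(rmd_text)]
    return "\n\n".join(bodies) + "\n" if bodies else ""


def purl_python(rmd_path, out_path):
    """Hypothetical helper: write the python chunks of rmd_path to out_path."""
    with open(rmd_path, encoding="utf-8") as src:
        code = extract_python_chunks(src.read())
    with open(out_path, "w", encoding="utf-8") as dst:
        dst.write(code)
```

Calling purl_python("tmp.Rmd", "tmp.py") on the file above should give the same two chunks separated by a blank line, with the R chunk and the prose left out.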

P.S. I found this question while searching for a solution to the same problem. Since nobody had answered it, I developed my own solution :)

answered Jan 29 '23 05:01 by Boris