
Using org-mode to structure an analysis

I am trying to make better use of org-mode for my projects. I think literate programming is especially applicable to the realm of data analysis and org-mode lets us do some pretty awesome literate programming.

I think most of you will agree with me that the workflow for writing an analysis is different than most other types of programming. I don't just write a program, I explore the data. And, while many of these explorations are dead-ends, I don't want to delete/ignore them completely. I just don't want to re-run them every time I execute the org file. I also tend to find or develop chunks of useful code that I would like to put into an analytic template, but some of these chunks won't be relevant for every project and I'd like to know how to make org-mode ignore these chunks when I am executing the entire buffer. Here's a simplified example.

* Import
  - I want org-mode to ignore import-sql.
#+srcname: import-data
#+begin_src R :exports none :noweb yes
<<import-csv>>
#+end_src

#+srcname: import-csv
#+begin_src R :exports none
data <- read.csv("foo-clean.csv")
#+end_src

#+srcname: import-sql
#+begin_src R :exports none
library(RSQLite)
blah blah blah
#+end_src

* Clean
  - This is run on foo.csv, producing foo-clean.csv
  - Fixes the mess of -9 and -13 to NA for my sanity.
  - This only needs to be run once, and after that, reference.
  - How can I tell org-mode to skip this?
#+srcname: clean-csv
#+begin_src sh :exports none
sed .....
#+end_src

* Explore

** Explore by a factor (1)
   - Dead end. Did not pan out. Ignore.
   - Produces a couple of charts showing there is not interaction.
#+srcname: explore-by-a-factor-1
#+begin_src R :exports none :noweb yes
#+end_src

** Explore by a factor (2)
   - A useful exploration that I will reference later in a report.
   - Produces a couple of charts showing the interaction of my variables.
#+srcname: explore-by-a-factor-2
#+begin_src R :exports none :noweb yes
#+end_src

I would like to be able to use org-babel-execute-buffer and have org-mode somehow know to skip over the code blocks import-sql, clean-csv and explore-by-a-factor-1. I want them in the org file, because they are relevant to the project. After all, tomorrow someone might want to know why I was so sure explore-by-a-factor-1 was not useful. I want to keep that code around, so I can bang out the plot or the analysis or whatever and go on, but not have it run every time I rerun everything, because there's no reason to run it. Ditto with the clean-csv stuff. I want it around, to document what I did to the data (and why), but I don't want to re-run it every time. I'll just import foo-clean.csv.

I Googled all over for this and read a bunch of org-mode mailing list archives. I was able to find a couple of ideas, but not what I want. EXPORT_SELECT_TAGS and EXPORT_EXCLUDE_TAGS are great when exporting the file, and the :tangle header works well when creating the actual source files. I don't want to do either of these. I just want to execute the buffer, and I would like to be able to mark code blocks in a similar fashion as to be executed or ignored. I guess I would like to find a way to have an org variable such as:

EXECUTE_SELECT_TAGS

This way I could simply tag my various code blocks and be done with it. It would be even nicer if I could then run the file, using only source blocks with specific tags. I can't find a way to do this and I thought I would ask before asking/begging for a new feature in org-mode.

asked Nov 29 '10 15:11 by Choens


2 Answers

I figured it out. From the org manual:

The :eval header argument can be used to limit the evaluation of specific code blocks. :eval accepts two arguments “never” and “query”. :eval never will ensure that a code block is never evaluated, this can be useful for protecting against the evaluation of dangerous code blocks. :eval query will require a query for every execution of a code block regardless of the value of the org-confirm-babel-evaluate variable.

So you just have to add

:eval never

to the header of the blocks that you don't want to execute, and voilà!
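As a rough sketch of what that could look like for the blocks in the question (not taken from the manual): the first block carries `:eval never` directly in its header, and the second heading sets it for every block in its subtree through a `header-args` property, which assumes a reasonably recent Org release. The `echo` and `plot` lines are hypothetical placeholders standing in for the real code, and `#+name:` is used here because newer Org versions use it in place of `#+srcname:`.

```org
* Clean
  - Run once on foo.csv to produce foo-clean.csv; kept for documentation.
#+name: clean-csv
#+begin_src sh :exports none :eval never
# hypothetical placeholder for the real sed command
echo "cleaning step goes here"
#+end_src

* Explore

** Explore by a factor (1)
   :PROPERTIES:
   :header-args: :eval never
   :END:
   - Dead end. Every block under this heading inherits :eval never.
#+name: explore-by-a-factor-1
#+begin_src R :exports none
# hypothetical placeholder for the dead-end plot
plot(data)
#+end_src
```

With headers like these, org-babel-execute-buffer should skip both blocks and report that evaluation is disabled for them. To re-run one of them later, remove the header argument or the property first.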

answered Oct 29 '22 14:10 by Julian


While I never did get an answer to my question, the discussion was interesting, and apparently an org-mode based template for R strikes a few people as an interesting idea. I downloaded the source code to org-mode and looked at org-babel-execute-buffer. It is, as I feared, a naive function which does precisely what it says it does and nothing more. It is not (currently) possible to pass it any additional parameters to affect its behavior. (Unless I am badly misreading the lisp, which is entirely possible.)

Eventually, I decided org-babel-execute-buffer is not necessary for a useful R template system. Babel's noweb functionality is really flexible and I think it is possible to build a workable solution using noweb, rather than trying to develop a complex tagging schema to define how/when to run things.
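One way the noweb approach might be laid out, as a minimal sketch rather than what the template actually does: a single driver block pulls in only the chunks that should run, and you execute that block with C-c C-c instead of executing the whole buffer. The block name `run-analysis` and the `summary(data)` line are hypothetical, and `#+name:` again stands in for the older `#+srcname:`.

```org
#+name: import-csv
#+begin_src R :exports none
data <- read.csv("foo-clean.csv")
#+end_src

#+name: explore-by-a-factor-2
#+begin_src R :exports none
# hypothetical placeholder for the useful exploration
summary(data)
#+end_src

#+name: run-analysis
#+begin_src R :noweb yes :results output
<<import-csv>>
<<explore-by-a-factor-2>>
#+end_src
```

Chunks that are not referenced from `run-analysis` (import-sql, clean-csv, the dead-end exploration) stay in the file as documentation and are simply never expanded into the driver block. Note that org-babel-execute-buffer would still run every block, so this only helps if the driver block is what gets executed.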

For tangling/export it should still be possible to use tags to create usable/sane output.

For anyone who is interested: LiterateR

It's probably a little rude to use this thread to put this out there, but this is why I asked the question in the first place. TemplateR is my attempt to make R a little easier to use. Right now it is just a template with two simplistic functions. I consider it to be a proof of concept at this point. Eventually, I want to develop something that does more to help people develop R projects more quickly. TemplateR will accomplish this by:

1. Providing a strong structure to develop around.
2. Providing built-in functions that support common tasks, especially in the realm of reproducible research.
3. Providing snippets of tested code that can be rapidly repurposed for the current project.

Right now, all it provides is a basic structure/framework and two simple functions:

1. One identifies which R packages are missing (based on what is manually entered into a table).
2. One creates the project directories (plots, data, reports).

More will come in future versions. The README.org and TODO.org go into further detail.

answered Oct 29 '22 14:10 by Choens
