
automated text for reproducible research

I am using RStudio, R Markdown, LaTeX, and Pandoc to clean data, construct variables, run my analysis, and report the results. I'm new to the concept of reproducible research, but I'm hooked. It makes a lot of sense.

Dynamic tables and figures are no problem. Dynamic text, however, is stumping me. I can insert inline code to say that 95% of all statistics are false, but I am not sure how I can vary my language in a reproducible way.

For instance, what if I have an object x=0.66 and I want to write "2 out of 3 dentists use Crest"? I can look at the current value of x, 0.66, and type "2 out of 3" in the text, but this is not reproducible. Let's say I get new data and rerun my analysis and x becomes 0.52. My text would be out of date. Sure, I could dynamically report that 52% of dentists prefer Crest, but a report gets stale when everything is reported as percentages.

My thought is that I could create functions to call in the text whenever I want to vary the writing. For instance, an "out.of" function could use nested ifelse statements to produce the text:

ifelse(x < 0.09, "fewer than 1 out of 10",
ifelse(x >= 0.09 & x < 0.11, "roughly 1 out of 10",
ifelse(x >= 0.11 & x < 0.15, "slightly more than 1 out of 10",
ifelse(x >= 0.15 & x < 0.19, "nearly 1 out of 5",
ifelse(x >= 0.19 & x < 0.21, "roughly 1 out of 5",
...
ifelse(x >= 0.95 & x < 0.99, "nearly all",
ifelse(x >= 0.99, "all", "fubar"))...)
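One way to avoid writing out every threshold by hand is to compute the nearest "n out of denom" and pick a qualifier from the distance to it. This is only a sketch: the name out_of, the denom and tol arguments, and the exact wording are hypothetical choices, and the cut-offs for "fewer than", "nearly all" and "all" are taken loosely from the thresholds above.

```r
# Hypothetical helper: turn a proportion in [0, 1] into an "n out of denom" phrase.
# `denom` (the denominator) and `tol` (how close counts as "roughly") are assumptions.
out_of <- function(x, denom = 10, tol = 0.01) {
  stopifnot(is.numeric(x), all(x >= 0 & x <= 1))
  n <- round(x * denom)        # nearest numerator
  gap <- x - n / denom         # signed distance from that fraction
  qualifier <- ifelse(abs(gap) <= tol, "roughly",
                      ifelse(gap < 0, "nearly", "slightly more than"))
  phrase <- paste(qualifier, n, "out of", denom)
  # Extremes, following the thresholds sketched in the ifelse chain above
  phrase[x < 1 / denom - tol] <- paste("fewer than 1 out of", denom)
  phrase[x >= 0.95] <- "nearly all"
  phrase[x >= 0.99] <- "all"
  phrase
}

out_of(0.66, denom = 3)  # "roughly 2 out of 3"
out_of(0.52)             # "slightly more than 5 out of 10"
```

In an R Markdown document this could then be called inline, for example `r out_of(x, denom = 3)` dentists use Crest, so the wording follows x when the data change.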

I could also create a fraction function that would do something similar for one-tenth, two-fifths, one-third...
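A rough sketch of what such a fraction function might look like, assuming a small lookup table of named fractions and simply picking the closest one. The names fraction_names and as_fraction_words, and the list of fractions itself, are hypothetical and would need adjusting to taste.

```r
# Hypothetical lookup of named fractions; extend or reword as needed.
fraction_names <- c(
  "one-tenth" = 1/10, "one-fifth" = 1/5, "one-quarter" = 1/4,
  "one-third" = 1/3, "two-fifths" = 2/5, "one-half" = 1/2,
  "three-fifths" = 3/5, "two-thirds" = 2/3, "three-quarters" = 3/4,
  "four-fifths" = 4/5, "nine-tenths" = 9/10
)

# Return the name of the listed fraction closest to a single proportion x.
as_fraction_words <- function(x, fractions = fraction_names) {
  stopifnot(is.numeric(x), length(x) == 1)
  nearest <- which.min(abs(fractions - x))  # index of the closest named fraction
  names(fractions)[nearest]
}

as_fraction_words(0.66)  # "two-thirds"
as_fraction_words(0.52)  # "one-half"
```

This always returns the nearest entry, however far away it is, so a real version would probably want a qualifier ("about", "just over") or a fallback to a percentage when nothing in the table is close.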

I'm sure others have tackled this issue already. Any leads? Ideas?

Eric Green asked Dec 29 '12 19:12




1 Answer

There is a package called FRACTION; if you replace the "/" in its output with "out of", it could work. However, the output becomes strange when the number of decimals is changed:

library(FRACTION)
fra(0.66,j=2)
# [1] "33 / 50"
fra(0.66,j=1)
# [1] "7 / 1e+08"
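A minimal sketch of the "/" to "out of" substitution mentioned above, using base R's sub(). To keep it runnable without the FRACTION package, the string is copied from the first output shown; the variable names frac and phrase are just placeholders.

```r
# Output string copied from the fra(0.66, j = 2) call above,
# so this sketch runs without the FRACTION package installed.
frac <- "33 / 50"

# fixed = TRUE treats "/" as a literal string rather than a regular expression
phrase <- sub("/", "out of", frac, fixed = TRUE)
phrase  # "33 out of 50"
```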

Edit by @Dieter Menne: forget this, see @Ben Bolker below.

Dieter Menne answered Oct 16 '22 12:10