C for R programmers - recommended resources/approaches once past the basics [closed]

Tags: c, r

I would like to improve my C skills in order to be more competent at converting R code to C where this would be useful. What hints do people have that will help me on my way?

Background: I followed an online Intro to C course a few years ago and that plus Writing R Extensions and S Programming (Venables & Ripley) enabled me to convert bottleneck operations to C, e.g. computing the product of submatrices (did I re-invent the wheel there?). However I would like to go a bit beyond this, e.g. converting larger chunks of code, making use of linear algebra routines etc.
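
As an illustration of the kind of bottleneck described above, here is a minimal sketch of a submatrix product in plain C. It is not the asker's actual code: the name `submat_prod` and its argument layout are hypothetical. The one fact it relies on is that R stores matrices column-major, so element (i, j) of a matrix with `ld` rows sits at offset `i + j * ld`.

```c
#include <stddef.h>

/*
 * Hypothetical helper: multiply an m-by-k block of A by a k-by-n block of B.
 * Both matrices are column-major, as in R. lda and ldb are the row counts
 * of the full matrices; the *_row0 / *_col0 arguments are the zero-based
 * top-left corners of the two blocks. The m-by-n result is written to
 * `out`, also column-major.
 */
void submat_prod(const double *a, int lda, int a_row0, int a_col0,
                 const double *b, int ldb, int b_row0, int b_col0,
                 int m, int k, int n, double *out)
{
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            double sum = 0.0;
            for (int l = 0; l < k; l++) {
                sum += a[(a_row0 + i) + (size_t)(a_col0 + l) * lda]
                     * b[(b_row0 + l) + (size_t)(b_col0 + j) * ldb];
            }
            out[i + (size_t)j * m] = sum;
        }
    }
}
```

For anything beyond small blocks, a BLAS routine such as `dgemm` would normally replace the triple loop; the sketch is only meant to show the indexing.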

No doubt I have more to learn from the resources I used before, but I wondered if there were others that people recommend? Working through examples is obviously one way to learn more: Brian Ripley gave a couple of examples of moving from S prototypes to S+C in this workshop on Efficient Programming in S and a more recent Bioconductor workshop Advanced R for Bioinformatics (sorry can't post hyperlink) includes a lab on writing an R+C algorithm. More like this, or other suggestions would be appreciated.

Heather Turner asked Sep 15 '09 15:09


3 Answers

That is a very interesting question. As it happens, I had learned C and C++ before moving to R so that may have made it "easier" for me to add C/C++ to R.

But even with that, I would be among the first to say that adding pure C to R is hellishly complicated because of the different macros and R-internals at the C level that you need to learn.

Which leads me to my favorite argument: Use an additional abstraction layer such as the Rcpp package. It hides a lot of the nasty details. And I hope that you don't need to know a lot of C++ to make use of it. One example of a package using it is the small earthmovdist package on R-Forge which uses Rcpp wrapper classes to interface one particular metric.

Edit 1: For example, see the main function of earthmovdist here which should hopefully be easy enough to read, possibly with the (short) Rcpp wrapper classes package manual at one's side.

Edit 2: Three quick reasons why I consider C++ to be more appropriate and R-alike:

  • using Rcpp wrapper classes means you never have to use PROTECT and UNPROTECT, which are a frequent source of errors and heap corruption if the calls are not matched.

  • using Rcpp together with STL container classes such as vector means you never have to explicitly call malloc() / free() or new / delete, which removes another frequent source of errors.

  • Rcpp allows you to wrap everything in try / catch blocks at the C++ level and reports the exception back to R, so there are no sudden segfaults and program deaths.

That said, choice of language is a very personal decision, and many users are of course perfectly happy with the lower-level interface between C and R.

Dirk Eddelbuettel answered Sep 30 '22 20:09


I have struggled with this issue as well.

If the issue is to improve command of C, there are plenty of book lists on the subject. They all start with K&R. I enjoyed "Expert C Programming" by P. van der Linden and "C Primer Plus" by S. Prata. Any reference on the C standard library works.

If the issue is to interface C to R, other than the aforementioned official R document, you can check out this Harvard course, and this quick start guide. I have only passed scalars and arrays to C, and honestly wouldn't know how to interface complex data structures.
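
For readers who have not yet passed scalars and arrays to C, a minimal sketch of that case follows. It assumes R's `.C()` interface, where every argument arrives as a pointer (even a scalar) and the function returns `void`, so results are written back through an argument. The function name `weighted_sum` is hypothetical.

```c
/*
 * Hypothetical function written for R's .C() calling convention:
 * all arguments are pointers and the return type is void.
 *   x, w   : numeric vectors of length *n
 *   n      : pointer to the vector length
 *   result : pointer to a single double that receives the answer
 */
void weighted_sum(double *x, double *w, int *n, double *result)
{
    double total = 0.0;
    for (int i = 0; i < *n; i++) {
        total += x[i] * w[i];
    }
    *result = total;
}
```

After building a shared library with `R CMD SHLIB` and loading it with `dyn.load()`, the call from R would look something like `.C("weighted_sum", as.double(x), as.double(w), as.integer(length(x)), result = double(1))$result`. The explicit `as.double()` and `as.integer()` coercions matter because `.C()` passes the vectors through to C with whatever storage type they have.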

If the issue is to interface C++ to R, or build C++ skills, I can't really answer as I don't use much C++. A good starting point for me was "C++: The Core Language" (O'Reilly). Very simple, primitive, but useful for people coming from C.

gappy answered Sep 30 '22 20:09


My primary recommendation is to look at other packages. Needless to say, not all packages use C code, so you will need to find examples that do. You can download the source code for all packages from CRAN, and in some instances you can also browse them on R-Forge. Some R projects are also maintained on Google Code or on sites like GitHub (for instance, ggplot2). You will find the C code in the "src" directory.

  • Here's an example of a src directory with some C source code on R-Forge for the "survival" package.
  • Here's an example with C source code on Google Code for the "rpostgresql" package.
  • Here's an example with C source code on GitHub for "rqtl".

In general, think about what you're trying to accomplish, and then look at packages that do similar things.

"The C Programming Language" is probably still the most widely used book, so you may want to have that on your bookshelf. The following free book is also a useful resource: http://publications.gbdirect.co.uk/c_book/

Shane answered Sep 30 '22 20:09