David Arenburg

David Arenburg has asked 26 questions and found answers to 538 problems.

Stats

EtPoint: 15.5k
Vote count: 5.3k
Questions: 26
Answers: 538

About

Some rules of thumb for new R users (according to me):

  1. If you are working with data.frames, forget there is a function called apply. Whatever you do, don't use it, especially with a margin of 1. The only good use case for this function is operating over matrix columns (a margin of 2).

    • Some good alternatives: ?do.call, ?pmax/pmin, ?max.col, ?rowSums/rowMeans/etc., the awesome matrixStats package (for matrices), ?rowsum and many more.
  2. For loops are not bad; don't listen to anyone who says otherwise. They are bad only in certain cases:

    • If you use them to iterate over rows.
    • If you perform an unvectorized or inefficient operation within each iteration.
    • If you write a loop for something that is already vectorized.
  3. R is a vectorized language, meaning many operations are already implemented as loops in C, so don't reinvent the wheel by writing R loops for something that already exists. One caveat: many of these functions work only on matrices. Hence, if you have a data.frame, think twice about whether you want to convert it to a matrix (you may run into unexpected consequences, such as every column being coerced to a single type) or whether you can avoid the conversion.

  4. Learn base R before you learn any fancy packages such as dplyr. It is a nice package and all, but it was designed for very specific things. Many operations can be done much more efficiently in base R.

  5. Get familiar with R classes. Learn what a factor is and how to use it. Know the difference between a matrix (a vector with a dim attribute) and a data.frame (a list of vectors). Learn how and when to work with lists or arrays. Know the difference between numeric and integer. Read about floating point arithmetic.

  6. Learn how and when to use lapply/sapply/vapply; these come in handy many times.

  7. You must learn some ?regex. Must.

  8. Read ?S4groupGeneric in order to discover which functions have data.frame methods (very useful to know).

  9. Learn about ?methods

  10. Read ?strptime very carefully (note the Sys.setlocale("LC_TIME", "C") part - could be a life saver).

  11. Read the damn docs. R has awesome documentation, so please use it. You won't find anything nearly as good in any other language (that I know of).

As Barry Rowlingson once said: "This is all documented in TFM. Those who WTFM don't want to have to WTFM again on the mailing list. RTFM."
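
A minimal sketch for rule 1, assuming a small made-up data frame (df and its column names are purely illustrative). It shows why apply with a margin of 1 is risky on a data.frame, since the input is converted to a matrix first, and what a few of the listed alternatives look like.

```r
# Hypothetical data: two numeric columns and one character column
df <- data.frame(a = c(1, 5, 3), b = c(4, 2, 6), id = c("x", "y", "z"),
                 stringsAsFactors = FALSE)

# apply() converts the data.frame with as.matrix() first; a single
# character column is enough to turn every value into a character string
row_classes <- apply(df, 1, class)

# Keep only the numeric columns for the row-wise summaries
num <- df[c("a", "b")]

row_sum  <- rowSums(num)               # row totals
row_max  <- do.call(pmax, num)         # row-wise maximum, column by column
which_mx <- names(num)[max.col(num)]   # column holding each row's maximum
```

Here do.call(pmax, num) works because a data.frame is a list of columns, so each column is passed to pmax as a separate argument.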
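
A rough illustration of rules 2 and 3, using made-up vectors. The first loop reinvents something that is already vectorized; the second is a loop that is perfectly fine, because each step depends on the previous one and the result is pre-allocated.

```r
set.seed(42)
x <- runif(1e4)

# Reinventing the wheel: an R-level loop for something already vectorized
total_loop <- 0
for (xi in x) total_loop <- total_loop + xi

# The built-in runs the same loop in C
total_vec <- sum(x)

# A reasonable loop: every element depends on the two before it,
# and the result vector is allocated once instead of being grown
n <- 10
fib <- numeric(n)
fib[1:2] <- 1
for (i in 3:n) fib[i] <- fib[i - 1] + fib[i - 2]
```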
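
The distinctions in rule 5 can be checked directly at the console. This is only a sketch with made-up objects (m, df and f are illustrative names), not a full tour of R's type system.

```r
m  <- matrix(1:6, nrow = 2)
df <- data.frame(a = 1:2, b = c("x", "y"))

# A matrix is an atomic vector with a dim attribute
m_attrs    <- names(attributes(m))    # "dim"
m_is_list  <- is.list(m)              # FALSE

# A data.frame is a list of equal-length columns
df_is_list <- is.list(df)             # TRUE

# Converting a mixed data.frame to a matrix coerces everything to one type
mat_type <- typeof(as.matrix(df))     # "character"

# integer versus numeric (double)
int_class <- class(1L)                # "integer"
num_class <- class(1)                 # "numeric"

# Floating point: compare with a tolerance, not with ==
exact_equal <- 0.1 + 0.2 == 0.3                    # FALSE
near_equal  <- isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE

# Factors store integer codes; go through character to recover the numbers
f      <- factor(c("10", "20", "10"))
codes  <- as.integer(f)                  # 1 2 1
values <- as.numeric(as.character(f))    # 10 20 10
```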
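
For rule 6, a short sketch of how the three functions differ, using a made-up list called cols. The main point is that lapply always returns a list, sapply guesses an output shape, and vapply makes you state it up front.

```r
cols <- list(a = 1:3, b = c(2.5, 4), c = numeric(0))

# lapply always returns a list, one element per input
lens_list <- lapply(cols, length)

# sapply simplifies when it can (here: a named numeric vector)
means <- sapply(cols, mean)

# vapply checks each result against a template, so the output type is guaranteed
lens <- vapply(cols, length, integer(1))

# The difference shows on empty input: sapply falls back to an empty list,
# while vapply still returns the declared type
empty_sapply <- sapply(list(), length)               # list()
empty_vapply <- vapply(list(), length, integer(1))   # integer(0)
```

This is why vapply is usually the safer choice inside functions, where the input may turn out to be empty.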
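
As a small taste of rule 7, here is a sketch of three everyday regex tasks in base R: detecting, capturing, and extracting. The strings in x are invented for the example.

```r
x <- c("id_001_2015-03-15", "id_042_2016-11-02", "no match here")

# Detect: does the string end in something shaped like a date?
has_date <- grepl("[0-9]{4}-[0-9]{2}-[0-9]{2}$", x)

# Capture: keep only the digits between the underscores
ids <- sub("^id_([0-9]+)_.*$", "\\1", x[has_date])

# Extract: pull out the first date-like substring (non-matches are dropped)
dates <- regmatches(x, regexpr("[0-9]{4}-[0-9]{2}-[0-9]{2}", x))
```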
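
Rules 8 and 9 in practice, sketched with a made-up numeric data frame. Arithmetic, math functions and summaries work on a data.frame directly because it has group-generic methods; the S3 side of these groups is also described in ?groupGeneric.

```r
df <- data.frame(a = c(1, 4), b = c(9, 16))

doubled <- df * 2      # dispatches to Ops.data.frame, returns a data.frame
roots   <- sqrt(df)    # dispatches to Math.data.frame, returns a data.frame
biggest <- max(df)     # dispatches to Summary.data.frame, returns 16

# List every S3 method registered for data.frames
df_methods <- methods(class = "data.frame")

# Fetch one specific method to read its source
ops_method <- getS3method("Ops", "data.frame")
```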
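
Finally, a sketch of the locale point from rule 10. Month and day names such as "Mar" are matched according to the current LC_TIME locale, so parsing English abbreviations can silently return NA on a machine set to another language. The helper parse_english_date is a hypothetical name invented for this example; it switches to the "C" locale only for the duration of the call.

```r
# Hypothetical helper: parse dates with English month names regardless of
# the machine's locale, restoring the original setting afterwards
parse_english_date <- function(x, format) {
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)
  Sys.setlocale("LC_TIME", "C")
  as.Date(strptime(x, format = format, tz = "UTC"))
}

d <- parse_english_date("15Mar2015", "%d%b%Y")
```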