
reshape vs. reshape2 in R

I am trying to understand why development shifted from the reshape package to reshape2. The two seem functionally the same, but I cannot upgrade to reshape2 at the moment because the server runs an older version of R. I am concerned that a major bug may have pushed development to a whole new package instead of simply continuing with reshape. Does anyone know if there is a major flaw in the reshape package?

Alex asked Sep 11 '12 20:09




1 Answer

reshape2 let Hadley make a rebooted reshape that was way, way faster, while avoiding busting up people's dependencies and habits.

From the release announcement (https://stat.ethz.ch/pipermail/r-packages/2010/001169.html):

Reshape2 is a reboot of the reshape package. It's been over five years since the first release of the package, and in that time I've learned a tremendous amount about R programming, and how to work with data in R. Reshape2 uses that knowledge to make a new package for reshaping data that is much more focussed and much much faster.

This version improves speed at the cost of functionality, so I have renamed it to reshape2 to avoid causing problems for existing users. Based on user feedback I may reintroduce some of these features.

What's new in reshape2:

  • considerably faster and more memory efficient thanks to a much better underlying algorithm that uses the power and speed of subsetting to the fullest extent, in most cases only making a single copy of the data.

  • cast is replaced by two functions depending on the output type: dcast produces data frames, and acast produces matrices/arrays.

  • multidimensional margins are now possible: grand_row and grand_col have been dropped: now the name of the margin refers to the variable that has its value set to (all).

  • some features have been removed such as the | cast operator, and the ability to return multiple values from an aggregation function. I'm reasonably sure both these operations are better performed by plyr.

  • a new cast syntax which allows you to reshape based on functions
    of variables (based on the same underlying syntax as plyr):

  • better development practices like namespaces and tests.
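
To make the dcast/acast split above concrete, here is a minimal sketch (not part of the announcement). It assumes reshape2 is installed; the data frame `wide` and its column names are made up for illustration.

```r
library(reshape2)

# Hypothetical wide-format data: one row per id, one column per measurement
wide <- data.frame(
  id    = c(1, 2, 3),
  group = c("a", "a", "b"),
  x     = c(10, 20, 30),
  y     = c(1, 2, 3)
)

# melt(): wide -> long, one row per (id, group, variable) combination
long <- melt(wide, id.vars = c("id", "group"))

# dcast(): long -> data frame (the "d" stands for data frame)
df <- dcast(long, id + group ~ variable, value.var = "value")

# acast(): long -> matrix/array (the "a" stands for array)
m <- acast(long, id ~ variable, value.var = "value")

# Aggregating while casting: sum each variable within each group
agg <- dcast(long, group ~ variable, fun.aggregate = sum, value.var = "value")
```

In the old reshape package a single cast() covered both output types; in reshape2 you choose dcast or acast depending on whether you want a data frame or a matrix back.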

Matt Parker answered Sep 22 '22 15:09