I am trying to understand why development shifted from the reshape package to the reshape2 package. They seem to be functionally the same; however, I am currently unable to upgrade to reshape2 because the server runs an older version of R. I am concerned that a major bug might have shifted development to a whole new package instead of simply continuing development of reshape. Does anyone know if there is a major flaw in the reshape package?
reshape2 is an R package written by Hadley Wickham that makes it easy to transform data between wide and long formats.
reshape2 is a package that allows us to easily transform our data into whatever structure we may need. Many of us are used to seeing our data structured so that each row corresponds to a single participant and each column corresponds to a variable. This type of data structure is known as wide format.
The dcast formula takes the form LHS ~ RHS, e.g. var1 + var2 ~ var3. The order of entries in the formula is essential. There are two special variables, "." and "...": "." represents no variable, and "..." represents all variables not otherwise mentioned in the formula. LHS variable values will be in rows, and RHS variable values will be in columns.
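The formula rules above can be sketched with a small example. The data frame `long` and its column names are hypothetical, chosen only for illustration, and the sketch assumes reshape2 is installed.

```r
library(reshape2)

# Hypothetical long-format data: one row per (id, variable) measurement.
long <- data.frame(
  id = c(1, 1, 2, 2),
  variable = c("height", "weight", "height", "weight"),
  value = c(170, 65, 180, 80)
)

# LHS ~ RHS: one row per id, one column per level of variable.
wide <- dcast(long, id ~ variable, value.var = "value")

# "..." stands for every variable not otherwise mentioned
# (here that is just id, since value is the value column).
all_rest <- dcast(long, ... ~ variable, value.var = "value")

# "." stands for no variable: collapse everything on the RHS,
# so an aggregation function is needed to combine the values.
overall <- dcast(long, variable ~ ., value.var = "value",
                 fun.aggregate = mean)
```

Swapping the two sides of the formula (variable ~ id) would put the variables in rows and the ids in columns, which is why the order of entries matters.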
reshape2 let Hadley make a rebooted reshape that was way, way faster, while avoiding busting up people's dependencies and habits.
https://stat.ethz.ch/pipermail/r-packages/2010/001169.html
Reshape2 is a reboot of the reshape package. It's been over five years since the first release of the package, and in that time I've learned a tremendous amount about R programming, and how to work with data in R. Reshape2 uses that knowledge to make a new package for reshaping data that is much more focussed and much much faster.
This version improves speed at the cost of functionality, so I have renamed it to reshape2 to avoid causing problems for existing users. Based on user feedback I may reintroduce some of these features.

What's new in reshape2:

* considerably faster and more memory efficient thanks to a much better underlying algorithm that uses the power and speed of subsetting to the fullest extent, in most cases only making a single copy of the data.
* cast is replaced by two functions depending on the output type: dcast produces data frames, and acast produces matrices/arrays.
* multidimensional margins are now possible: grand_row and grand_col have been dropped: now the name of the margin refers to the variable that has its value set to (all).
* some features have been removed such as the | cast operator, and the ability to return multiple values from an aggregation function. I'm reasonably sure both these operations are better performed by plyr.
* a new cast syntax which allows you to reshape based on functions of variables (based on the same underlying syntax as plyr).
* better development practices like namespaces and tests.
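
As a rough illustration of the dcast/acast split mentioned in the announcement, the sketch below casts the same data both ways. The data frame `long` and its column names are made up for this example and do not come from the announcement.

```r
library(reshape2)

# Hypothetical long-format data, for illustration only.
long <- data.frame(
  id = c("a", "a", "b", "b"),
  variable = c("x", "y", "x", "y"),
  value = c(1, 2, 3, 4)
)

# dcast returns a data frame with id as an ordinary column.
as_df <- dcast(long, id ~ variable, value.var = "value")

# acast returns a matrix, with id moved into the row names.
as_mat <- acast(long, id ~ variable, value.var = "value")

is.data.frame(as_df)
is.matrix(as_mat)
```

Code written against the older cast function would need to pick one of the two depending on which output type it expects.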