I want to summarise several variables in a data.table and get the output in wide format, possibly as a list with one element per variable. Since several other approaches did not work, I tried an outer lapply over the variable names, given as character strings, and wanted to pass these in using with=FALSE.
library(data.table)
carsx = as.data.table(cars)
# fails with: object 'ansvals' not found
lapply(list(speed = "speed", dist = "dist"),
       function(x) carsx[, list(mean(x), min(x), max(x)), with = FALSE])
Since this does not work, I tried a simpler approach without lapply.
carsx[,list(mean("speed"), min("speed"), max("speed") ), with=FALSE ] #error object 'ansvals' not found
This does not work either. Is there any way to do something like this? Is this behaviour of with intended? (I am aware that ?data.table describes with only as a way to select columns, but in my case it would be useful to be able to transform them as well.)
When with=FALSE, j is a vector of names or positions to select, similar to a data.frame. with=FALSE is often useful in data.table to select columns dynamically.
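To illustrate the quoted behaviour, here is a minimal sketch of dynamic column selection. The variable names cols and col are hypothetical and only used for illustration. It also shows one possible way to compute on a column whose name is held in a string, using get() inside j; this is an alternative to with=FALSE, not something with=FALSE itself does.

```r
library(data.table)

carsx <- as.data.table(cars)

# Hypothetical character vector holding the column names to select
cols <- c("speed", "dist")

# with = FALSE: j is treated as a vector of column names to select
sel1 <- carsx[, cols, with = FALSE]

# The .. prefix is an equivalent shorthand in recent data.table versions
sel2 <- carsx[, ..cols]

# To compute on a column named by a string, look the column up with get()
col <- "speed"
m <- carsx[, mean(get(col))]
```

Note that mean("speed") is handed the literal string "speed", not the column, which is why passing quoted names into summary functions does not give the intended result.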
EDIT: My aim is to get a summary per group in wide format for several variables. I tried to extend the following, which works for only one variable, to a list of variables.
carsx[, list(mean(speed), min(speed), max(speed)), by = (dist > 50)]
Unfortunately, SO doesn't let me post my other question. There I described that I want output similar to:
lapply( list(speed="speed",dist= "dist"),
function(x) do.call("as.data.frame", aggregate(cars[,x], list(class=cars$dist>50), FUN=summary) ) )
Expected Output would be something like:
$speed
V1 V2 V3
1: FALSE 12.96970 4 20
2: TRUE 20.11765 14 25
$dist
V1 V2 V3
1: FALSE 12.96970 4 20
2: TRUE 20.11765 14 25
You can specify the columns with the .SDcols parameter:
carsx[ , lapply(.SD, function(x) c(mean(x), min(x), max(x))),
.SDcols = c("speed", "dist")]
# speed dist
# 1: 15.4 42.98
# 2: 4.0 2.00
# 3: 25.0 120.00
carsx[ , lapply(.SD, function(x) c(mean(x), min(x), max(x))),
.SDcols = "speed"]
# speed
# 1: 15.4
# 2: 4.0
# 3: 25.0
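The question also asks for one wide summary per variable, grouped by dist > 50. A minimal sketch of how the .SDcols idea might be combined with an outer lapply follows. The names vars, res, and over50 are assumptions made for this example and do not come from the original posts.

```r
library(data.table)

carsx <- as.data.table(cars)

# Hypothetical named vector of the variables to summarise
vars <- c(speed = "speed", dist = "dist")

# One wide summary table per variable, grouped by dist > 50.
# .SDcols = v restricts .SD to the single column named by v,
# so .SD[[1L]] is that column within each group.
res <- lapply(vars, function(v) {
  carsx[, {
    x <- .SD[[1L]]
    list(mean = mean(x), min = min(x), max = max(x))
  }, by = list(over50 = dist > 50), .SDcols = v]
})

res$speed
res$dist
```

Each element of res has one row per group and one column per statistic, which matches the shape of the expected output in the question, with named columns instead of V1, V2, V3.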