I'm trying to use the daply function in the plyr package, but I cannot get it to output properly. Even though the variable that makes up the matrix is numeric, the elements of the matrix are lists, not the values themselves. Here is a small subset of the data as an example:
   Month Vehicle Samples
1 Oct-10   31057     256
2 Oct-10   31059     316
3 Oct-10   31060     348
4 Nov-10   31057     267
5 Nov-10   31059     293
6 Nov-10   31060     250
7 Dec-10   31057     159
8 Dec-10   31059     268
9 Dec-10   31060     206
And I would like to be able to visualize the data in a matrix format, which would look something like this:
       Month
Vehicle Oct-10 Nov-10 Dec-10
  31057    256    267    159
  31059    316    293    268
  31060    348    250    206
Here are two alternative calls that I have tried (the latter because my original data frame has more columns than I show here):
daply(DF, .(Vehicle, Month), identity)
daply(DF,.(Vehicle,Month), colwise(identity,.(Samples)))
However what I get instead is rather abstruse:
       Month
Vehicle Oct-10 Nov-10 Dec-10
  31057 List,3 List,3 List,3
  31059 List,3 List,3 List,3
  31060 List,3 List,3 List,3
I used the str function on the output, as some commenters have suggested, and here is an excerpt:
List of 9
$ :'data.frame': 1 obs. of 3 variables:
..$ Month : Ord.factor w/ 3 levels "Oct-10"<"Nov-10"<..: 1
..$ Vehicle: Factor w/ 3 levels "31057","31059",..: 1
..$ Samples: int 256
$ :'data.frame': 1 obs. of 3 variables:
..$ Month : Ord.factor w/ 3 levels "Oct-10"<"Nov-10"<..: 1
..$ Vehicle: Factor w/ 3 levels "31057","31059",..: 2
..$ Samples: int 316
What am I missing? Also, is there a way to do this simply with the base packages? Thanks!
Below is the dput output of the data frame if you'd like to reproduce this:
structure(list(Month = structure(c(1L, 1L, 1L, 2L, 2L, 2L, 3L,
3L, 3L), .Label = c("Oct-10", "Nov-10", "Dec-10"), class = c("ordered",
"factor")), Vehicle = structure(c(1L, 2L, 3L, 1L, 2L, 3L, 1L,
2L, 3L), .Label = c("31057", "31059", "31060"), class = "factor"),
Samples = c(256L, 316L, 348L, 267L, 293L, 250L, 159L, 268L,
206L)), .Names = c("Month", "Vehicle", "Samples"), class = "data.frame", row.names = c(NA,
9L))
The identity function isn't what you want here; from the help page, "All plyr functions use the same split-apply-combine strategy: they split the input into simpler pieces, apply .fun to each piece, and then combine the pieces into a single data structure." The simpler pieces in this case are subsets of the original data frame, one per unique Vehicle/Month combination; the identity function just returns each subset unchanged, and these subsets are then used to fill the resulting matrix.
That is, each element of the matrix you got is a data frame (which is a type of list) containing the rows for that Month/Vehicle combination.
> try1 <- daply(DF, .(Vehicle, Month), identity)
> try1[1,1]
[[1]]
Month Vehicle Samples
1 Oct-10 31057 256
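To see why the cells print as List,3, here is a rough base-R sketch that mimics the split step. It is only an approximation of what daply does, not plyr's actual internals; the names pieces, cells and nums are made up for illustration, and DF is rebuilt from the question's data.

```r
# Rebuild the question's data frame (same values as the dput output).
DF <- data.frame(
  Month   = factor(rep(c("Oct-10", "Nov-10", "Dec-10"), each = 3),
                   levels = c("Oct-10", "Nov-10", "Dec-10"), ordered = TRUE),
  Vehicle = factor(rep(c("31057", "31059", "31060"), times = 3)),
  Samples = c(256L, 316L, 348L, 267L, 293L, 250L, 159L, 268L, 206L)
)

# Split into one data frame per Vehicle/Month combination
# (Vehicle varies fastest, so the pieces fill a matrix column by column).
pieces <- split(DF, list(DF$Vehicle, DF$Month))

# Putting the pieces themselves into a matrix gives a list-matrix:
# every cell holds a whole one-row data frame, i.e. a list of 3 columns.
cells <- matrix(pieces, nrow = nlevels(DF$Vehicle),
                dimnames = list(Vehicle = levels(DF$Vehicle),
                                Month = levels(DF$Month)))

# Reducing each piece to a single number first gives a numeric matrix.
nums <- matrix(vapply(pieces, function(piece) piece$Samples, integer(1)),
               nrow = nlevels(DF$Vehicle), dimnames = dimnames(cells))
```

Here typeof(cells) is "list" while nums is an ordinary integer matrix, which is the difference between returning the whole piece and returning one value per piece.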
You instead want to use a function that just gets the Samples portion of that data frame, like this:
daply(DF, .(Vehicle, Month), function(x) x$Samples)
which results in
       Month
Vehicle Oct-10 Nov-10 Dec-10
  31057    256    267    159
  31059    316    293    268
  31060    348    250    206
A few alternative ways of doing this: with cast from the reshape package (which returns a data frame)
cast(DF, Vehicle~Month, value="Samples")
with the revised versions in reshape2; the first returns a data frame, the second a matrix
dcast(DF, Vehicle~Month, value.var="Samples")
acast(DF, Vehicle~Month, value.var="Samples")
with xtabs from the stats package
xtabs(Samples ~ Vehicle + Month, DF)
or by hand, which isn't hard at all using matrix indexing; almost all the code is just setting up the matrix.
with(DF, {
out <- matrix(nrow=nlevels(Vehicle), ncol=nlevels(Month),
dimnames=list(Vehicle=levels(Vehicle), Month=levels(Month)))
out[cbind(Vehicle, Month)] <- Samples
out
})
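Since the question asks for a base-only route, another option worth mentioning is tapply. This is a minimal sketch, assuming each Vehicle/Month pair occurs exactly once so that sum acts as a pass-through; with duplicate rows it would add them up, which may or may not be what you want. DF is rebuilt from the question's data and pivot is just an illustrative name.

```r
# Rebuild the question's data frame (same values as the dput output).
DF <- data.frame(
  Month   = factor(rep(c("Oct-10", "Nov-10", "Dec-10"), each = 3),
                   levels = c("Oct-10", "Nov-10", "Dec-10"), ordered = TRUE),
  Vehicle = factor(rep(c("31057", "31059", "31060"), times = 3)),
  Samples = c(256L, 316L, 348L, 267L, 293L, 250L, 159L, 268L, 206L)
)

# Cross-tabulate Samples by Vehicle (rows) and Month (columns).
# With one row per cell, sum() simply returns that row's value.
pivot <- tapply(DF$Samples, DF[c("Vehicle", "Month")], sum)
pivot
```

The result is a plain numeric matrix with Vehicle and Month as named dimnames, so it can be indexed by label, for example pivot["31059", "Dec-10"].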
The reshape function in the stats package can also be used to do this, but the syntax is difficult and I haven't used it once since learning cast and melt from the reshape package.
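For completeness, here is roughly what the stats::reshape call looks like for this data. Treat it as a sketch: wide and mat are illustrative names, DF is rebuilt from the question's data, and the final conversion to a matrix is an extra step of mine rather than anything reshape does for you.

```r
# Rebuild the question's data frame (same values as the dput output).
DF <- data.frame(
  Month   = factor(rep(c("Oct-10", "Nov-10", "Dec-10"), each = 3),
                   levels = c("Oct-10", "Nov-10", "Dec-10"), ordered = TRUE),
  Vehicle = factor(rep(c("31057", "31059", "31060"), times = 3)),
  Samples = c(256L, 316L, 348L, 267L, 293L, 250L, 159L, 268L, 206L)
)

# Long to wide: one row per Vehicle, one Samples column per Month.
wide <- reshape(DF, idvar = "Vehicle", timevar = "Month", direction = "wide")

# Drop the id column and use it as row names to get a numeric matrix.
mat <- as.matrix(wide[, -1])
rownames(mat) <- as.character(wide$Vehicle)
mat
```

The wide data frame keeps Vehicle as its first column and prefixes the value columns with the variable name, so some renaming is usually needed afterwards.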
If we take the OP at their word(s) in the title, then they may be looking for data.matrix(), which is a standard function in the base package and so is always available in R.
data.matrix() works by converting any factors to their numeric coding before converting the data frame to a matrix. Consider the following data frame:
dat <- data.frame(A = 1:10, B = factor(sample(c("X","Y"), 10, replace = TRUE)))
If we convert via as.matrix() we get a character matrix:
> head(as.matrix(dat))
A B
[1,] " 1" "X"
[2,] " 2" "X"
[3,] " 3" "Y"
[4,] " 4" "Y"
[5,] " 5" "Y"
[6,] " 6" "Y"
or, via matrix(), one gets a list with dimensions (a list array, as mentioned in the Value section of ?daply, by the way):
> head(matrix(dat))
[,1]
[1,] Integer,10
[2,] factor,10
> str(matrix(dat))
List of 2
$ : int [1:10] 1 2 3 4 5 6 7 8 9 10
$ : Factor w/ 2 levels "X","Y": 1 1 2 2 2 2 1 2 2 1
- attr(*, "dim")= int [1:2] 2 1
data.matrix(), however, does the intended thing:
> mat <- data.matrix(dat)
> head(mat)
A B
[1,] 1 1
[2,] 2 1
[3,] 3 2
[4,] 4 2
[5,] 5 2
[6,] 6 2
> str(mat)
int [1:10, 1:2] 1 2 3 4 5 6 7 8 9 10 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "A" "B"
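One caveat worth illustrating with the question's own kind of data: because data.matrix() uses the internal factor codes, a factor whose labels look like numbers (such as Vehicle) comes out as 1, 2, 3 rather than 31057, 31059, 31060. The sketch below uses a small hypothetical data frame veh to show this, and the usual as.numeric(as.character(...)) idiom for keeping the labels.

```r
# A factor whose labels happen to be numbers, as with Vehicle in the question.
veh <- data.frame(
  Vehicle = factor(c("31057", "31059", "31060")),
  Samples = c(256L, 316L, 348L)
)

# data.matrix() replaces the factor by its integer codes, not its labels.
codes <- data.matrix(veh)

# Converting the labels to numbers first preserves the vehicle IDs.
veh$Vehicle <- as.numeric(as.character(veh$Vehicle))
ids <- data.matrix(veh)
```

So codes[, "Vehicle"] holds 1, 2, 3, whereas ids[, "Vehicle"] holds the actual vehicle numbers.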