This question follows from a previous question. Instead of having two columns, what if we have three or more columns? Consider the following data.
x <- c(600, 600, 600, 600, 600, 600, 600, 600, 600, 800, 800, 800, 800, 800, 800, 800, 800, 800,
600, 600, 600, 600, 600, 600, 600, 600, 600, 800, 800, 800, 800, 800, 800, 800, 800, 800,
600, 600, 600, 600, 600, 600, 600, 600, 600, 800, 800, 800, 800, 800, 800, 800, 800, 800)
y <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80, 80,
3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3)
z <- c(1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3,
1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3,
1, 2, 3, 1, 2, 3)
xyz <- data.frame(cbind(x, y, z))
Suppose we treat all columns as factors with a finite number of levels. What I want is the number of observations in each unique combination of x, y and z. The answer should be 18 unique combinations with 3 observations in each. How can I do this in R, please? Thank you!
Using table or tabulate with interaction
tabulate(with(xyz, interaction(x,y,z)))
table(with(xyz, interaction(x,y,z)))
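If the combination labels are wanted next to the counts, one possible sketch in base R is to cross-tabulate the data frame directly and flatten the result. The data is rebuilt with rep() only so that the block runs on its own, and the name counts is just illustrative.

```r
# Rebuild the question's data compactly (same 54 rows, same order)
x <- rep(rep(c(600, 800), each = 9), times = 3)
y <- rep(c(1, 80, 3), each = 18)
z <- rep(1:3, times = 18)
xyz <- data.frame(x, y, z)

# Cross-tabulate all three columns, then flatten to one row per combination;
# as.data.frame() on a table puts the counts in a column called Freq
counts <- as.data.frame(table(xyz))

# Keep only combinations that actually occur in the data
counts <- counts[counts$Freq > 0, ]

nrow(counts)          # number of unique combinations
unique(counts$Freq)   # distinct group sizes
```

Note that interaction() and table() both enumerate every possible combination of levels, so with other data some counts may be zero; the filter on Freq (or drop = TRUE in interaction()) removes those.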
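or split the row indices by the interaction and use lengths (splitting the data frame itself would make lengths return the number of columns in each piece, which only happens to be 3 here),
lengths(split(seq_len(nrow(xyz)), with(xyz, interaction(x,y,z))))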
or
aggregate(seq_along(x)~ x+y+z, data=xyz, FUN=length)
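With the formula above, the count column in the result is named after the left-hand side expression. A minimal variant, assuming a constant helper column (n is a hypothetical name, not part of the original data), sums that column per group and gives a cleaner header:

```r
# Rebuild the question's data compactly (same 54 rows, same order)
x <- rep(rep(c(600, 800), each = 9), times = 3)
y <- rep(c(1, 80, 3), each = 18)
z <- rep(1:3, times = 18)
xyz <- data.frame(x, y, z)

# Add a helper column of ones and sum it within each x/y/z combination
res <- aggregate(n ~ x + y + z, data = transform(xyz, n = 1L), FUN = sum)

head(res)   # one row per combination, with the count in column n
```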
An option using data.table: we convert the 'data.frame' to a 'data.table' with setDT(xyz), group by all the columns of 'xyz', and get the number of rows in each group (.N).
library(data.table)
setDT(xyz)[, .N, names(xyz)]$N
#[1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
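Or with dplyr, we group by all the columns and get the number of rows in each group (n()) using summarise. This uses group_by(across(everything())), which needs dplyr 1.0.0 or later; group_by_(.dots = names(xyz)) is the older, deprecated form.
library(dplyr)
xyz %>%
group_by(across(everything())) %>%
summarise(n = n(), .groups = "drop") %>%
pull(n)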
#[1] 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3