I have a data-set having millions of rows and i need to apply the 'group by' operation in it using R.
The data is of the form
V1 V2 V3
a u 1
a v 2
b w 3
b x 4
c y 5
c z 6
performing 'group by' using R, I want to add up the values in column 3 and concatenate the values in column 2 like
V1 V2 V3
a uv 3
b wx 7
c yz 11
I have tried doing the opertaion in excel but due to a lot of tuples i can't use excel. I am new to R so any help would be appreciated.
Many possible ways to solve, here are two
library(data.table)
setDT(df)[, .(V2 = paste(V2, collapse = ""), V3 = sum(V3)), by = V1]
# V1 V2 V3
# 1: a uv 3
# 2: b wx 7
# 3: c yz 11
Or
library(dplyr)
df %>%
group_by(V1) %>%
summarise(V2 = paste(V2, collapse = ""), V3 = sum(V3))
# Source: local data table [3 x 3]
#
# V1 V2 V3
# 1 a uv 3
# 2 b wx 7
# 3 c yz 11
Data
df <- structure(list(V1 = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("a",
"b", "c"), class = "factor"), V2 = structure(1:6, .Label = c("u",
"v", "w", "x", "y", "z"), class = "factor"), V3 = 1:6), .Names = c("V1",
"V2", "V3"), class = "data.frame", row.names = c(NA, -6L))
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With