
Group by operation in R

I have a data set with millions of rows and I need to apply a 'group by' operation to it using R.

The data is of the form

V1 V2 V3
a  u  1
a  v  2
b  w  3
b  x  4
c  y  5
c  z  6

Using a 'group by' in R, I want to add up the values in column 3 and concatenate the values in column 2 within each group of column 1, like this:

V1 V2 V3
a uv 3
b wx 7
c yz 11

I tried doing the operation in Excel, but the data has too many rows for Excel to handle. I am new to R, so any help would be appreciated.

Sankalp asked Dec 01 '22 00:12



1 Answer

There are many ways to do this; here are two. Both group by V1, paste the V2 values together with an empty separator, and sum V3.

library(data.table)
setDT(df)[, .(V2 = paste(V2, collapse = ""), V3 = sum(V3)), by = V1]
#    V1 V2 V3
# 1:  a uv  3
# 2:  b wx  7
# 3:  c yz 11

Or

library(dplyr)
df %>%
  group_by(V1) %>%
  summarise(V2 = paste(V2, collapse = ""), V3 = sum(V3))

# Source: local data table [3 x 3]
# 
#   V1 V2 V3
# 1  a uv  3
# 2  b wx  7
# 3  c yz 11
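
If neither package is installed, the same grouping can probably be done in base R as well. The following is only a minimal sketch, not part of the two approaches above: it assumes the columns are plain character/integer vectors, aggregates V2 and V3 separately with aggregate(), and joins the two results with merge(). The names v2, v3 and res are made up for illustration, and on millions of rows this is likely to be slower than data.table or dplyr.

```r
# Rebuild the small example from the question (strings kept as character).
df <- data.frame(
  V1 = c("a", "a", "b", "b", "c", "c"),
  V2 = c("u", "v", "w", "x", "y", "z"),
  V3 = 1:6,
  stringsAsFactors = FALSE
)

# Concatenate V2 within each V1 group; collapse = "" is passed on to paste().
v2 <- aggregate(V2 ~ V1, data = df, FUN = paste, collapse = "")

# Sum V3 within each V1 group.
v3 <- aggregate(V3 ~ V1, data = df, FUN = sum)

# Join the two per-group results on the grouping column.
res <- merge(v2, v3, by = "V1")
print(res)
#   V1 V2 V3
# 1  a uv  3
# 2  b wx  7
# 3  c yz 11
```

Aggregating the columns separately keeps each step simple, since the two columns need different summary functions.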

Data

df <- structure(list(V1 = structure(c(1L, 1L, 2L, 2L, 3L, 3L), .Label = c("a", 
"b", "c"), class = "factor"), V2 = structure(1:6, .Label = c("u", 
"v", "w", "x", "y", "z"), class = "factor"), V3 = 1:6), .Names = c("V1", 
"V2", "V3"), class = "data.frame", row.names = c(NA, -6L))
David Arenburg answered Dec 05 '22 03:12
