
Concatenating all rows within a group using dplyr

Tags: r, dplyr, tidyr

Suppose I have a dataframe like this:

hand_id card_id card_name card_class
A       1       p          alpha
A       2       q          beta
A       3       r          theta
B       2       q          beta
B       3       r          theta
B       4       s          gamma
C       1       p          alpha
C       2       q          beta 

I would like to concatenate the card_id, card_name, and card_class values into a single row for each hand_id level (A, B, C). So the result would look something like this:

hand_id  combo_1  combo_2  combo_3
A        1-2-3    p-q-r    alpha-beta-theta
B        2-3-4    q-r-s    beta-theta-gamma
....

I attempted to do this using group_by and mutate, but I can't seem to get it to work:

    data <- read_csv('data.csv')
    byHand <- group_by(data, hand_id) %>%
      mutate(combo_1 = paste(card_id), 
             combo_2 = paste(card_name),
             combo_3 = paste(card_class))

Thank you for your help.

asked Oct 14 '16 01:10 by user7016618



2 Answers

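You were kind of close! Use summarise instead of mutate, so that each group is reduced to one row, and give paste a collapse argument, so that each group's values are joined into a single string: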

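    library(readr)  # provides read_csv()
    library(dplyr)  # provides group_by(), summarise() and %>%

    data <- read_csv('data.csv')
    # summarise() returns one row per group; collapse = "-" joins each
    # group's values into a single string
    byHand <- group_by(data, hand_id) %>%
      summarise(combo_1 = paste(card_id, collapse = "-"),
                combo_2 = paste(card_name, collapse = "-"),
                combo_3 = paste(card_class, collapse = "-"))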
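For anyone without data.csv at hand, here is a minimal reproducible sketch of the same approach. The inline tibble is a stand-in typed in from the table in the question, not the asker's actual file. It also shows why the mutate attempt did not collapse anything: paste without collapse works element by element and returns one string per row.

```r
library(dplyr)

# Stand-in for data.csv, typed in from the table in the question
data <- tibble(
  hand_id    = c("A", "A", "A", "B", "B", "B", "C", "C"),
  card_id    = c(1, 2, 3, 2, 3, 4, 1, 2),
  card_name  = c("p", "q", "r", "q", "r", "s", "p", "q"),
  card_class = c("alpha", "beta", "theta", "beta", "theta", "gamma",
                 "alpha", "beta")
)

# Without collapse, paste() is element-wise: three inputs, three strings
per_row <- paste(c(1, 2, 3))
# With collapse, the whole vector is joined into a single string
joined <- paste(c(1, 2, 3), collapse = "-")

# One output row per hand_id, each column joined with "-"
byHand <- data %>%
  group_by(hand_id) %>%
  summarise(combo_1 = paste(card_id, collapse = "-"),
            combo_2 = paste(card_name, collapse = "-"),
            combo_3 = paste(card_class, collapse = "-"))
```

Printing byHand should give one row each for A, B and C.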

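or, to apply the same paste call to every non-grouping column at once, using summarise_each (deprecated in later dplyr releases, along with funs). The columns keep their original names rather than combo_1, combo_2, combo_3:

    byHand <- group_by(data, hand_id) %>%
      summarise_each(funs(paste(., collapse = "-")))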
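In dplyr 1.0.0 and later, across() covers the same ground as summarise_each(). The following is a sketch of that variant, not part of the original answer; it assumes dplyr 1.0.0 or newer, and the inline tibble again stands in for data.csv.

```r
library(dplyr)

# Stand-in for data.csv, typed in from the table in the question
data <- tibble(
  hand_id    = c("A", "A", "A", "B", "B", "B", "C", "C"),
  card_id    = c(1, 2, 3, 2, 3, 4, 1, 2),
  card_name  = c("p", "q", "r", "q", "r", "s", "p", "q"),
  card_class = c("alpha", "beta", "theta", "beta", "theta", "gamma",
                 "alpha", "beta")
)

# across(everything(), ...) applies the function to every column except
# the grouping column hand_id
byHand <- data %>%
  group_by(hand_id) %>%
  summarise(across(everything(), ~ paste(.x, collapse = "-")))
```

As with summarise_each, the result columns are still called card_id, card_name and card_class.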
answered Nov 24 '22 18:11 by zacdav


Here is another option using data.table. Within each hand_id group, paste is applied with collapse = "-" to every remaining column (.SD):

    library(data.table)
    setDT(data)[, lapply(.SD, paste, collapse="-") , by = hand_id]
    #     hand_id card_id card_name       card_class
    #1:       A   1-2-3     p-q-r alpha-beta-theta
    #2:       B   2-3-4     q-r-s beta-theta-gamma
    #3:       C     1-2       p-q       alpha-beta
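Note that setDT converts data to a data.table in place. If that is not wanted, a variant along the following lines may help. It is only a sketch: the names dt, cols and res are illustrative, the table is typed in from the question rather than read from data.csv, and the final setnames call is added here to get the combo_* column names the question asked for.

```r
library(data.table)

# Stand-in for data.csv, typed in from the table in the question
dt <- data.table(
  hand_id    = c("A", "A", "A", "B", "B", "B", "C", "C"),
  card_id    = c(1, 2, 3, 2, 3, 4, 1, 2),
  card_name  = c("p", "q", "r", "q", "r", "s", "p", "q"),
  card_class = c("alpha", "beta", "theta", "beta", "theta", "gamma",
                 "alpha", "beta")
)

# Columns to collapse, listed explicitly via .SDcols
cols <- c("card_id", "card_name", "card_class")
res <- dt[, lapply(.SD, paste, collapse = "-"), by = hand_id, .SDcols = cols]

# Rename the collapsed columns to the names used in the question
setnames(res, cols, c("combo_1", "combo_2", "combo_3"))
```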
answered Nov 24 '22 18:11 by akrun