
R: Merge of rows in same data table, concatenating certain columns

I have a data table in R. I want to merge the rows that share the same customerID and, for each merged row, concatenate the values of the other columns.

I want to go from this:

   title  author customerID
1 title1 author1          1
2 title2 author2          2
3 title3 author3          1

to this:

           title           author Group.1
1 title1, title3 author1, author3       1
2         title2          author2       2
Harry Palmer asked Jul 06 '12 15:07



2 Answers

The aggregate function should help you in finding a solution:

dat = data.frame(title = c("title1", "title2", "title3"),
                 author = c("author1", "author2", "author3"),
                 customerID = c(1, 2, 1))
aggregate(dat[-3], by=list(dat$customerID), c)
#   Group.1 title author
# 1       1  1, 3   1, 3
# 2       2     2      2

The output above shows integer codes rather than text because the strings were converted to factors, which was the default behaviour of data.frame() before R 4.0.0. To get the text itself, add stringsAsFactors = FALSE when you create the data frame. If your data are already factors, you can convert them to character first with something like dat[c(1, 2)] = apply(dat[-3], 2, as.character), then:

aggregate(dat[-3], by=list(dat$customerID), c)
#   Group.1          title           author
# 1       1 title1, title3 author1, author3
# 2       2         title2          author2
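
Note that with c as the aggregating function, the grouped columns hold vectors rather than single strings. If one comma-separated string per cell is wanted, as in the question, a minimal sketch is to aggregate with toString instead. The name merged and the customerID label for the grouping column are choices made for this illustration, not part of the answer above:

```r
# Build the example data with plain character columns.
dat <- data.frame(title = c("title1", "title2", "title3"),
                  author = c("author1", "author2", "author3"),
                  customerID = c(1, 2, 1),
                  stringsAsFactors = FALSE)

# toString(x) is paste(x, collapse = ", "), so each group
# collapses to a single string per column.
merged <- aggregate(dat[c("title", "author")],
                    by = list(customerID = dat$customerID),
                    FUN = toString)

print(merged)
```

Naming the element of the by list keeps the grouping column labelled customerID instead of Group.1.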
A5C1D2H2I1M1N2O1R2T1 answered Sep 27 '22 23:09


Maybe not the best solution but easy to understand:

df <- data.frame(author=LETTERS[1:5], title=LETTERS[1:5], id=c(1, 2, 1, 2, 3), stringsAsFactors=FALSE)

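# One output row per distinct id, in order of first appearance.
uniqueIds <- unique(df$id)

# Reuse the first rows of df as a template with the right columns and types.
# seq_along() avoids the 1:0 pitfall when there are no ids at all.
mergedDf <- df[seq_along(uniqueIds), ]

for (i in seq_along(uniqueIds)) {
    rows <- df$id == uniqueIds[i]  # rows belonging to the current id
    mergedDf[i, "id"] <- uniqueIds[i]
    mergedDf[i, "author"] <- paste(df[rows, "author"], collapse=",")
    mergedDf[i, "title"] <- paste(df[rows, "title"], collapse=",")
}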

mergedDf
#  author title id
#1    A,C   A,C  1
#2    B,D   B,D  2
#3      E     E  3
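
The same grouping can be written without an explicit loop. The following is only a sketch in base R using split() and vapply(); the helper collapse_by_id and the name merged are hypothetical names introduced here for illustration. Unlike the loop, it orders the result by sorted id rather than by first appearance:

```r
df <- data.frame(author = LETTERS[1:5], title = LETTERS[1:5],
                 id = c(1, 2, 1, 2, 3), stringsAsFactors = FALSE)

# Hypothetical helper: split x into one piece per id, then paste each
# piece into a single comma-separated string.
collapse_by_id <- function(x, id) {
  unname(vapply(split(x, id), paste, character(1), collapse = ","))
}

# split() orders groups by sorted id, so sort the ids the same way.
merged <- data.frame(author = collapse_by_id(df$author, df$id),
                     title  = collapse_by_id(df$title, df$id),
                     id     = sort(unique(df$id)),
                     stringsAsFactors = FALSE)

print(merged)
```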
sgibb answered Sep 28 '22 01:09