
Split column string elements within a row inside a dataframe

Tags: split, dataframe, r

I have a matrix (1000 x 2830) like this:

        9178    3574    3547
160     B_B     B_B      A_A
301     B_B     A_B      A_B
303     B_B     B_B      A_A
311     A_B     A_B      A_A
312     B_B     A_B      A_A
314     B_B     A_B      A_A

and I want to obtain the following (duplicating colnames and splitting each element of each column):

      9178   9178   3574   3574   3547   3547
160     B      B      B      B      A      A
301     B      B      A      B      A      B
303     B      B      B      B      A      A
311     A      B      A      B      A      A
312     B      B      A      B      A      A
314     B      B      A      B      A      A

I tried using strsplit but I got error messages because this is a matrix, not a string. Could you please provide some ideas for resolving this?

July asked Feb 19 '15 13:02


1 Answer

Here's an option using dplyr (for bind_cols) and tidyr (for separate_) together with lapply from base R. It assumes that your data is a data.frame (i.e. you might need to convert it to data.frame first):

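library(dplyr)  # bind_cols()
library(tidyr)  # separate_()

# For each column, split its values on "_" into two new columns named
# <column>_1 and <column>_2, then bind the resulting pieces side by side.
lapply(names(df), function(x) separate_(df[x], x, paste0(x, "_", 1:2), sep = "_")) %>%
  bind_cols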
#  X9178_1 X9178_2 X3574_1 X3574_2 X3547_1 X3547_2
#1       B       B       B       B       A       A
#2       B       B       A       B       A       B
#3       B       B       B       B       A       A
#4       A       B       A       B       A       A
#5       B       B       A       B       A       A
#6       B       B       A       B       A       A
talat answered Sep 28 '22 00:09
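
If the data has to stay a matrix, as in the question, a base R approach may also work. The following is only a minimal sketch, not part of the answer above. It assumes a character matrix in which every cell contains exactly one underscore; the names m, parts, odd, and out are illustrative, and the example values are copied from the question. If the data is a data.frame, it would presumably need as.matrix() first.

```r
# Example matrix mirroring the question (values copied from the post).
m <- matrix(
  c("B_B", "B_B", "B_B", "A_B", "B_B", "B_B",
    "B_B", "A_B", "B_B", "A_B", "A_B", "A_B",
    "A_A", "A_B", "A_A", "A_A", "A_A", "A_A"),
  nrow = 6,
  dimnames = list(c("160", "301", "303", "311", "312", "314"),
                  c("9178", "3574", "3547"))
)

# Flatten the matrix (column-major order) and split every cell on "_".
# Assumption: each cell holds exactly two parts.
parts <- strsplit(as.character(m), "_", fixed = TRUE)
first  <- vapply(parts, `[`, character(1), 1)
second <- vapply(parts, `[`, character(1), 2)

# Build the result with twice as many columns: the first halves go into
# the odd columns, the second halves into the even columns.
out <- matrix(NA_character_, nrow = nrow(m), ncol = 2 * ncol(m))
odd <- seq(1, by = 2, length.out = ncol(m))
out[, odd]     <- first
out[, odd + 1] <- second

# Duplicate each original column name and keep the row names.
dimnames(out) <- list(rownames(m), rep(colnames(m), each = 2))
out
```

A matrix allows duplicated column names, so out keeps the labels 9178, 9178, 3574, and so on, as requested in the question. Converting out to a data.frame would normally rename the duplicates unless check.names = FALSE is set.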