
Manipulate string in column of a dataframe

Tags: dataframe, r

I have a data frame

a = data.frame("a" = c("aaa|abbb", "bbb|aaa", "bbb|aaa|ccc"), "b" = c(1,2,3))

     a       b
 aaa|abbb    1
 bbb|aaa     2
 bbb|aaa|ccc 3

I want to split each value of column a on "|", sort the pieces, and paste them back together so the result looks like this:

     a       b
 aaa|abbb    1
 aaa|bbb     2
 aaa|bbb|ccc 3

I tried the following:

paste(sort(ignore.case(unlist(strsplit(as.character(a$a), "\\|")))),collapse = ", ")

but that just combines everything into a single string. How can I apply it to each value of column a separately and get the result back as a data frame? I also tried lapply, but still got the same result: one combined list.

user1631306 asked Mar 06 '19 15:03




1 Answer

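We could use separate_rows() from tidyr to split the values in 'a' into one row per piece, then group by 'b', sort 'a' within each group, and paste the elements back together with "|". The final select() restores the original column order.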

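library(tidyverse)
a %>%
 # one row per piece; the default sep splits on any run of non-alphanumeric characters, which includes "|"
 separate_rows(a) %>%
 # 'b' identifies the original row
 group_by(b) %>%
 # sort the pieces within each row and rejoin them
 summarise(a = paste(sort(a), collapse="|")) %>%
 # names(a) is evaluated on the original data frame a, so this restores the column order a, b
 select(names(a))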
# A tibble: 3 x 2
#  a               b
#  <chr>       <dbl>
#1 aaa|abbb        1
#2 aaa|bbb         2
#3 aaa|bbb|ccc     3
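
For comparison, here is a minimal base R sketch of the same idea, assuming the pieces should be split on a literal "|" and sorted with R's default sort(). The original attempt collapsed everything into one string because unlist() flattens the per-row pieces before they are sorted and pasted; applying the split/sort/paste steps to each string separately avoids that. The helper name sort_parts is hypothetical, introduced only for illustration.

```r
a <- data.frame(a = c("aaa|abbb", "bbb|aaa", "bbb|aaa|ccc"), b = c(1, 2, 3))

# Hypothetical helper: split one string on a literal "|",
# sort the pieces, and join them back with the same separator.
sort_parts <- function(x, sep = "|") {
  parts <- strsplit(x, sep, fixed = TRUE)[[1]]
  paste(sort(parts), collapse = sep)
}

# Apply the helper to each value of column a; vapply guarantees
# exactly one character string per row.
a$a <- vapply(as.character(a$a), sort_parts, character(1), USE.NAMES = FALSE)

a$a
# "aaa|abbb" "aaa|bbb" "aaa|bbb|ccc"
```

Because the column is modified in place, 'b' and the row order are left untouched and no regrouping is needed. Note that sort() on character vectors follows the current locale's collation rules, so mixed-case data may order differently across systems.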
akrun answered Oct 04 '22 20:10