Split character string multiple times every two characters

I have a character column in my dataframe that looks like

df <- data.frame(a = c("AaBbCC", "AABBCC", "AAbbCC"))
df
#        a
# 1 AaBbCC
# 2 AABBCC
# 3 AAbbCC

I would like to split this column every two characters, so in this case I would like to obtain three columns named VA, VB and VC. I tried:

library(tidyr)
library(dplyr)
df <- data.frame(a = c("AaBbCC", "AABBCC", "AAbbCC")) %>%
  separate(a, c(paste("V", LETTERS[1:3], sep = "")), sep = c(2, 2))
#   VA VB   VC
# 1 Aa    BbCC
# 2 AA    BBCC
# 3 AA    bbCC

but this is not the desired result. I would like the content that is now in VC to be split into VB (all the B letters) and VC (all the C letters). How do I get R to split every two characters? The length of the string in the column is always the same for every row (6 in this example). I will also have strings of length >10.

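You were actually quite close. When sep is numeric, separate() interprets the values as character positions to split at, not as the widths of the pieces, so you need sep = c(2,4) instead of sep = c(2,2). Applied to the original df (the one that still has column a):

df <- separate(df, a, paste0("V", LETTERS[1:3]), sep = c(2, 4))

This gives: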

> df
  VA VB VC
1 Aa Bb CC
2 AA BB CC
3 AA bb CC
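
Since the question mentions strings longer than 10 characters, the split positions do not have to be typed by hand. The following is a minimal sketch, assuming every row has the same even length. The names df_long, cuts and new_names are made up for this illustration, and naming the columns with LETTERS only works for up to 26 pairs.

```r
library(tidyr)

# Hypothetical example data with 12-character strings
df_long <- data.frame(a = c("AaBbCCDdEEff", "AABBCCDDEEFF"),
                      stringsAsFactors = FALSE)

n <- nchar(df_long$a[1])        # assumed to be the same for every row
cuts <- seq(2, n - 2, by = 2)   # split positions: 2, 4, 6, 8, 10
new_names <- paste0("V", LETTERS[seq_len(n / 2)])

# One more column name than split positions, as separate() expects
res <- separate(df_long, a, into = new_names, sep = cuts)
res
```

For the 6-character strings in the question, the same code yields cuts equal to c(2, 4), which matches the call above.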

In base R you could do the following on the original df, which still has column a (borrowing from @rawr's comment):

l <- ave(as.character(df$a), FUN = function(x) strsplit(x, '(?<=..)', perl = TRUE))
df <- data.frame(do.call('rbind', l))

which gives:

> df
  X1 X2 X3
1 Aa Bb CC
2 AA BB CC
3 AA bb CC
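
A regex-free alternative in base R is to cut each string at computed start positions with substring(). This is only a sketch, assuming all strings have the same even length; starts, pieces and res are illustrative names that do not appear in the answers.

```r
df <- data.frame(a = c("AaBbCC", "AABBCC", "AAbbCC"),
                 stringsAsFactors = FALSE)

n <- nchar(df$a[1])            # assumed to be the same for every row
starts <- seq(1, n, by = 2)    # start of each pair: 1, 3, 5

# For every string, extract characters 1-2, 3-4 and 5-6,
# then stack the resulting vectors as rows of a matrix
pieces <- do.call(rbind,
                  lapply(df$a, substring, first = starts, last = starts + 1))

res <- as.data.frame(pieces, stringsAsFactors = FALSE)
names(res) <- paste0("V", LETTERS[seq_along(starts)])
res
```

Because the positions are derived from nchar(), the same code should carry over to longer strings without changes.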

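We could also do this with base R, by inserting a comma after every pair of characters except the last one and then reading the result as CSV:

read.csv(text = gsub('(..)(?!$)', '\\1,', df$a, perl = TRUE),
         col.names = paste0("V", LETTERS[1:3]), header = FALSE)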
#  VA VB VC
#1 Aa Bb CC
#2 AA BB CC
#3 AA bb CC

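If we are reading directly from a file, another option is read.fwf, which reads fixed-width fields (widths gives the width of each field, and skip = 1 skips the first line of the file, for example a header line):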

read.fwf(file="yourfile.txt", widths=c(2,2,2), skip=1)
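
read.fwf can also be tried on a column that is already in memory by feeding it through a text connection. The sketch below assumes equal-length strings with an even number of characters; n_pairs, con and res are illustrative names, and colClasses = "character" is added here so that the pieces are kept as text.

```r
df <- data.frame(a = c("AaBbCC", "AABBCC", "AAbbCC"),
                 stringsAsFactors = FALSE)

n_pairs <- nchar(df$a[1]) / 2     # number of two-character fields

con <- textConnection(df$a)       # treat the column as lines of a file
res <- read.fwf(con, widths = rep(2, n_pairs),
                col.names = paste0("V", LETTERS[seq_len(n_pairs)]),
                colClasses = "character")
close(con)                        # the connection was opened here, so close it here
res
```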

Tags: string, dataframe, r, tidyr

user2386786

2 Answers: Jaap, akrun
