 

Splitting string by delimiter in R [duplicate]

Tags: string, split, r

I have the following string

x <- "b|all|the|experts|admit|that|we|should|legalise|drugs|b|war|in|south|osetia|pictures|made|by|a|russian|soldier|b|swedish|wrestler|ara|abrahamian|throws|away|medal|in|olympic|hissy|fit|b|russia|exaggerated|the|death|toll|in|south|ossetia|now|only|were|originally|killed|compared|to|b|missile|that|killed|inside|pakistan|may|have|been|launched|by|the|cia|b|rushdie|condemns|random|house|s|refusal|to|publish|novel|for|fear|of|muslim|retaliation|b|poland|and|us|agree|to|missle|defense|deal|interesting|timing|b|will|the|russians|conquer|tblisi|bet|on|it|no|seriously|you|can|bet|on|it|b|russia|exaggerating|south|ossetian|death|toll|says|human|rights|group|b|musharraf|expected|to|resign|rather|than|face|impeachment|b|moscow|made|plans|months|ago|to|invade|georgia|b|why|russias|response|to|georgia|was|right|b|nigeria|has|handed|over|the|potentially|oil|rich|bakassi|peninsula|to|cameroon|b|the|us|and|poland|have|agreed|a|preliminary|deal|on|plans|for|the|controversial|us|defence|shield"

When I try to split this using

> strsplit(x,"|")
[[1]]
  [1] "b" "|" "a" "l" "l" "|" "t" "h" "e" "|" "e" "x" "p" "e" "r" "t" "s" "|" "a" "d" "m" "i" "t" "|" "t" "h" "a" "t" "|"
 [30] "w" "e" "|" "s" "h" "o" "u" "l" "d" "|" "l" "e" "g" "a" "l" "i" "s" "e" "|" "d" "r" "u" "g" "s" "|" "b" "|" "w" "a"
 [59] "r" "|" "i" "n" "|" "s" "o" "u" "t" "h" "|" "o" "s" "e" "t" "i" "a" "|" "p" "i" "c" "t" "u" "r" "e" "s" "|" "m" "a"
 [88] "d" "e" "|" "b" "y" "|" "a" "|" "r" "u" "s" "s" "i" "a" "n" "|" "s" "o" "l" "d" "i" "e" "r" "|" "b" "|" "s" "w" "e"
[117] "d" "i" "s" "h" "|" "w" "r" "e" "s" "t" "l" "e" "r" "|" "a" "r" "a" "|" "a" "b" "r" "a" "h" "a" "m" "i" "a" "n" "|"
[146] "t" "h" "r" "o" "w" "s" "|" "a" "w" "a" "y" "|" "m" "e" "d" "a" "l" "|" "i" "n" "|" "o" "l" "y" "m" "p" "i" "c" "|"
[175] "h" "i" "s" "s" "y" "|" "f" "i" "t" "|" "b" "|" "r" "u" "s" "s" "i" 
.........

However, I want the words separated by the delimiter |. Where am I going wrong?

Rajarshi Bhadra asked Feb 18 '17 20:02



1 Answer

The | character has a special meaning in regular expressions: it is the alternation (OR) operator. So your split pattern is read as:

empty string OR empty string == empty string

That is why your input string is split character by character. To use | as an ordinary character, without its regular expression meaning, you have to escape it, like this:

strsplit(x, "\\|")
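
For comparison, here is a minimal sketch of the escaped pattern next to two other ways of making strsplit treat the pipe literally: a character class and the fixed = TRUE argument. The short string below is a hypothetical sample cut down from the one in the question, and the variable names are only for illustration.

```r
# Hypothetical shortened sample in the same format as the question's string
x <- "b|all|the|experts|admit"

# An empty pattern splits into single characters, which is what the
# unescaped "|" pattern ended up doing in the question
chars <- strsplit("a|b", "")[[1]]

# Option 1: escape the pipe so the regex engine treats it literally
words_escaped <- strsplit(x, "\\|")[[1]]

# Option 2: put the pipe inside a character class, where it is not special
words_class <- strsplit(x, "[|]")[[1]]

# Option 3: turn off regex interpretation and match the string as-is
words_fixed <- strsplit(x, "|", fixed = TRUE)[[1]]

print(words_fixed)
# [1] "b"       "all"     "the"     "experts" "admit"
```

strsplit returns a list with one element per input string, so [[1]] pulls out the character vector of words for a single input.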
bartektartanus answered Sep 22 '22 01:09
