I have the following string
x <- "b|all|the|experts|admit|that|we|should|legalise|drugs|b|war|in|south|osetia|pictures|made|by|a|russian|soldier|b|swedish|wrestler|ara|abrahamian|throws|away|medal|in|olympic|hissy|fit|b|russia|exaggerated|the|death|toll|in|south|ossetia|now|only|were|originally|killed|compared|to|b|missile|that|killed|inside|pakistan|may|have|been|launched|by|the|cia|b|rushdie|condemns|random|house|s|refusal|to|publish|novel|for|fear|of|muslim|retaliation|b|poland|and|us|agree|to|missle|defense|deal|interesting|timing|b|will|the|russians|conquer|tblisi|bet|on|it|no|seriously|you|can|bet|on|it|b|russia|exaggerating|south|ossetian|death|toll|says|human|rights|group|b|musharraf|expected|to|resign|rather|than|face|impeachment|b|moscow|made|plans|months|ago|to|invade|georgia|b|why|russias|response|to|georgia|was|right|b|nigeria|has|handed|over|the|potentially|oil|rich|bakassi|peninsula|to|cameroon|b|the|us|and|poland|have|agreed|a|preliminary|deal|on|plans|for|the|controversial|us|defence|shield"
When I try to split this using
> strsplit(x,"|")
[[1]]
[1] "b" "|" "a" "l" "l" "|" "t" "h" "e" "|" "e" "x" "p" "e" "r" "t" "s" "|" "a" "d" "m" "i" "t" "|" "t" "h" "a" "t" "|"
[30] "w" "e" "|" "s" "h" "o" "u" "l" "d" "|" "l" "e" "g" "a" "l" "i" "s" "e" "|" "d" "r" "u" "g" "s" "|" "b" "|" "w" "a"
[59] "r" "|" "i" "n" "|" "s" "o" "u" "t" "h" "|" "o" "s" "e" "t" "i" "a" "|" "p" "i" "c" "t" "u" "r" "e" "s" "|" "m" "a"
[88] "d" "e" "|" "b" "y" "|" "a" "|" "r" "u" "s" "s" "i" "a" "n" "|" "s" "o" "l" "d" "i" "e" "r" "|" "b" "|" "s" "w" "e"
[117] "d" "i" "s" "h" "|" "w" "r" "e" "s" "t" "l" "e" "r" "|" "a" "r" "a" "|" "a" "b" "r" "a" "h" "a" "m" "i" "a" "n" "|"
[146] "t" "h" "r" "o" "w" "s" "|" "a" "w" "a" "y" "|" "m" "e" "d" "a" "l" "|" "i" "n" "|" "o" "l" "y" "m" "p" "i" "c" "|"
[175] "h" "i" "s" "s" "y" "|" "f" "i" "t" "|" "b" "|" "r" "u" "s" "s" "i"
.........
However, I want the words separated by the delimiter |. Where am I going wrong?
Alternatively, the str_split function from the stringr package can split a string by a delimiter. It works much like strsplit: both treat the pattern as a regular expression by default, so a literal | must still be escaped (or wrapped in stringr's fixed()).
To split one string into a fixed number of pieces, for example to spread a column across several columns, use str_split_fixed() from the same package. It returns a character matrix with one column per piece.
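As a rough illustration of the two stringr functions mentioned above, the sketch below splits a shortened stand-in for the question's string. It assumes the stringr package is installed; the sample string and the choice of n = 3 are made up for the example.

```r
library(stringr)

# Hypothetical short sample standing in for the long string in the question.
x <- "b|all|the|experts|admit"

# fixed("|") tells stringr to match the pipe literally instead of as regex OR.
words <- str_split(x, fixed("|"))[[1]]

# str_split_fixed() returns a character matrix with exactly n columns;
# whatever is left after the first n - 1 splits stays in the last column.
pieces <- str_split_fixed(x, fixed("|"), n = 3)

print(words)
print(pieces)
```

Writing the pattern as "\\|" instead of fixed("|") should give the same pieces here; fixed() just avoids having to think about escaping.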
The | character has a special meaning in regular expressions: it means OR (alternation). So your split pattern is read as:
empty string OR empty string == empty string
and that is why your input string is split character by character. To use | as an ordinary character, without its special regular-expression meaning, you have to escape it, like this:
strsplit(x, "\\|")
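For comparison, here is a small sketch of the escaped pattern next to two other base R ways of matching a literal pipe. The short string is a made-up stand-in for the one in the question; all three calls should return the same words.

```r
# Hypothetical short sample standing in for the long string in the question.
x <- "b|all|the|experts|admit"

# 1. Escape the pipe so the regex engine treats it as a literal character.
escaped <- strsplit(x, "\\|")[[1]]

# 2. Turn off regex interpretation entirely with fixed = TRUE.
literal <- strsplit(x, "|", fixed = TRUE)[[1]]

# 3. Put the pipe inside a character class, where it has no special meaning.
bracketed <- strsplit(x, "[|]")[[1]]

print(escaped)
# [1] "b"       "all"     "the"     "experts" "admit"

# All three approaches agree.
stopifnot(identical(escaped, literal), identical(escaped, bracketed))
```

fixed = TRUE is often the simplest choice when the delimiter is a plain string, since nothing in it needs escaping.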