I have a string vector that looks like:
> string_vec
[1] "XXX" "Snakes On A Plane" "Mask of the Ninja" "Ruslan"
[5] "Kill Switch" "Buddy Holly Story, The" "Believers, The" "Closet, The"
[9] "Eyes of Tammy Faye, The" "Gymnast, The" "Hunger, The"
Some of the names end in ", The". I want to delete the comma and the space and move "The" to the front of the title.
For example, "Buddy Holly Story, The" becomes "The Buddy Holly Story".
Isolating the records that match the pattern was easy:
string_vec[grepl("[A-Za-z]+, The$", string_vec)]
How can I adjust the position now?
string_vec <- c("XXX", "Snakes On A Plane", "Mask of the Ninja", "Ruslan",
                "Kill Switch", "Buddy Holly Story, The", "Believers, The",
                "Closet, The", "Eyes of Tammy Faye, The", "Gymnast, The",
                "Hunger, The")
In R, substr() and substring() either extract part of a character string or, when used on the left-hand side of an assignment, replace part of it. Both are vectorised, so they work on a whole character vector or data frame column at once.
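As a rough illustration of the extract and replace forms described above, here is a small sketch on one title from the question. The variable names `title`, `first_word`, `article`, `manual` and `shouted` are only placeholders for this example, and the character positions are hard-coded for this particular string.

```r
title <- "Buddy Holly Story, The"

# Extract characters 1 through 5.
first_word <- substr(title, 1, 5)
first_word
#[1] "Buddy"

# substring() reads to the end of the string when no end position is given.
article <- substring(title, 20)
article
#[1] "The"

# Move the article by hand: drop the last 5 characters (", The") and
# paste "The" in front of what is left.
manual <- paste("The", substr(title, 1, nchar(title) - 5))
manual
#[1] "The Buddy Holly Story"

# Replacement form: overwrite characters 1 through 5 in a copy.
shouted <- title
substr(shouted, 1, 5) <- "BUDDY"
shouted
#[1] "BUDDY Holly Story, The"
```

Doing the move by position works, but it needs a separate check for which titles end in ", The", which is why a regular expression is the more convenient tool here.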
sub() replaces the first match of a regular expression in each element of a character vector; gsub() replaces every match. Elements that do not match are returned unchanged. sub() does not modify its input, so assign the result back to the same vector or column to keep it.
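A minimal sketch of that behaviour, using a made-up vector `x` rather than the data from the question:

```r
x <- c("a-b-c", "abc")

# sub() changes only the first "-" in each element.
first_only <- sub("-", "+", x)
first_only
#[1] "a+b-c" "abc"

# gsub() changes every "-"; "abc" has no match and comes back unchanged.
every_match <- gsub("-", "+", x)
every_match
#[1] "a+b+c" "abc"

# x itself is untouched until the result is assigned back.
x
#[1] "a-b-c" "abc"
```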
You may try
sub('^(.*), The', 'The \\1', string_vec)
#[1] "XXX" "Snakes On A Plane" "Mask of the Ninja"
#[4] "Ruslan" "Kill Switch" "The Buddy Holly Story"
#[7] "The Believers" "The Closet" "The Eyes of Tammy Faye"
#[10] "The Gymnast" "The Hunger"
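In the pattern above, `(.*)` captures everything before ", The" and `\\1` puts that captured text back after the new leading "The". One possible refinement, offered only as a sketch: the pattern is not anchored at the end, so a title with ", The" in the middle would also be rewritten. Adding `$` restricts the match to the end of the string, and a second group can handle other articles. The titles below are hypothetical and are not in the original vector.

```r
# Hypothetical titles chosen to show the edge cases.
titles <- c("Closet, The", "Good, The Bad", "Beautiful Mind, A", "XXX")

# Unanchored pattern: also rewrites ", The" in the middle of a title.
unanchored <- sub("^(.*), The", "The \\1", titles)
unanchored
#[1] "The Closet"        "The Good Bad"      "Beautiful Mind, A" "XXX"

# Anchored with $ and generalised to "The", "An" and "A".
# \\2 is the captured article, \\1 is the rest of the title.
anchored <- sub("^(.*), (The|An|A)$", "\\2 \\1", titles)
anchored
#[1] "The Closet"       "Good, The Bad"    "A Beautiful Mind" "XXX"
```

On the vector from the question both versions give the same result, since every ", The" there is at the end of the title.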