I need a regular expression to remove ALL single characters from a string, not just single letters or numbers.
The string is:
"A Future Ft Casino Karate Chop ( Prod By Metro )"
It should come out as:
"Future Ft Casino Karate Chop Prod By Metro"
The expression I am using at the moment (in PHP) correctly removes the single 'A' but leaves the single '(' and ')'.
This is the code I am using:
$string = preg_replace('/\b\w\b\s?/', '', $string);
$ means "Match the end of the string" (the position after the last character in the string). Both are called anchors and ensure that the entire string is matched instead of just a substring.
To remove a character in an R data frame column, we can use gsub function which will replace the character with blank. For example, if we have a data frame called df that contains a character column say x which has a character ID in each value then it can be removed by using the command gsub("ID","",as.
Basically (0+1)* matches any sequence of ones and zeroes. So, in your example (0+1)*1(0+1)* should match any sequence that has a 1. It would not match 000, but it would match 010, 1, 111 etc. (0+1) means 0 OR 1.
In other words, square brackets match exactly one character. (a-z0-9) will match two characters, the first is one of abcdefghijklmnopqrstuvwxyz , the second is one of 0123456789 , just as if the parenthesis weren't there. The () will allow you to read exactly which characters were matched.
Try this:
(^| ).( |$)
Breakdown:
1. (^| ) -> Beginning of line or space
2. . -> Any character
3. ( |$) -> Space or End of line
Actual code:
$string = preg_replace('/(^| ).( |$)/', '$1', $string);
Note: I'm not familiar with the workings of PHP regex, so the code might need a slight tweak depending on how the regex needs to be declared.
As m.buettner pointed out, there will be a trailing white space here with this code. A trim would be needed to clear it out.
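For what it's worth, here is a minimal sketch of the code above with the trim applied. The function name remove_single_chars is made up for illustration and is not from the question; the sketch also assumes single characters are only ever separated by plain spaces.

```php
<?php
// Hypothetical helper (the name is illustrative, not from the question):
// removes single characters delimited by spaces or by the string boundaries.
function remove_single_chars(string $s): string
{
    // '$1' puts back the leading space (or nothing at the start of the
    // string), so the neighbouring words stay separated.
    $s = preg_replace('/(^| ).( |$)/', '$1', $s);

    // A single character at the very end leaves a trailing space behind,
    // so trim it off.
    return trim($s);
}

echo remove_single_chars('A Future Ft Casino Karate Chop ( Prod By Metro )');
// Future Ft Casino Karate Chop Prod By Metro
```

Since `.` matches any character except a newline, this treats letters, digits and punctuation such as '(' and ')' the same way.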
Edit: Arnis Juraga pointed out that this would not clear out multiple single characters in a row: a b c would filter out to b. If this is an issue, use this regex:
(^| ).(( ).)*( |$)
The (( ).)* added to the middle will look for any space followed by any character, 0 or more times. The downside is this can end up with double spaces where a series of single characters were located.
Meaning this:
The a b c dog
Will become this:
The dog
After performing the replacement to remove the single characters, you would need to use the following regex to locate the double spaces, then replace them with a single space:
( ){2}
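Putting the steps together, a rough sketch could look like the following. The function name remove_single_char_runs is hypothetical, and using '$1' as the replacement for the longer regex is my assumption, carried over from the earlier code. Whether double spaces actually show up depends on the replacement string you pick, so treat the cleanup steps as a safety net.

```php
<?php
// Hypothetical helper (the name is illustrative): handles runs of single
// characters such as "a b c", then tidies up the whitespace.
function remove_single_char_runs(string $s): string
{
    // Step 1: drop a single character plus any " x" pairs that follow it.
    // Assumption: '$1' is used as the replacement, as in the earlier code.
    $s = preg_replace('/(^| ).(( ).)*( |$)/', '$1', $s);

    // Step 2: replace each pair of spaces with a single space.
    $s = preg_replace('/( ){2}/', ' ', $s);

    // Step 3: remove any leading or trailing space left behind.
    return trim($s);
}

echo remove_single_char_runs('The a b c dog'), "\n";
echo remove_single_char_runs('A Future Ft Casino Karate Chop ( Prod By Metro )'), "\n";
```

Note that ( ){2} only replaces pairs of spaces in a single pass, so longer runs of spaces may need the replacement to be applied more than once.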