For example, if I have a string like "Hello I B M", how do I detect the space between the uppercase letters but not between the "o" and the "I"?
Basically, "Hello I B M" should resolve to "Hello IBM".
So far, I have this:
value = "Hello I B M"
value = value.replace(/([A-Z])\s([A-Z])/g, '$1$2')
But it only removes the first space between two uppercase letters, giving "Hello IB M".
--EDIT--
Solution Part 1:
value = value.replace(/([A-Z])\s(?=[A-Z])/g, '$1')
Thanks to Renato for the first part of the solution! I just found out that if there is a capitalized word AFTER an uppercase letter, it swallows that space as well. How do we preserve the space there?
So "Hello I B M Dude" becomes "Hello IBMDude" instead of "Hello IBM Dude".
Java regex remove spaces
In Java, we can use the regex \\s+ to match runs of whitespace characters, and replaceAll("\\s+", " ") to replace each run with a single space.
The most common forms of whitespace you will use with regular expressions are the space (␣), the tab (\t), the newline (\n) and the carriage return (\r) (useful in Windows environments). Each of these escape sequences matches its respective whitespace character.
The replaceAll() method of the String class replaces each substring of this string that matches the given regular expression with the given replacement. You can remove white spaces from a string by replacing " " with "".
JavaScript String trim()
The trim() method removes whitespace from both ends of a string. The trim() method does not change the original string.
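As a rough JavaScript counterpart to the whitespace notes above, the sketch below collapses runs of whitespace with replace() and then strips the ends with trim(). The sample string and the variable names are made up for illustration.

```javascript
// Made-up sample containing spaces, a tab and a newline.
const raw = "  Hello \t I\nB   M  ";

// \s+ matches one or more whitespace characters; the g flag replaces every run.
const collapsed = raw.replace(/\s+/g, " ");

// trim() returns a new string without leading/trailing whitespace;
// neither call modifies `raw`.
const trimmed = collapsed.trim();

console.log(JSON.stringify(collapsed));
console.log(JSON.stringify(trimmed));
```

Both calls return new strings, so the original value has to be reassigned if the cleaned-up version is the one you want to keep.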
When the regex matches the first time (on "I B" in your example), that part of the string is consumed by the engine, so the "B" cannot serve as the first letter of the next match, even though your regex has the global ('g') flag.
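To see the consumption described above, one option is to list the matches with matchAll(). This is only an illustrative sketch; the variable names are not from the question.

```javascript
const input = "Hello I B M";
const consuming = /([A-Z])\s([A-Z])/g;

// Collect the start index and full text of every match.
const consumed = [...input.matchAll(consuming)].map((m) => ({
  index: m.index,
  text: m[0],
}));
console.log(consumed);

// The second capital is part of the match, so scanning resumes after it
// and the remaining " M" has no uppercase letter in front of it to pair with.
const result = input.replace(consuming, "$1$2");
console.log(result);
```

Only one match is reported because the second uppercase letter belongs to it and cannot start another match.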
You could achieve the expected result by using a positive lookahead ((?=PATTERN)) instead, which checks what follows without consuming it:
value = "Hello I B M"
value = value.replace(/([A-Z])\s(?=[A-Z])/g, '$1')
console.log(value) // Prints "Hello IBM"
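If it helps to see why this works, the same matchAll() trick can show what the lookahead version actually consumes. This is an illustrative sketch with made-up variable names.

```javascript
const input = "Hello I B M";
const withLookahead = /([A-Z])\s(?=[A-Z])/g;

// Each entry is [start index, matched text].
const matches = [...input.matchAll(withLookahead)].map((m) => [m.index, m[0]]);
console.log(matches);
```

Each match covers only one letter and the space after it. The uppercase letter inside the lookahead is left in place, so it can start the next match.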
To keep the space when the next uppercase letter is the first letter of a word, you can extend the lookahead with a word boundary \b, so that the following uppercase letter must stand alone:
value = "Hello I B M Dude"
value = value.replace(/([A-Z])\s(?=[A-Z]\b)/g, '$1')
console.log(value) // Prints "Hello IBM Dude"
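A quick way to check this pattern is to run it over a few inputs. The helper name joinSpacedCapitals and the sample strings below are hypothetical, chosen only for this sketch.

```javascript
// Hypothetical helper name, used only for this sketch.
function joinSpacedCapitals(text) {
  // Drop a whitespace character that sits between an uppercase letter and
  // a following uppercase letter that ends at a word boundary.
  return text.replace(/([A-Z])\s(?=[A-Z]\b)/g, "$1");
}

const samples = ["Hello I B M", "Hello I B M Dude", "I B M", "A Dog", "HELLO I B M"];
const results = samples.map(joinSpacedCapitals);

results.forEach((out, i) => {
  console.log(JSON.stringify(samples[i]), "->", JSON.stringify(out));
});
```

One limitation shows up in the last sample: an all-caps word ends in an uppercase letter, so the space after "HELLO" is removed too and the result is "HELLOIBM". If that case matters for your data, the pattern would need a further restriction.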
Note: As @CasimirHyppolite noted, the character after the following uppercase letter has to be optional, or the second regex won't work if the last character of the string is uppercase. Hence the pattern ([^A-Za-z]|$), which can be read as "not a letter, or the end of the string".
Edit: Simplified the lookahead from (?=[A-Z]([^A-Za-z]|$)) to (?=[A-Z]\b), as suggested by @hwnd.
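The two lookaheads agree on the strings from the question, but they are not strictly identical: \b treats digits and underscores as word characters, while [^A-Za-z] does not. The sketch below compares them. The compare helper and the "I B M3" sample are made up for illustration.

```javascript
const explicit = /([A-Z])\s(?=[A-Z]([^A-Za-z]|$))/g;
const boundary = /([A-Z])\s(?=[A-Z]\b)/g;

// Hypothetical helper: apply both patterns to the same input.
const compare = (s) => ({
  explicit: s.replace(explicit, "$1"),
  boundary: s.replace(boundary, "$1"),
});

const same = compare("Hello I B M Dude");
const differs = compare("I B M3"); // a digit directly after the last capital

console.log(same);
console.log(differs);
```

For "I B M3" the explicit form joins all three letters, while the \b form keeps the space before "M3" because there is no word boundary between "M" and "3". Which behaviour is preferable depends on the input you expect.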