
What is a regular expression for removing spaces between uppercase letters while keeping spaces between words?

For example, if I have a string like "Hello I B M", how do I detect the space between the uppercase letters but not between the "o" and the "I"?

Basically "Hello I B M" should resolve to "Hello IBM"

So far, I have this:

value = "Hello I B M"
value = value.replace(/([A-Z])\s([A-Z])/g, '$1$2')

But it only removes the first space between two uppercase letters, giving "Hello IB M".

--EDIT--

Solution Part 1:

 value = value.replace(/([A-Z])\s(?=[A-Z])/g, '$1')

Thanks to Renato for the first part of the solution! I just found out that if there is a capitalized word AFTER an uppercase letter, it swallows that space as well. How do we preserve the space there?

So "Hello I B M Dude" becomes "Hello IBMDude" instead of "Hello IBM Dude"

Steven Yuan asked Nov 05 '13 23:11



1 Answer

When the regex matches the first time (on "I B"), that part of the string is consumed by the engine, so the "B" cannot be matched again as the start of the next pair, even though your regex has the global ('g') flag.

You can achieve the expected result by using a positive lookahead ((?=PATTERN)) instead, which checks what follows without consuming it:

value = "Hello I B M"
value = value.replace(/([A-Z])\s(?=[A-Z])/g, '$1')
console.log(value) // Prints "Hello IBM"
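To make the "consumed" behaviour visible, here is a small sketch that lists what each pattern actually matches. It assumes an environment with String.prototype.matchAll (ES2020); the variable names are only for illustration.

```javascript
const input = "Hello I B M";

// Consuming pattern: each match swallows both letters,
// so "B" is no longer available to start the next match.
const consuming = [...input.matchAll(/([A-Z])\s([A-Z])/g)].map(m => m[0]);

// Lookahead pattern: only the first letter and the space are consumed,
// so the second letter can begin the following match.
const lookahead = [...input.matchAll(/([A-Z])\s(?=[A-Z])/g)].map(m => m[0]);

console.log(consuming); // [ 'I B' ]
console.log(lookahead); // [ 'I ', 'B ' ]
```

The first pattern finds a single match, which is why only one space is removed; the second finds a match for each space.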

To keep the space when the next uppercase letter is the first letter of a word, you can extend the lookahead with a word boundary (\b), so that it only matches an uppercase letter standing on its own:

value = "Hello I B M Dude"
value = value.replace(/([A-Z])\s(?=[A-Z]\b)/g, '$1')
console.log(value) // Prints "Hello IBM Dude"
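As a quick check, the regex can be wrapped in a helper and run against a few inputs. The function names below are hypothetical. The third input shows a limitation: the pattern only looks at the letter after the space, so the last letter of an all-caps word is also joined to a following single letter. The second helper is one possible way around that, assuming the engine supports lookbehind (ES2018); it is a sketch, not part of the original answer.

```javascript
// Hypothetical helper wrapping the regex from the answer.
function collapseInitials(text) {
  return text.replace(/([A-Z])\s(?=[A-Z]\b)/g, '$1');
}

// Assumed variant: also require the letter BEFORE the space to stand alone.
// Uses a lookbehind, which needs an ES2018-capable engine.
function collapseStandaloneInitials(text) {
  return text.replace(/(?<=\b[A-Z])\s(?=[A-Z]\b)/g, '');
}

console.log(collapseInitials("Hello I B M Dude"));    // "Hello IBM Dude"
console.log(collapseInitials("Hello I B M"));         // "Hello IBM"
console.log(collapseInitials("USA I B M"));           // "USAIBM"
console.log(collapseStandaloneInitials("USA I B M")); // "USA IBM"
```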

Note: As @CasimirHyppolite noted, the character after the next uppercase letter has to be optional, or the second regex won't work when the last character of the string is uppercase. An earlier version of this answer therefore used the pattern ([^A-Za-z]|$) inside the lookahead, which can be read as "not a letter, or the end of the string".

Edit: Simplified the lookahead from (?=[A-Z]([^A-Za-z]|$)) to (?=[A-Z]\b), as suggested by @hwnd.
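The two lookaheads agree on the examples from the question, but they are not strictly identical: \b treats digits and underscores as word characters, while [^A-Za-z] does not. The sketch below compares them; the variable names and the "Plan A B2" input are made up for illustration.

```javascript
const earlier = /([A-Z])\s(?=[A-Z]([^A-Za-z]|$))/g; // lookahead before the edit
const simplified = /([A-Z])\s(?=[A-Z]\b)/g;         // lookahead after the edit

const samples = ["Hello I B M", "Hello I B M Dude", "Plan A B2"];

// Apply both patterns to each sample and collect the results side by side.
const results = samples.map(s => ({
  input: s,
  earlier: s.replace(earlier, '$1'),
  simplified: s.replace(simplified, '$1'),
}));

console.log(results);
// "Hello I B M"      -> both give "Hello IBM"
// "Hello I B M Dude" -> both give "Hello IBM Dude"
// "Plan A B2"        -> earlier gives "Plan AB2", simplified leaves "Plan A B2"
```

Which behaviour is preferable depends on whether a letter followed by a digit should count as a standalone initial.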

Renato Zannon answered Oct 09 '22 15:10