I'm not very good at RegEx, can someone give me a regex (to use in Java) that will select all whitespace that isn't between two quotes? I am trying to remove all such whitespace from a string, so any solution to do so will work.
For example:
(this is a test "sentence for the regex")
should become
(thisisatest"sentence for the regex")
Using a regular expression: the simplest way to find all whitespace and replace it with an empty string is a regular expression. A whitespace character is written \s in regex, which is "\\s" as a Java string literal. All we have to do is find every occurrence and replace it with an empty string. Use "\\s+" to match a run of consecutive whitespace characters in one go.
The replaceAll() method accepts a regular expression and a replacement string, and replaces every match of the expression with that string. To remove all the whitespace from an input string, invoke replaceAll() on it, passing the regular expression above and an empty string as arguments.
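As a minimal sketch of the replaceAll() approach just described (the class and method names here are made up for illustration), the following strips every run of whitespace. Note that it does not look at quotes at all, so on its own it does not answer the question as asked:

```java
public class RemoveAllWhitespace {
    // Removes every run of whitespace, including whitespace inside quotes.
    static String removeAll(String input) {
        return input.replaceAll("\\s+", "");
    }

    public static void main(String[] args) {
        String text = "(this is a test \"sentence for the regex\")";
        // The quoted part loses its spaces too.
        System.out.println(removeAll(text));
    }
}
```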
In order to use a literal ^ at the start or a literal $ at the end of a regex, the character must be escaped. Some flavors only use ^ and $ as metacharacters when they are at the start or end of the regex respectively. In those flavors, no additional escaping is necessary. It's usually just best to escape them anyway.
Does the dot match whitespace characters in Python? Yes: with Python's re module, the dot matches whitespace characters too, except for a newline unless the re.DOTALL flag is set.
Here's a single regex-replace that works:
\s+(?=([^"]*"[^"]*")*[^"]*$)
which will replace:
(this is a test "sentence for the regex" foo bar)
with:
(thisisatest"sentence for the regex"foobar)
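In Java the pattern above has to be written as a string literal, with the backslash and the double quotes escaped. A small sketch, with illustrative class and method names, could look like this. The lookahead only succeeds when an even number of double quotes remains up to the end of the input, which is what marks a position as being outside a quoted section:

```java
public class StripOutsideQuotesRegex {
    // \s+ followed by a lookahead requiring an even number of
    // double quotes between the match and the end of the input.
    static final String REGEX = "\\s+(?=([^\"]*\"[^\"]*\")*[^\"]*$)";

    static String strip(String input) {
        return input.replaceAll(REGEX, "");
    }

    public static void main(String[] args) {
        String text = "(this is a test \"sentence for the regex\" foo bar)";
        System.out.println(strip(text));
    }
}
```

This assumes the quotes are balanced and never escaped.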
Note that if the quotes can be escaped with a backslash, this even more verbose regex will do the trick:
\s+(?=((\\[\\"]|[^\\"])*"(\\[\\"]|[^\\"])*")*(\\[\\"]|[^\\"])*$)
which replaces the input:
(this is a test "sentence \"for the regex" foo bar)
with:
(thisisatest"sentence \"for the regex"foobar)
(Note that it also works with escaped backslashes, giving for example: (thisisatest"sentence \\\"for the regex"foobar).)
Needless to say (?), this really shouldn't be used to perform such a task: it makes one's eyes bleed, and it runs in quadratic time because the lookahead rescans the rest of the input for every whitespace match, while a simple linear solution exists.
A quick demo:
String text = "(this is a test \"sentence \\\"for the regex\" foo bar)";
String regex = "\\s+(?=((\\\\[\\\\\"]|[^\\\\\"])*\"(\\\\[\\\\\"]|[^\\\\\"])*\")*(\\\\[\\\\\"]|[^\\\\\"])*$)";
System.out.println(text.replaceAll(regex, ""));
// output: (thisisatest"sentence \"for the regex"foobar)
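The answer mentions a simple linear solution without showing it. One way it might look is a single pass that tracks whether the current position is inside quotes. This is only a sketch under stated assumptions: the class and method names are invented here, a backslash is taken to escape the character that follows it, and Character.isWhitespace() stands in for \s even though the two differ on a few Unicode characters.

```java
public class StripOutsideQuotes {
    // Single pass: drop whitespace unless we are inside double quotes.
    static String strip(String input) {
        StringBuilder out = new StringBuilder(input.length());
        boolean inQuotes = false;
        boolean escaped = false; // true when the previous char was an unescaped backslash
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"' && !escaped) {
                inQuotes = !inQuotes; // only unescaped quotes open or close a section
            }
            // Assumption: Character.isWhitespace approximates the regex class \s.
            if (inQuotes || !Character.isWhitespace(c)) {
                out.append(c);
            }
            escaped = (c == '\\') && !escaped;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "(this is a test \"sentence \\\"for the regex\" foo bar)";
        System.out.println(strip(text));
    }
}
```

Each character is visited once, so the running time grows linearly with the input length.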