I want to fix malformed ellipses (...) in a String.
"Hello.. World.."
"Hello... World..." // this is correct
"Hello.... World...."
"Hello..... World....."
should all be corrected to:
"Hello... World..."
The following regex handles any instance of 3 or more consecutive .'s:
line.replaceAll("\\.{3,}", "...");
However, I don't know how to handle the case when there are exactly 2 consecutive .'s. We cannot do something like this:
line.replaceAll("\\.{2}", "...");
For example, for "...", the code above will return "....", as the regex will replace the first 2 .'s (index 0 and 1) with "..." and leave the third . (index 2) untouched, since matches cannot overlap, resulting in "..." + "." = "....".
Something like this works:
line.replaceAll("\\.{2}", "...").replaceAll("\\.{3,}", "...");
...but there must be a better way!
You can replace any group of two or more .'s:
[.]{2,}
with ...
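As a minimal sketch of how that could look in Java (the class and method names here are made up for illustration, not from the question), collapsing every run of two or more dots into exactly three:

```java
class EllipsisFixer {
    // Hypothetical helper: replaces any run of two or more dots with "...".
    static String fixEllipses(String line) {
        return line.replaceAll("[.]{2,}", "...");
    }

    public static void main(String[] args) {
        String[] samples = {
            "Hello.. World..",
            "Hello... World...",
            "Hello.... World....",
            "Hello..... World....."
        };
        for (String s : samples) {
            // Each sample prints: Hello... World...
            System.out.println(fixEllipses(s));
        }
    }
}
```

A correct "..." is also matched and replaced by itself, which is harmless, and a single . is left alone.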
Why not keep it simple?
\.\.+
If you really don't want it to mess with groups of 3, there's this:
\.{4,}|(?<!\.)\.{2}(?!\.)
This looks for groups of 4 or more first, then for groups of exactly 2. The special thing about "..." is that it contains 2 overlapping groups of "..". So with (?!\.) you look for a 3rd "." after the first 2; if that 3rd "." exists, the match is discarded. This is called negative lookahead. To discard the 2nd "..", you have to perform negative lookbehind: (?<!\.) looks for a "." before the 2nd "..", and the match is discarded if one is found.
Negative lookbehind isn't supported by JavaScript regex engines that predate ES2018, so I tested this with a tester that uses the Java regex engine.
Link: https://www.myregextester.com/?r=d41b2f7e
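Here is a rough sketch of that pattern in Java, in case it helps (the class and method names are invented for the example). Note the doubled backslashes needed inside a Java string literal:

```java
import java.util.regex.Pattern;

class StrictEllipsisFixer {
    // Either a run of 4+ dots, or exactly two dots with no dot on either side.
    // A correct "..." matches neither alternative and is left untouched.
    private static final Pattern MALFORMED =
        Pattern.compile("\\.{4,}|(?<!\\.)\\.{2}(?!\\.)");

    // Hypothetical helper: rewrites only the malformed runs to "...".
    static String fixEllipses(String line) {
        return MALFORMED.matcher(line).replaceAll("...");
    }

    public static void main(String[] args) {
        System.out.println(fixEllipses("Hello.. World...."));  // Hello... World...
        System.out.println(fixEllipses("Hello... World..."));  // Hello... World...
        System.out.println(fixEllipses("Wait. What?"));        // Wait. What?
    }
}
```

The result is the same as with the simpler patterns; the only difference is that an already correct "..." is never matched at all.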