
Simple Java regex not working

I have this regex, which is supposed to remove sentence delimiters (. and ?):

sentence = sentence.replaceAll("\\.|\\?$","");

It works fine for these cases, converting:

"I am Java developer." to "I am Java developer"

"Am I a Java developer?" to "Am I a Java developer"

But after deployment we found that it also removes any other dots in the sentence, for example:

"Hi.Am I a Java developer?" becomes "HiAm I a Java developer"

Why is this happening?

user489849 asked Oct 28 '10 08:10


2 Answers

The pipe (|), the alternation operator, has the lowest precedence of all regex operators. So your regex:

\\.|\\?$

is being treated as:

(\\.)|(\\?$)

which matches a . anywhere in the string, or a ? only at the end of the string. The $ anchor applies to the second alternative alone.
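To see the effect of that grouping, here is a minimal sketch that runs the pattern from the question on two inputs. The class name AlternationDemo, the method name stripOriginal, and the second input are made up for illustration; only the pattern comes from the question.

```java
public class AlternationDemo {
    // Hypothetical helper wrapping the pattern from the question.
    // The regex is parsed as (\.)|(\?$): a dot anywhere, or a '?' at the end.
    static String stripOriginal(String sentence) {
        return sentence.replaceAll("\\.|\\?$", "");
    }

    public static void main(String[] args) {
        // Every dot is removed, plus the trailing '?'.
        System.out.println(stripOriginal("Hi.Am I a Java developer?"));
        // prints: HiAm I a Java developer

        // The '?' in the middle survives because only the '?' branch is anchored.
        System.out.println(stripOriginal("Really? Yes."));
        // prints: Really? Yes
    }
}
```

The second input shows the asymmetry: the dot is removed wherever it appears, while the question mark is removed only when it is the last character.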

To fix this you need to group the . and ? together as:

(?:\\.|\\?)$

You could also use:

[.?]$

Within a character class, . and ? are treated literally, so you need not escape them.
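Both fixes can be checked side by side with a short sketch. The class name TrailingDelimiter and the method names stripGrouped and stripCharClass are hypothetical; the two patterns are the ones given above.

```java
import java.util.regex.Pattern;

public class TrailingDelimiter {
    // Precompiled character-class form: '.' or '?' only at the end of the input.
    private static final Pattern TRAILING = Pattern.compile("[.?]$");

    // Grouped alternation: the $ anchor now applies to both alternatives.
    static String stripGrouped(String sentence) {
        return sentence.replaceAll("(?:\\.|\\?)$", "");
    }

    // Character-class variant, equivalent for these inputs.
    static String stripCharClass(String sentence) {
        return TRAILING.matcher(sentence).replaceAll("");
    }

    public static void main(String[] args) {
        String input = "Hi.Am I a Java developer?";
        System.out.println(stripGrouped(input));   // prints: Hi.Am I a Java developer
        System.out.println(stripCharClass(input)); // prints: Hi.Am I a Java developer
    }
}
```

Precompiling the pattern is optional here; it only avoids recompiling the regex on every call if the method runs often.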

codaddict answered Sep 20 '22 14:09


What you're saying with "\\.|\\?$" is "either a period anywhere, or a question mark as the last character".

I would recommend "[.?]$" instead in order to avoid the confusing escaping (and undesirable result, of course).

jensgram answered Sep 18 '22 14:09