
Android optional word boundary regex

I'm having trouble with a regular expression when targeting the Android platform 2.2.3.

The following regular expression works when targeting the Java VM on my desktop, and it also works in a .NET application.

Pattern.compile("\\b?")

But when I target my phone I get a PatternSyntaxException. Any ideas?

Tentux asked Oct 03 '12 22:10



1 Answer

I can confirm that this throws a PatternSyntaxException when running in the Android emulator, but not in a regular Java application. The only explanation I can see is that the regular expression implementation used in Android differs from the one in the standard Java SDK. From the Pattern page on Android Developers:

The regular expression implementation used in Android is provided by ICU. The notation for the regular expressions is mostly a superset of those used in other Java language implementations. This means that existing applications will normally work as expected, but in rare cases Android may accept a regular expression that is not accepted by other implementations.

As a workaround, I found that you can avoid the exception by enclosing the word boundary assertion in a non-capturing group.

Pattern.compile("(?:\\b)?");

(A capturing group works as well, but I doubt you need it.)
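If the same code has to run on both the desktop JVM and Android, one option is to try the original pattern first and fall back to the grouped form only when compilation fails. This is a minimal sketch, not a tested Android fix: the class name `OptionalBoundary` and the helper `compileOptionalBoundary` are hypothetical names of my own, and I'm assuming the grouped form behaves the same as the bare `\b?` wherever both compile.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class OptionalBoundary {

    // Hypothetical helper: prefer the original pattern, fall back to the
    // non-capturing-group form if the engine rejects a quantified \b.
    static Pattern compileOptionalBoundary() {
        try {
            return Pattern.compile("\\b?");
        } catch (PatternSyntaxException e) {
            // Reported above to compile on Android as well.
            return Pattern.compile("(?:\\b)?");
        }
    }

    public static void main(String[] args) {
        Pattern p = compileOptionalBoundary();
        // An optional zero-width assertion can always match the empty string.
        System.out.println(p.matcher("").matches());
    }
}
```

In practice you could skip the try/catch and just use `(?:\\b)?` everywhere, since it compiles on both platforms.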

I suggest you report this as a bug to see if you can get an official response. (I already searched, and it doesn't appear to be reported yet.)

Bill the Lizard answered Oct 05 '22 23:10