Java regular expression match

Tags: java, regex

I need to match a string that begins with a number, followed by a dot, then exactly one space, then one or more upper case characters. The match must occur at the beginning of the string. I have the following string:

1. PTYU fmmflksfkslfsm

The regular expression that I tried with is:

^\d+[.]\s{1}[A-Z]+

And it does not match. What would a working regular expression be for this problem?

user152508 asked Dec 16 '10 17:12



2 Answers

(Sorry for my earlier error. Brain now firmly engaged. Er, probably.)

This works:

String rex = "^\\d+\\.\\s\\p{Lu}+.*";

System.out.println("1. PTYU fmmflksfkslfsm".matches(rex));
// true

System.out.println(". PTYU fmmflksfkslfsm".matches(rex));
// false, missing leading digit

System.out.println("1.PTYU fmmflksfkslfsm".matches(rex));
// false, missing space after .

System.out.println("1. xPTYU fmmflksfkslfsm".matches(rex));
// false, lower case letter before the upper case letters

Breaking it down:

  • ^ = Start of string
  • \d+ = One or more digits (the \ is escaped because it's in a string, hence \\)
  • \. = A literal . (or your original [.] is fine) (again, escaped in the string)
  • \s = One whitespace char (no need for the {1} after it) (I'll stop mentioning the escapes now)
  • \p{Lu}+ = One or more upper case letters (using the proper Unicode escape — thank you, tchrist, for pointing this out in your comment below. In English terms, the equivalent would be [A-Z]+)
  • .* = Anything else
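To make the difference between \p{Lu} and [A-Z] from the breakdown above concrete, here is a minimal sketch. The class name, the helper methods, and the accented sample string are made up for illustration; only the two patterns come from this thread.

```java
import java.util.regex.Pattern;

public class UpperCaseClassDemo {
    // Unicode-aware: \p{Lu} matches an upper case letter from any script.
    static final Pattern UNICODE = Pattern.compile("^\\d+\\.\\s\\p{Lu}+.*");
    // ASCII-only: [A-Z] matches just the 26 unaccented Latin capitals.
    static final Pattern ASCII = Pattern.compile("^\\d+\\.\\s[A-Z]+.*");

    static boolean matchesUnicode(String s) {
        return UNICODE.matcher(s).matches();
    }

    static boolean matchesAscii(String s) {
        return ASCII.matcher(s).matches();
    }

    public static void main(String[] args) {
        String plain = "1. PTYU fmmflksfkslfsm";
        // Hypothetical input: \u00C9 is an accented capital E.
        String accented = "1. \u00C9COLE fmmflksfkslfsm";

        System.out.println(matchesUnicode(plain));    // true
        System.out.println(matchesAscii(plain));      // true
        System.out.println(matchesUnicode(accented)); // true
        System.out.println(matchesAscii(accented));   // false, the accented E is outside A-Z
    }
}
```

If the input is known to be plain English, the two patterns behave the same, so the choice only matters for text with non-ASCII capitals.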

See the java.util.regex.Pattern documentation for details.

You only need the .* at the end if you're using a method like String#matches (above), which tries to match the entire string.
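As a rough sketch of the alternative, the pattern can be left without the trailing .* and applied with Matcher#lookingAt, which matches from the start of the input but does not require the whole input to match. Using lookingAt here is my own choice, not something from the question, and the class and method names are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PrefixMatchDemo {
    // No trailing .* here: the pattern describes only the required prefix.
    static final Pattern PREFIX = Pattern.compile("^\\d+\\.\\s\\p{Lu}+");

    // Returns the matched prefix, or null when the input does not start with it.
    static String matchedPrefix(String input) {
        Matcher m = PREFIX.matcher(input);
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(matchedPrefix("1. PTYU fmmflksfkslfsm")); // 1. PTYU
        System.out.println(matchedPrefix("1.PTYU fmmflksfkslfsm"));  // null, no space after the dot

        // The same pattern fails with String#matches, because the rest of
        // the string ("fmmflksfkslfsm" and the space before it) is not covered.
        System.out.println("1. PTYU fmmflksfkslfsm".matches(PREFIX.pattern())); // false
    }
}
```

A side benefit of this approach is that group() gives back just the numbered heading part, which is handy if you need it afterwards.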

T.J. Crowder answered Oct 07 '22 01:10


It depends on which method you are using. I think it will work if you use Matcher.find(). It will not work if you are using Matcher.matches(), because matches() has to match the whole input. If you are using matches(), fix your pattern as follows:

^\d+\.\s{1}[A-Z]+.*

(note the trailing .*)

And I'd also use \. instead of [.]. It is more readable.
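The find() versus matches() difference described above can be checked with a small test like the following. It is only a sketch: the class and helper names are made up, and the two patterns are the asker's original and the fixed version from this answer, written as Java string literals.

```java
import java.util.regex.Pattern;

public class FindVersusMatchesDemo {
    static final String INPUT = "1. PTYU fmmflksfkslfsm";

    // The pattern from the question, unchanged.
    static final Pattern ORIGINAL = Pattern.compile("^\\d+[.]\\s{1}[A-Z]+");
    // The same pattern with \. and a trailing .* so it can cover the whole input.
    static final Pattern WITH_TAIL = Pattern.compile("^\\d+\\.\\s{1}[A-Z]+.*");

    // find() succeeds if the pattern matches somewhere in the input
    // (here the ^ anchor still pins it to the start).
    static boolean found(Pattern p, String s) {
        return p.matcher(s).find();
    }

    // matches() succeeds only if the pattern covers the entire input.
    static boolean matchesAll(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(found(ORIGINAL, INPUT));       // true
        System.out.println(matchesAll(ORIGINAL, INPUT));  // false
        System.out.println(matchesAll(WITH_TAIL, INPUT)); // true
    }
}
```

So the original expression was not wrong as such; it just has to be paired with a method that accepts a partial match.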

AlexR answered Oct 07 '22 01:10