java regex for UUID

Tags: java, regex

I want to parse a String that contains a UUID in the format below:

"&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;"

I have tried parsing it the way shown below, which works, but I think it would be slow:

private static final String re1 = ".*?";
private static final String re2 = "([A-Z0-9]{8}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{12})";
private static final Pattern splitter = Pattern.compile(re1 + re2, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

I am looking for a faster way and tried the pattern below, but it fails to match:

private static final Pattern URN_UUID_PATTERN = Pattern.compile("^< urn:uuid:([^&])+&gt");

I am new to regex. Any help is appreciated.

Aqura asked Jun 03 '16 13:06


1 Answer

Your example of a faster regex is using a < where the input is &lt; so that's confusing.

Regarding speed: first, your UUID is hexadecimal, so match a-f rather than A-Z. Second, you give no indication that the case is mixed, so drop the case-insensitive flag and write the range in the correct case.

You don't explain if you need the part preceding the UUID. If not, don't include .*?, and you may as well write the literals for re1 and re2 together in your final Pattern. There's no indication you need DOTALL either.

private static final Pattern splitter =
  Pattern.compile("[a-f0-9]{8}(?:-[a-f0-9]{4}){4}[a-f0-9]{8}");
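
To make the grouping in that pattern easier to check, here is a minimal sketch of how it could be used. The class name UuidRegexDemo and the helper firstUuid are made up for illustration, and it assumes lower-case UUIDs as in your sample input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; only the pattern itself comes from the answer above.
class UuidRegexDemo {
    // 8 hex digits, four "-xxxx" groups, then 8 more hex digits. The last
    // "-xxxx" group plus the trailing 8 digits together form the final
    // 12-digit block of the 8-4-4-4-12 layout.
    private static final Pattern UUID_PATTERN =
        Pattern.compile("[a-f0-9]{8}(?:-[a-f0-9]{4}){4}[a-f0-9]{8}");

    // Returns the first lower-case UUID found in the input, or null if none.
    static String firstUuid(String input) {
        Matcher m = UUID_PATTERN.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;";
        System.out.println(firstUuid(input));
    }
}
```

Because find() scans for the pattern, no leading .*? is needed to skip the prefix.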

Alternatively, if you have measured your regular expression and found it too slow, you might try another approach. For example, is each UUID preceded by "uuid:" as in your example? If so, you can:

  1. find the first index of "uuid:" as i, then
  2. take the substring from 0 to i+5 (assuming you need the prefix at all), and
  3. take the substring from i+5 to i+41, which is the UUID itself (36 characters in length).
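
Those steps could be sketched roughly like this. The class and method names are hypothetical, and the length check is an addition so that short input returns null instead of throwing.

```java
// Hypothetical demo class illustrating the indexOf/substring steps above.
class UuidIndexDemo {
    private static final String MARKER = "uuid:";
    private static final int UUID_LENGTH = 36; // 32 hex digits plus 4 hyphens

    // Returns the 36 characters following the first "uuid:", or null if the
    // marker is missing or the input is too short. No hex validation is done.
    static String uuidAfterMarker(String input) {
        int i = input.indexOf(MARKER);          // step 1
        if (i < 0) {
            return null;
        }
        int start = i + MARKER.length();        // i + 5
        int end = start + UUID_LENGTH;          // i + 41
        if (end > input.length()) {
            return null;
        }
        return input.substring(start, end);     // step 3
    }

    public static void main(String[] args) {
        String input = "&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;";
        System.out.println(uuidAfterMarker(input));
    }
}
```

Step 2 is left out of the sketch because it only matters if you need the prefix.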

Along similar lines, your faster regex could be:

private static final Pattern URN_UUID_PATTERN =
    Pattern.compile("^&lt;urn:uuid:(.{36})&gt;");
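
A minimal sketch of reading the captured group from that pattern follows. The class name UrnUuidDemo and the helper extract are invented for the example. Note that .{36} accepts any 36 characters, so this extracts the UUID but does not validate it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the pattern is the one suggested above.
class UrnUuidDemo {
    private static final Pattern URN_UUID_PATTERN =
        Pattern.compile("^&lt;urn:uuid:(.{36})&gt;");

    // Returns the 36 characters between the prefix and "&gt;", or null when
    // the input does not start with the literal text "&lt;urn:uuid:".
    static String extract(String input) {
        Matcher m = URN_UUID_PATTERN.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;";
        System.out.println(extract(input));
    }
}
```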

On the other hand, if all your input strings start with those exact characters, there is no need for step 1 of the previous suggestion; just use input.substring(13, 49).
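
As a rough sketch of the fixed-offset idea: the offset 13 is the length of "&lt;urn:uuid:". Passing the result to java.util.UUID.fromString is an extra step, not part of the suggestion above, added so that malformed text is rejected. The class name FixedOffsetDemo is hypothetical.

```java
import java.util.UUID;

// Hypothetical demo class for the fixed-offset substring approach.
class FixedOffsetDemo {
    private static final int START = 13;        // length of "&lt;urn:uuid:"
    private static final int END = START + 36;  // 49, end of the UUID text

    // Assumes every input starts with exactly "&lt;urn:uuid:".
    // Throws IllegalArgumentException if the input is too short or the
    // extracted text cannot be parsed as a UUID.
    static UUID parse(String input) {
        if (input.length() < END) {
            throw new IllegalArgumentException("input too short: " + input);
        }
        return UUID.fromString(input.substring(START, END));
    }

    public static void main(String[] args) {
        String input = "&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;";
        System.out.println(parse(input));
    }
}
```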

dlamblin answered Oct 22 '22 13:10