
Regular expression for a JIRA identifier

I'm trying to extract a JIRA identifier from a line of text.

JIRA identifiers are of the form [A-Z]+-[0-9]+. I have the following pattern:

foreach my $line ( @textBlock ) {
    my ( $id ) = ( $line =~ /[\s|]?([A-Z]+-[0-9]+)[\s:|]?/ );
    push @jiraIDs, $id if ( defined $id && $id !~ /^$/ );
}

This doesn't cope when the identifier is embedded inside a longer string: for example, blah_blah_ABC-123 still matches on ABC-123. I don't want to require a space or other delimiter in front of the match, as that would fail when the identifier is at the start of the line.

Can anyone suggest the necessary runes?

Thanks.

DaveG asked Oct 11 '13 16:10


2 Answers

Official JIRA ID Regex (Java):

Atlassian themselves have a couple of web pages that suggest this as a good (Java) regex:

((?<!([A-Z]{1,10})-?)[A-Z]+-\d+)

(Source: https://confluence.atlassian.com/display/STASHKB/Integrating+with+custom+JIRA+issue+key)

Test String:
"BF-18 abc-123 X-88 ABCDEFGHIJKL-999 abc XY-Z-333 abcDEF-33 ABC-1"

Matches:
BF-18, X-88, ABCDEFGHIJKL-999, DEF-33, ABC-1

Improved JIRA ID Regex (Java):

I don't really like it, though, because it matches the "DEF-33" in "abcDEF-33", whereas I would prefer to ignore "abcDEF-33" altogether. So in my own code I'm using:

((?<!([A-Za-z]{1,10})-?)[A-Z]+-\d+)

Notice how "DEF-33" is no longer matched:

Test String:
"BF-18 abc-123 X-88 ABCDEFGHIJKL-999 abc XY-Z-333 abcDEF-33 ABC-1"

Matches:
BF-18, X-88, ABCDEFGHIJKL-999, ABC-1

Improved JIRA ID Regex (JavaScript):

I also needed this regex in JavaScript. Unfortunately, JavaScript did not support lookbehind, (?<!a)b, at the time (it was only added in ES2018), so I had to port it to lookahead, a(?!b), and reverse everything:

var jira_matcher = /\d+-[A-Z]+(?!-?[a-zA-Z]{1,10})/g

This means the string to be matched needs to be reversed ahead of time, too:

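// Helper: reverse a string (sufficient for the ASCII text used here)
function reverse(str) {
    return str.split("").reverse().join("");
}

var s = "BF-18 abc-123 X-88 ABCDEFGHIJKL-999 abc XY-Z-333 abcDEF-33 ABC-1";
s = reverse(s);
var m = s.match(jira_matcher) || [];  // match() returns null when nothing matches

// Also need to reverse all the results!
for (var i = 0; i < m.length; i++) {
    m[i] = reverse(m[i]);
}
m.reverse();
console.log(m);

// Output:
// [ 'BF-18', 'X-88', 'ABCDEFGHIJKL-999', 'ABC-1' ]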
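
On engines that implement ES2018 lookbehind assertions, the reversal trick may be unnecessary, since the improved Java pattern can be used almost verbatim. The following is only a sketch under that assumption. The helper name extractJiraIds is hypothetical and not part of the original answer.

```javascript
// Assumes a JavaScript engine with ES2018 lookbehind support.
// extractJiraIds is a hypothetical helper name used for illustration.
function extractJiraIds(text) {
    // Same idea as the improved Java regex: reject a candidate that is
    // immediately preceded by letters (optionally followed by a hyphen).
    const jiraMatcher = /(?<![A-Za-z]{1,10}-?)[A-Z]+-\d+/g;
    // match() returns null when nothing matches, so fall back to [].
    return text.match(jiraMatcher) || [];
}

const sample = "BF-18 abc-123 X-88 ABCDEFGHIJKL-999 abc XY-Z-333 abcDEF-33 ABC-1";
console.log(extractJiraIds(sample));
// [ 'BF-18', 'X-88', 'ABCDEFGHIJKL-999', 'ABC-1' ]
```

Like the Java pattern it mirrors, this only guards against preceding letters. An identifier preceded by an underscore or a digit, as in blah_blah_ABC-123, is still matched.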
Julius Musseau answered Oct 05 '22 23:10


Using alternation, you can make sure that the character before your pattern is either whitespace or the beginning of the string. Similarly, make sure that it is followed by either whitespace or the end of the string.

You can use this regex:

my ( $id ) = ( $line =~ /(?:\s|^)([A-Z]+-[0-9]+)(?=\s|$)/ );
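
As a rough illustration of how this whitespace-delimited pattern behaves, here is a sketch of the same regex in JavaScript, chosen only so that it is easy to run. The helper name findDelimitedIds and the sample inputs are assumptions for illustration, not part of the answer.

```javascript
// findDelimitedIds is a hypothetical helper used for illustration only.
function findDelimitedIds(line) {
    // Preceded by whitespace or start of string; followed by whitespace or end.
    const pattern = /(?:\s|^)([A-Z]+-[0-9]+)(?=\s|$)/g;
    const ids = [];
    let m;
    while ((m = pattern.exec(line)) !== null) {
        ids.push(m[1]); // group 1 is the identifier without the leading whitespace
    }
    return ids;
}

console.log(findDelimitedIds("ABC-123 fixes blah_blah_ABC-456")); // [ 'ABC-123' ]
console.log(findDelimitedIds("see XY-9 and ABC-1"));              // [ 'XY-9', 'ABC-1' ]
console.log(findDelimitedIds("ABC-123: fix typo"));               // []
```

The last call shows a trade-off: the trailing ':' that the question's original pattern tolerated is not accepted here. If such delimiters should count, the lookahead could be widened, for example to (?=[\s:|]|$).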
Rohit Jain answered Oct 05 '22 23:10