
Regular expression to extract text in reverse order until 3rd instance of a character

Tags: java, regex

I have a string in the format XXXX_YYYY_YYYYYYY_YYYYYYZZZZ.

How can I extract the text from the end of the string backwards, until the third _ (underscore) is hit? Expected value: YYYY_YYYYYYY_YYYYYYZZZZ

I tried ((?:_[^_]*){3})$ and it seems to work, but the match includes an extra _ at the beginning, which I could probably remove in Java.

Is there a way to get the match without the leading _?

DKG asked Dec 18 '15 08:12


4 Answers

This one should suit your needs:

[^_]+(?:_[^_]+){2}$
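A minimal sketch of how this pattern could be applied in Java. The class name LastThreeSegments and the helper lastThree are hypothetical names chosen for illustration, and returning null when nothing matches is an assumption, not part of the answer:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LastThreeSegments {
    // One non-empty segment, then two "_segment" groups, anchored at the end of the input.
    private static final Pattern LAST_THREE = Pattern.compile("[^_]+(?:_[^_]+){2}$");

    // Hypothetical helper: returns the last three non-empty segments,
    // or null when the input does not end with three such segments.
    public static String lastThree(String input) {
        Matcher m = LAST_THREE.matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(lastThree("XXXX_YYYY_YYYYYYY_YYYYYYZZZZ")); // YYYY_YYYYYYY_YYYYYYZZZZ
        System.out.println(lastThree("XX_YYY"));                       // null
    }
}
```

Because every segment uses [^_]+, inputs with empty trailing segments (for example a doubled __ among the last three) do not match at all.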


sp00m answered Oct 21 '22 20:10


Like this:

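        import java.util.regex.Matcher;
        import java.util.regex.Pattern;

        String line = "XXXX_YYYY_YYYYYYY_YYYYYYZZZZ";

        // One segment without a leading _, then two "_segment" groups, anchored at the end
        Pattern p = Pattern.compile("([^_]+(?:_[^_]*){2})$");
        Matcher m = p.matcher(line);
        if (m.find()) {
            System.out.println(m.group(1)); // YYYY_YYYYYYY_YYYYYYZZZZ
        }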

Simply split your three repetitions ({3}) into one segment without a leading _, followed by two segments that require it.

Jan answered Oct 21 '22 21:10


A non-regex approach is also possible:

import java.util.Arrays;
import java.util.List;

String s = "XXXX_YYYY_YYYYYYY_YYYYYYZZZZ";
List<String> r = Arrays.asList(s.split("_"));       // Split by _ and get a List
r = r.subList(Math.max(r.size() - 3, 0), r.size()); // Grab last 3 elements
System.out.println(String.join("_", r));            // Join them with _
// => YYYY_YYYYYYY_YYYYYYZZZZ


If there are fewer than three elements after splitting, all of them are joined (e.g. XX_YYY stays XX_YYY).
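The short-input behaviour can be checked with a small sketch. The class LastSegmentsDemo and the method lastSegments, with its count parameter n, are hypothetical names introduced here for illustration:

```java
import java.util.Arrays;
import java.util.List;

public class LastSegmentsDemo {
    // Hypothetical helper: joins the last n underscore-separated segments of s.
    // If s has fewer than n segments, all of them are kept.
    public static String lastSegments(String s, int n) {
        List<String> parts = Arrays.asList(s.split("_"));
        int from = Math.max(parts.size() - n, 0); // clamp so subList never gets a negative index
        return String.join("_", parts.subList(from, parts.size()));
    }

    public static void main(String[] args) {
        System.out.println(lastSegments("XXXX_YYYY_YYYYYYY_YYYYYYZZZZ", 3)); // YYYY_YYYYYYY_YYYYYYZZZZ
        System.out.println(lastSegments("XX_YYY", 3));                       // XX_YYY
    }
}
```

One thing to keep in mind: String.split with a single argument drops trailing empty strings, so an input ending in _ (such as a_b_c_d_) is treated as if the final underscore were absent.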

Wiktor Stribiżew answered Oct 21 '22 22:10


If you reverse the string first, then you can get away with a very simple regex of (.*)(_.*):

String input = "XXXX_YYYY_YYYYYYY_YYYYYYZZZZ";
input = new StringBuilder(input).reverse().toString().replaceAll("(.*)(_.*)", "$1");
input = new StringBuilder(input).reverse().toString();
System.out.println(input);

Output:

YYYY_YYYYYYY_YYYYYYZZZZ
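
A caveat worth checking before relying on this: the greedy (.*) on the reversed string removes only what was the first segment of the original, so the result has three segments only when the input has exactly four. The sketch below wraps the same calls in a hypothetical helper, reverseTrim (the class name ReverseTrimDemo is also made up for illustration), and runs it on a five-segment input:

```java
public class ReverseTrimDemo {
    // Hypothetical helper wrapping the reverse / replaceAll / reverse steps.
    public static String reverseTrim(String input) {
        String reversed = new StringBuilder(input).reverse().toString();
        // Greedy (.*) runs up to the LAST underscore of the reversed string,
        // i.e. the FIRST underscore of the original input.
        String trimmed = reversed.replaceAll("(.*)(_.*)", "$1");
        return new StringBuilder(trimmed).reverse().toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseTrim("XXXX_YYYY_YYYYYYY_YYYYYYZZZZ")); // YYYY_YYYYYYY_YYYYYYZZZZ
        System.out.println(reverseTrim("A_B_C_D_E"));                    // B_C_D_E (four segments kept)
    }
}
```

If the number of segments can vary, the anchored patterns or the split-based approach from the other answers keep exactly the last three.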

Tim Biegeleisen answered Oct 21 '22 21:10