
Python complicated regex string expansion [duplicate]

Suppose I have a string of the following form:

ABCDEF_(0-100;1)(A|B)_GHIJ_(A-F)

I want to be able to expand this to:

ABCDEF_0A_GHIJ_A
ABCDEF_1A_GHIJ_A
ABCDEF_2A_GHIJ_A
...
ABCDEF_100A_GHIJ_A

ABCDEF_0B_GHIJ_A
ABCDEF_1B_GHIJ_A
ABCDEF_2B_GHIJ_A
...
ABCDEF_100B_GHIJ_A

ABCDEF_0A_GHIJ_B
ABCDEF_1A_GHIJ_B
ABCDEF_2A_GHIJ_B
...
ABCDEF_100A_GHIJ_B

ABCDEF_0B_GHIJ_B
ABCDEF_1B_GHIJ_B
ABCDEF_2B_GHIJ_B
...
ABCDEF_100B_GHIJ_B

ABCDEF_0A_GHIJ_C
ABCDEF_1A_GHIJ_C
ABCDEF_2A_GHIJ_C
...
ABCDEF_100A_GHIJ_C

...and so on.

The pattern string at the top is shorthand for:

STRING_(START-END;INC)(A OR B)_STRING_(A THRU F)

However, the regex-like notations can appear anywhere in the string. For example, the string could also be:

ABCDEF_(A|B)_(0-100;1)_(A-F)_GHIJ

Here's what I tried so far:

trend = 'ABCDEF_(0-100;1)(A|B)_GHIJ_(A-F)'

def expandDash(trend):
    dashCount = trend.count("-")
    for dC in range(0, dashCount):
        dashIndex = trend.index("-")-1
        trendRange = trend[dashIndex:]
        bareTrend = trend[0:trend.index("(")]
        beginRange = trendRange[0:trendRange.index("-")]
        endRange = trendRange[trendRange.index("-"):trendRange.index(";")]
        trendIncrement = trendRange[-1]
        expandedTrendList = []


def regexExpand(trend):

    for regexTrend in trend.split(')'):
        if "-" in regexTrend:
            print trend
            expandDash(regexTrend)

I'm obviously stuck here...

Is there any easy way to do the string expansion using REGEX?

Mark Kennedy asked Feb 28 '26 18:02



1 Answer

You can parse your mini-expression language fairly easily with a regex, but a regex cannot do the expansion itself. That part needs ordinary code, for example a recursive generator:

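import re

# Group 1: literal prefix; groups 2-3: character range such as (A-F);
# groups 4-6: numeric range such as (0-100;1); group 7: alternation such as (A|B);
# group 8: everything after the first parenthesised token.
TREND_REGEX = re.compile(r'(^.*?)(?:\((?:([^-)])-([^)])|(\d+)-(\d+);(\d+)|([^)|]+(?:\|[^)|]+)*))\)(.*))?$')

def expand(trend):
    m = TREND_REGEX.match(trend)
    # Recursively expand whatever follows the first token.
    if m.group(8):
        suffixes = expand(m.group(8))
    else:
        suffixes = ['']
    if m.group(2):
        # Character range: walk the code points from start to end inclusive.
        for z in suffixes:
            for i in range(ord(m.group(2)), ord(m.group(3))+1):
                yield m.group(1) + chr(i) + z
    elif m.group(4):
        # Numeric range: start, end (inclusive) and increment.
        for z in suffixes:
            for i in range(int(m.group(4)), int(m.group(5))+1, int(m.group(6))):
                yield m.group(1) + str(i) + z
    elif m.group(7):
        # Alternation: one result per '|'-separated choice.
        for z in suffixes:
            for s in m.group(7).split('|'):
                yield m.group(1) + s + z
    else:
        # No token left: the string is a plain literal.
        yield trend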
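
For comparison, here is a non-recursive sketch of the same idea: split the string into literals and tokens with a regex, then combine the choices with itertools.product. It is not part of the original answer. The names TOKEN, choices and expand_product are made up for illustration, and it assumes tokens are never nested and use only the three forms shown in the question.

```python
import itertools
import re

# Hypothetical helper names. Assumes tokens are never nested and take one of
# the question's three forms: (START-END;INC), (X-Y) or (A|B|...).
TOKEN = re.compile(r'\(([^()]*)\)')


def choices(body):
    """Return the list of strings that one parenthesised token stands for."""
    m = re.fullmatch(r'(\d+)-(\d+);(\d+)', body)
    if m:
        start, end, step = map(int, m.groups())
        return [str(i) for i in range(start, end + 1, step)]
    m = re.fullmatch(r'([^-|])-([^-|])', body)
    if m:
        return [chr(c) for c in range(ord(m.group(1)), ord(m.group(2)) + 1)]
    return body.split('|')


def expand_product(trend):
    # Splitting on a pattern with one capturing group yields
    # literal, token, literal, token, ..., literal.
    parts = TOKEN.split(trend)
    pools = [choices(p) if i % 2 else [p] for i, p in enumerate(parts)]
    # product() varies its last pool fastest, so reverse the pools to make
    # the leftmost token vary fastest, as in the question's listing.
    for combo in itertools.product(*reversed(pools)):
        yield ''.join(reversed(combo))


if __name__ == '__main__':
    for line in expand_product('X_(A|B)_(0-4;2)'):
        print(line)
```

Reversing the pools is only there to reproduce the ordering in the question. If the order does not matter, itertools.product(*pools) with a plain ''.join(combo) is enough.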
pobrelkey answered Mar 03 '26 07:03



