How to replace pairs of tokens in a string?

I'm new to Python but competent in a few other languages, and I can't see a 'snazzy' way of doing the following. I'm sure it's screaming out for a regex, but every solution I can come up with (using regex groups and whatnot) becomes insane quite quickly.

So, I have a string with html-like tags that I want to replace with actual html tags.

For example:

Hello, my name is /bJane/b.

Should become:

Hello, my name is <b>Jane</b>.

It might be combo'd with [i]talic and [u]nderline as well:

/iHello/i, my /uname/u is /b/i/uJane/b/i/u.

Should become:

<i>Hello</i>, my <u>name</u> is <b><i><u>Jane</b></i></u>.

Obviously a straight str.replace won't work, because every second occurrence of a token has to become a closing tag, with the forward slash inside the angle brackets.

For clarity, if tokens are being combo'd, it's always first opened, first closed.

Many thanks!

PS: Before anybody gets excited, I know that this sort of thing should be done with CSS, blah, blah, blah, but I didn't write the software, I'm just reversing its output!

Bridgey asked Mar 16 '11 20:03


1 Answer

Maybe something like this can help:

import re


def text2html(text):
    """ Convert a text in a certain format to html.

    Examples:
    >>> text2html('Hello, my name is /bJane/b')
    'Hello, my name is <b>Jane</b>'
    >>> text2html('/iHello/i, my /uname/u is /b/i/uJane/u/i/b')
    '<i>Hello</i>, my <u>name</u> is <b><i><u>Jane</u></i></b>'

    """

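    # Tokens that are currently open; seeing one again closes its tag.
    elem = []

    def to_tag(match_obj):
        match = match_obj.group(0)  # the matched token, e.g. '/b'
        if match in elem:
            # Second occurrence: emit the closing tag.
            elem.remove(match)
            return "</{0}>".format(match[1])
        else:
            # First occurrence: emit the opening tag.
            elem.append(match)
            return "<{0}>".format(match[1])

    # Note: r'/.' matches a slash followed by ANY character, not only b, i and u.
    return re.sub(r'/.', to_tag, text)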

if __name__ == "__main__":
    import doctest
    doctest.testmod()
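
One caveat with the function above: the pattern r'/.' treats a slash followed by any character as a token, so ordinary text such as "and/or" would be rewritten too. Below is a minimal sketch of a stricter variant, assuming the only tokens in the input are /b, /i and /u as in the question. The names tokens_to_html and TAGS are made up for this illustration and are not part of the original answer.

```python
import re

# Assumption: only bold, italic and underline tokens occur in the input.
TAGS = "biu"


def tokens_to_html(text, tags=TAGS):
    """Turn toggle tokens such as /b into <b> on first sight and </b> on the next."""
    open_tags = set()  # tag letters that have been opened but not yet closed
    # Match a slash followed by one of the allowed tag letters only.
    pattern = re.compile("/([{0}])".format(re.escape(tags)))

    def to_tag(match):
        name = match.group(1)
        if name in open_tags:
            # Token seen before: close the tag.
            open_tags.remove(name)
            return "</{0}>".format(name)
        # First occurrence: open the tag.
        open_tags.add(name)
        return "<{0}>".format(name)

    return pattern.sub(to_tag, text)


if __name__ == "__main__":
    print(tokens_to_html("/iHello/i, my /uname/u is /b/i/uJane/b/i/u."))
```

Because each token is toggled independently, the closing tags come out in whatever order the input closes them, which gives the "first opened, first closed" output from the question. This is still a heuristic: a literal "/b", "/i" or "/u" in ordinary text (for example the start of "/index") would still be converted.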
mouad answered Oct 13 '22 08:10