
Remove characters from beginning and end or only end of line

Tags: python, regex

I want to remove some symbols from a string using a regular expression, for example:

== (that occur both at the beginning and at the end of a line),

* (at the beginning of a line ONLY).

def some_func():
    clean = re.sub(r'= {2,}', '', clean) #Removes 2 or more occurrences of = at the beg and at the end of a line.
    clean = re.sub(r'^\* {1,}', '', clean) #Removes 1 or more occurrences of * at the beginning of a line.

What's wrong with my code? It seems the expressions are wrong. How do I remove a character/symbol if it's at the beginning or at the end of a line (with one or more occurrences)?

Gusto asked Nov 06 '10 15:11


1 Answer

If you only want to remove characters from the beginning and the end, you could use the str.strip() method. That would give code like this:

>>> s1 = '== foo bar =='
>>> s1.strip('=')
' foo bar '
>>> s2 = '* foo bar'
>>> s2.lstrip('*')
' foo bar'

The strip method removes the characters given in the argument from the beginning and the end of the string, lstrip removes them only from the beginning, and rstrip removes them only from the end.
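As a small side-by-side sketch of the three methods (the sample string and variable names here are made up for illustration), note that characters in the middle of the string are never touched:

```python
s = '==a=b=='

both = s.strip('=')    # trims '=' from both ends
left = s.lstrip('=')   # trims '=' from the left end only
right = s.rstrip('=')  # trims '=' from the right end only

print(repr(both))   # 'a=b'
print(repr(left))   # 'a=b=='
print(repr(right))  # '==a=b'
```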

If you really want to use regular expressions, they would look something like this:

clean = re.sub(r'(^={2,})|(={2,}$)', '', clean)
clean = re.sub(r'^\*+', '', clean)
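Two details may be worth spelling out here. In the question's patterns, the space before the quantifier means `{2,}` applies to the space rather than to `=`, which is probably why nothing matched. Also, `^` and `$` only anchor at the start and end of the whole string by default; if `clean` holds several lines, `re.MULTILINE` makes them match on every line. A minimal sketch, assuming a made-up three-line sample text:

```python
import re

text = "== Heading ==\n* item one\nplain = text"

# r'= {2,}' means '=' followed by two or more spaces, so it finds nothing here.
question_pattern_found = re.search(r'= {2,}', '== foo ==') is not None

# re.MULTILINE makes ^ and $ match at the start and end of every line.
cleaned = re.sub(r'^={2,}|={2,}$', '', text, flags=re.MULTILINE)
cleaned = re.sub(r'^\*+', '', cleaned, flags=re.MULTILINE)

print(question_pattern_found)  # False
print(repr(cleaned))           # ' Heading \n item one\nplain = text'
```

The single `=` inside the last line is left alone because the pattern requires at least two consecutive `=` at a line boundary.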

But IMHO, using strip/lstrip/rstrip would be the most appropriate for what you want to do.

Edit: On Nick's suggestion, here is a solution that would do all this in one line:

clean = clean.lstrip('*').strip('= ')

(A common mistake is to think that these methods remove the characters in the order they're given in the argument. In fact, the argument is just a set of characters to remove, whatever their order. That's why .strip('= ') removes every '=' and ' ' from the beginning and the end, and not just the string '= '.)
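To make the difference concrete, here is a short sketch with made-up sample strings. It contrasts stripping a set of characters with removing one exact prefix; the `startswith` slicing is just one possible way to do the latter, not something from the answer itself:

```python
s = '= == title = ='

# The argument is a set of characters, so its order does not matter.
stripped_a = s.strip('= ')
stripped_b = s.strip(' =')

# Removing the literal prefix '= ' exactly once needs a different approach.
prefix = '= '
t = '= = foo'
exact = t[len(prefix):] if t.startswith(prefix) else t

print(repr(stripped_a))  # 'title'
print(repr(stripped_b))  # 'title'
print(repr(exact))       # '= foo'
```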

mdeous answered Oct 13 '22 13:10