
Python Regular expression must strip whitespace except between quotes

Tags: python, regex

I need a way to remove all whitespace from a string, except when that whitespace is between quotes.

result = re.sub('".*?"', "", content)

This matches everything between quotes, but it deletes the quoted text. What I need is the opposite: leave those matches alone and remove whitespace everywhere else.

Oli asked Aug 31 '10 13:08


1 Answer

I don't think you're going to be able to do that with a single regex. One way to do it is to split the string on the quote character, apply the whitespace-stripping regex to every other item of the resulting list (the even-indexed items, which lie outside the quotes), and then re-join the list with quotes.

import re

def stripwhite(text):
    # Splitting on '"' leaves text outside quotes at even indexes
    # (assumes the quotes are balanced).
    lst = text.split('"')
    for i, item in enumerate(lst):
        if i % 2 == 0:
            lst[i] = re.sub(r"\s+", "", item)
    return '"'.join(lst)

print(stripwhite('This is a string with some "text in quotes."'))
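
As a possible alternative to the split-based function above (a different technique, not the one this answer proposes), a single pattern with alternation can be combined with a replacement function. This is only a sketch: the name stripwhite_single_pass is made up for illustration, and it assumes quoted runs use plain double quotes with no escaped quotes inside them.

```python
import re

# Hypothetical helper, not part of the original answer.
# Group 1 captures a complete quoted run; the other branch matches whitespace.
_QUOTED_OR_SPACE = re.compile(r'("[^"]*")|\s+')

def stripwhite_single_pass(text):
    # Quoted runs are put back unchanged; whitespace matches become "".
    return _QUOTED_OR_SPACE.sub(lambda m: m.group(1) or "", text)

print(stripwhite_single_pass('a b "c d" e'))
```

The two approaches agree when the quotes are balanced, but they differ on an unmatched quote: the split version leaves everything after it untouched, while this sketch still strips whitespace there because the quoted branch never matches.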
kindall answered Nov 14 '22 22:11