
Using a regular expression to split a string on multiple spaces

I'm trying to split a string that is delimited by multiple spaces, e.g.:

    import re
    string1 = "abcd    efgh   a. abcd   b efgh"
    print(re.findall(r"[\w.]+", string1))

As expected, the result is:

    ['abcd', 'efgh', 'a.', 'abcd', 'b', 'efgh']

However, I would like to group 'a.' and 'abcd' into the same group, and 'b' and 'efgh' into the same group. So the result I want would look something like:

    ['abcd', 'efgh', 'a. abcd', 'b efgh']

My approach at the moment is to write two expressions: the first to match the tokens without a space, i.e. 'abcd' and 'efgh', and the second to match the tokens that contain a single space, i.e. 'a.' + 'abcd'.

So r'[\w]+' can deal with the first type and r'[\w]+ [\w]+' can deal with the second type, but I don't know how to combine them into a single expression using '|'.

As always, any other approaches are welcome. And thanks for your time!

jshen asked Dec 14 '25 12:12



1 Answer

    result = [s.strip() for s in string1.split('  ') if s.strip()]

This splits on two spaces, then uses strip to remove the leftover spaces from each piece and to drop the pieces that are empty.
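Since the question asks about regular expressions, here is a minimal sketch of two regex-based variants, assuming the same string1 as in the question. The names by_split and by_findall are only illustrative. The first splits on runs of two or more whitespace characters; the second avoids '|' altogether by matching a token followed by zero or more single-space-separated tokens.

```python
import re

string1 = "abcd    efgh   a. abcd   b efgh"

# Variant 1: split wherever two or more whitespace characters occur in a row.
# strip() first so leading/trailing whitespace does not produce empty pieces.
by_split = re.split(r"\s{2,}", string1.strip())

# Variant 2: match a run of word characters/dots, optionally followed by
# further runs that are each separated by exactly one space.
by_findall = re.findall(r"[\w.]+(?: [\w.]+)*", string1)

print(by_split)
print(by_findall)
```

Both variants should print ['abcd', 'efgh', 'a. abcd', 'b efgh'] for this input. Note that the findall pattern only keeps word characters and dots, so other punctuation in the input would be dropped, whereas the split version keeps everything between the wide gaps.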

James Little answered Dec 19 '25 07:12



