
python regex split on repeating character

I have a string for example

--------------------------------
hello world !
--------------------------------
world hello !
--------------------------------
! hello world

and I want to split it on the lines of hyphens. The hyphen runs can vary in length, which is why I decided to use a regex. The result I want is ['hello world !', 'world hello !', '! hello world']. Splitting on a fixed number of hyphens works, but I'm not sure how to handle runs of variable length. I have tried:

re.split(r'\-{3,}', str1)

However, that did not seem to work.

asked Mar 03 '26 06:03 by Johnathon64


1 Answer

You can trim the leading and trailing hyphens and line breaks from the input with .strip("-\n\r"), and then remove the leftover whitespace from each resulting chunk with .strip():

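import re
# (?m) makes ^ and $ match at line boundaries, so only hyphen-only lines act as separators
p = re.compile(r'(?m)^-{3,}$')
t = "--------------------------------\nhello world !\n--------------------------------\nworld hello !\n--------------------------------\n! hello world"
# Trim leading/trailing hyphens and line breaks, split, then strip each chunk
result = [x.strip() for x in p.split(t.strip("-\n\r"))]
print(result)  # ['hello world !', 'world hello !', '! hello world']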

As for the regex, I suggest matching only the hyphen-only lines with (?m)^-{3,}$, which matches 3 or more hyphens between the start of a line (^) and the end of a line ($). Because of (?m), these anchors match at line boundaries, not just at the string boundaries.
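
To see why the anchors and the (?m) flag matter, here is a small sketch that compares three variants on the sample string. The variable names (text, unanchored, no_multiline, result) are only illustrative. It uses flags=re.MULTILINE as the equivalent of the inline (?m) flag, and it drops empty chunks with a filter, which is an assumed alternative to the .strip("-\n\r") call on the input:

```python
import re

text = (
    "--------------------------------\n"
    "hello world !\n"
    "--------------------------------\n"
    "world hello !\n"
    "--------------------------------\n"
    "! hello world"
)

# The pattern from the question (the backslash before "-" is optional):
# it does split, but it leaves an empty leading chunk and the line breaks.
unanchored = re.split(r"-{3,}", text)

# Anchored pattern WITHOUT multiline mode: ^ and $ only match at the
# string boundaries, so no separator line matches and nothing is split.
no_multiline = re.split(r"^-{3,}$", text)

# Anchored pattern WITH multiline mode: ^ and $ match at every line
# boundary. Strip each chunk, then drop the empty ones (assumed variant).
chunks = [c.strip() for c in re.split(r"^-{3,}$", text, flags=re.MULTILINE)]
result = [c for c in chunks if c]

print(unanchored)         # ['', '\nhello world !\n', '\nworld hello !\n', '\n! hello world']
print(len(no_multiline))  # 1
print(result)             # ['hello world !', 'world hello !', '! hello world']
```

The first result suggests the original attempt was splitting, just not cleanly: the empty first element and the surrounding line breaks are what the .strip() calls take care of.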


answered Mar 05 '26 02:03 by Wiktor Stribiżew


