I need to split a string in Python without removing the delimiter.
For example:
import re
content = 'This 1 string is very big 2 i need to split it 3 into paragraph wise. 4 But this string 5 not a formated string.'
content = re.split(r'\s\d\s', content)
After this I get:
This\n
string is very big\n
i need to split it\n
into paragraph wise.\n
But this string\n
not a formated string.
But I want this:
This\n
1 string is very big\n
2 i need to split it\n
3 into paragraph wise.\n
4 But this string\n
5 not a formated string
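Use the re (regular expression) module from Python's standard library.
With re.sub you can find every match of a pattern and replace it with a string of your choice. In the replacement, \g<0> stands for the entire match (in this case the digit together with the whitespace on either side of it).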
Example:
import re
content = 'This 1 string is very big 2 i need to split it 3 into paragraph wise. 4 But this string 5 not a formated string.'
result = re.sub(r'\s\d\s', r'\n\g<0>', content)
The result is:
'This\n 1 string is very big\n 2 i need to split it\n 3 into paragraph wise.\n 4 But this string\n 5 not a formated string.'
The Python documentation for re.sub has more in-depth details.
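Because \g<0> re-inserts the whole match, every new line in the result above starts with a space. If that is unwanted, one option is to capture only the digit and the whitespace after it, and put back just that part. This is a minimal sketch, assuming single-digit markers as in the question; the capturing group and \1 are a variation on the answer above, not part of it:

```python
import re

content = ('This 1 string is very big 2 i need to split it 3 into paragraph wise. '
           '4 But this string 5 not a formated string.')

# Group 1 captures the digit and the whitespace after it; the whitespace
# before the digit is matched but not captured, so the newline replaces it.
result = re.sub(r'\s(\d\s)', r'\n\1', content)

print(result)
```

Each number now starts its own line with no leading space, which matches the layout asked for in the question.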
You could use re.split with a lookahead:
import re
re.split(r'\s(?=\d\s)', content)
resulting in:
['This', '1 string is very big', '2 i need to split it', '3 into paragraph wise.', '4 But this string', '5 not a formated string.']
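This splits on whitespace characters, but only those that are immediately followed by a digit and then another whitespace character. The lookahead checks for the digit without consuming it, so the digit stays in the result.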
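To get the line-per-paragraph output from the question, the list returned by the lookahead split can be joined with newlines. The sketch below wraps this in a helper; the name split_before_numbers is hypothetical, and the \d+ (to allow markers of more than one digit) is an assumption that goes beyond the single digits in the question:

```python
import re

def split_before_numbers(text):
    # Hypothetical helper: split at whitespace that is followed by one or
    # more digits and another whitespace character. The lookahead does not
    # consume the digits, so they stay at the start of the next piece.
    return re.split(r'\s(?=\d+\s)', text)

content = ('This 1 string is very big 2 i need to split it 3 into paragraph wise. '
           '4 But this string 5 not a formated string.')

parts = split_before_numbers(content)
joined = '\n'.join(parts)
print(joined)
```

Keeping the pieces as a list until the final join makes it easy to process each paragraph separately before printing.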