
Is there a way to split a string on delimiters including colon(:) except when it involves time?

I am trying to split the string below on a number of delimiters including \n, comma(,), and colon(:) except when the colon is part of a time value. Below is my string:

values = 'City:hell\nCountry:rome\nUpdate date: 2022-09-26 00:00:00'

I have tried:

result = re.split(':|,|\n', values)

However, this also splits the time value, resulting in:

['City','hell','Country','rome','Update date',' 2022-09-26 00','00','00']

Whereas the expected outcome is

['City','hell','Country','rome','Update date', '2022-09-26 00:00:00']

Any help/assistance will be appreciated

Mustapha Unubi Momoh asked Sep 06 '25 03:09


2 Answers

You could use a negative look-behind to ensure that what comes before the : is not a pair of digits:

re.split(r'(?<![0-9]{2}):\s*|,|\n', values)

It separates by

  • colons, with optional trailing whitespace, when they are not preceded by a pair of digits
  • ,
  • \n
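
A quick way to check these three rules is to run the pattern on a few small inputs. The sample strings below are made up for illustration and do not come from the question. The last one shows a limitation worth knowing about: a key that itself ends in two digits is not split either.

```python
import re

# Pattern from the answer, written as a raw string.
pattern = r'(?<![0-9]{2}):\s*|,|\n'

# Hypothetical inputs, not taken from the question.
print(re.split(pattern, 'a: b,c\nd'))     # colon + space, comma and newline all split
print(re.split(pattern, 'at 12:30:45'))   # colons after two digits are kept
print(re.split(pattern, 'room12:blue'))   # key ends in two digits, so no split here
```

If keys like the last one can occur in your data, splitting each line on its first colon only may be the safer approach.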

So a bare : is a separator (when not preceded by a pair of digits), and so is a : followed by one or more whitespace characters (again, when not preceded by a pair of digits). The consequence is that if there is a space after a colon, as is the case in your string, that space is not included in the next field, since it is part of the separator rather than of a field.

Or, you could also keep the first version of my answer (without \s*) and just .strip() the fields.
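
For comparison, here is a minimal sketch of both variants side by side, using the string from the question. The variable names `with_ws` and `stripped` are only illustrative.

```python
import re

values = 'City:hell\nCountry:rome\nUpdate date: 2022-09-26 00:00:00'

# Variant 1: whitespace after the colon is part of the separator.
with_ws = re.split(r'(?<![0-9]{2}):\s*|,|\n', values)

# Variant 2: plain colon separator, then strip each field afterwards.
stripped = [field.strip() for field in re.split(r'(?<![0-9]{2}):|,|\n', values)]

print(with_ws)
print(stripped)
```

Both give the expected list for this input. One difference to be aware of: \s also matches a newline, so with the first variant a key with an empty value (for example 'City:\nCountry:rome') loses its empty field, whereas the second variant keeps it as ''.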

chrslg answered Sep 07 '25 16:09


Solution without re:

values = "City:hell\nCountry:rome\nUpdate date: 2022-09-26 00:00:00"

out = [
    field.strip()
    # Split each line on the first colon only, so the colons in the time are kept.
    for parts in (line.split(":", maxsplit=1) for line in values.splitlines())
    for field in parts
]
print(out)

Prints:

['City', 'hell', 'Country', 'rome', 'Update date', '2022-09-26 00:00:00']
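
The question also lists the comma as a delimiter, which the snippet above does not handle. Below is a minimal sketch of one way to extend it, assuming commas separate whole key:value pairs in the same way newlines do. The question does not show such input, so the sample string here is hypothetical.

```python
# Hypothetical input: a comma separates pairs just like a newline does.
values = "City:hell,Country:rome\nUpdate date: 2022-09-26 00:00:00"

out = [
    field.strip()
    # Turn commas into newlines so splitlines() sees one pair per line.
    for line in values.replace(",", "\n").splitlines()
    # Split on the first colon only, so the colons in the time survive.
    for field in line.split(":", maxsplit=1)
]
print(out)
```

This would not be suitable if a value can itself contain a comma.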

Andrej Kesely answered Sep 07 '25 16:09