Split string between characters with Python regex

Question

I'm trying to split the string:

> s = "Ladegårdsvej 8B7100 Vejle"

with a regex into:

[street, zip, city] = ["Ladegårdsvej 8B", "7100", "Vejle"]

s varies a lot; the only certain part is that the zip always has 4 digits and is followed by a whitespace. My idea is thus to "match from the right" on 4 digits and a whitespace, and use that match to mark where the string should be split.

Currently I'm able to get street and city like this:

> print(re.split(r"[0-9]{4}\s", s))
['Ladegårdsvej 8B', 'Vejle']

How can I split s as desired? In particular, how do I split in the middle of the string, between the house number in street and the zip, while keeping the zip in the result?

tobias_k · Accepted Answer

You can use re.split, but make the four digits a capturing group:

>>> import re
>>> s = "Ladegårdsvej 8B7100 Vejle"
>>> re.split(r"(\d{4}) ", s)
['Ladegårdsvej 8B', '7100', 'Vejle']

From the documentation for re.split:

Split string by the occurrences of pattern. If capturing parentheses are used in pattern, then the text of all groups in the pattern are also returned as part of the resulting list. If maxsplit is nonzero, at most maxsplit splits occur, and the remainder of the string is returned as the final element of the list.
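As a rough sketch of how this could be put to use, the result can be unpacked straight into three names, with maxsplit=1 so that a later "4 digits + whitespace" run in the city part does not trigger a second split. The second half is one possible take on the "match from the right" idea from the question: a greedy leading group makes the regex settle on the last such run instead of the first. The names zip_code and split_address are illustrative and not from the question, and the sketch assumes the last 4-digit run followed by whitespace is always the zip.

```python
import re

s = "Ladegårdsvej 8B7100 Vejle"

# The capturing group keeps the zip in the result list; maxsplit=1 stops
# after the first match, so exactly three parts come back.
street, zip_code, city = re.split(r"(\d{4})\s", s, maxsplit=1)


def split_address(text):
    """Split on the LAST run of 4 digits followed by whitespace.

    Hypothetical helper: the greedy (.*) consumes as much as possible,
    so the zip group ends up at the right-most candidate.
    """
    m = re.fullmatch(r"(.*)(\d{4})\s(.*)", text)
    if m is None:
        raise ValueError("no 4-digit zip followed by whitespace found")
    return m.groups()


# Here the street itself contains "1234 ", which the left-to-right
# re.split would mistake for the zip.
tricky = split_address("Hovedgaden 1234 A7100 Vejle")
```

Which variant fits depends on the data: re.split with maxsplit=1 is enough if a 4-digit number followed by a space never appears in the street part, while the greedy match is the safer choice if it can.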

Tags:

python

string

regex

split

RasmusP_963

1 Answers

tobias_k
