
Reorder string using regular expressions

I want to move the first occurrence of a date (or, more generally, the first match of a regular expression) to the beginning of my text:

Example: "I went out on 1 sep 2012 and it was better than 15 jan 2012" and I want to get "1 sep 2012, I went out on and it was better than 15 jan 2012"

I was thinking about replacing "1 sep 2012" with ",1 sep 2012," and then cutting the string from "," but I don't know what to write instead of replace_with:

line = re.sub(r'\d+\s(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\s\d{4}', 'replace_with', line, 1)

Any help?

Mor Brb asked Jan 04 '13 08:01


1 Answer

Use capture groups:

>>> import re
>>> s = "I went out on 1 sep 2012 and it was better than 15 jan 2012"
>>> r = re.compile('(^.*)(1 sep 2012 )(.*$)')
>>> r.sub(r'\2\1\3',s)
'1 sep 2012 I went out on and it was better than 15 jan 2012'

Brackets capture parts of the string:

(^.*)          # Capture everything from the start of the string...
(1 sep 2012 )  # ...up to the part we are interested in (also captured)
(.*$)          # Capture everything else, to the end of the string

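Then just reorder the capture groups in the substitution: \2\1\3. Write the replacement as a raw string, r'\2\1\3' (or double the backslashes); in an ordinary string literal Python reads \2 as an octal escape instead of a group reference. Note that (^.*) is greedy: if the pattern in group 2 can match at more than one place in the string, the last such place wins, so use (^.*?) when you need the first occurrence. The second group in my example is just the literal string (1 sep 2012 ), but of course this can be any regexp, such as the one you created (with an extra \s on the end):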

(\d+\s(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\s\d{4}\s)

>>> r = re.compile(r'(^.*)(\d+\s(?:aug|sep|oct|nov)\s\d{4}\s)(.*$)')
>>> r.sub(r'\2\1\3',s)
'1 sep 2012 I went out on and it was better than 15 jan 2012'
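The same capture-group reordering can be sketched for the general case. This is only an illustration under a few assumptions: `date_first` and `DATE` are hypothetical names, the month list is the one from the question, the leading group is made lazy so that the first date is the one moved, and the comma from the question's expected output is added in the replacement.

```python
import re

# The date pattern is the one from the question.
DATE = r'\d+\s(?:jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\s\d{4}'

def date_first(line):
    # ^(.*?) is lazy, so group 2 lands on the first date, not the last.
    # \s* swallows the whitespace after the date so no double space remains.
    pattern = r'^(.*?)(' + DATE + r')\s*'
    # Put the date (group 2) first, then a comma, then the leading text.
    return re.sub(pattern, r'\2, \1', line, count=1)

s = "I went out on 1 sep 2012 and it was better than 15 jan 2012"
print(date_first(s))
# 1 sep 2012, I went out on and it was better than 15 jan 2012
```

If the line contains no date, the pattern does not match and `re.sub` returns the line unchanged.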

From docs.python.org:

When an 'r' or 'R' prefix is present, a character following a backslash is included in the string without change.
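A small check of what the prefix changes in practice, reusing the pattern and string from the example above (the variable names `raw`, `escaped` and `plain` are only for illustration):

```python
import re

s = "I went out on 1 sep 2012 and it was better than 15 jan 2012"
r = re.compile(r'(^.*)(1 sep 2012 )(.*$)')

# Without the r prefix, '\2' is the single control character chr(2);
# with it, r'\2' is two characters: a backslash and a '2'.
assert '\2' == chr(2)
assert len(r'\2') == 2

raw = r.sub(r'\2\1\3', s)        # group references, as intended
escaped = r.sub('\\2\\1\\3', s)  # same replacement with doubled backslashes
plain = r.sub('\2\1\3', s)       # control characters, not group references

print(raw == escaped)  # True
print(repr(plain))     # '\x02\x01\x03'
```

So a raw string is the convenient way to write the replacement, and doubling the backslashes is the equivalent without the prefix.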

Chris Seymour answered Oct 19 '22 01:10