
Replacing a RegEx with a string of characters with the same length

Tags: python, regex

I want to replace each XML tag with a run of a repeated character that is exactly as long as the tag itself.

For example:

<o:LastSaved>2013-01-21T21:15:00Z</o:LastSaved>

I want to replace it with:

#############2013-01-21T21:15:00Z##############

How can we use RegEx for this?

asked Jan 21 '13 21:01 by hmghaly



1 Answer

re.sub accepts a function as the replacement argument:

re.sub(pattern, repl, string, count=0, flags=0)

If repl is a function, it is called for every non-overlapping occurrence of pattern. The function takes a single match object argument, and returns the replacement string.
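To make the callback behaviour concrete, here is a minimal sketch (the name show_length and the sample string are made up for illustration). The function gets the match object, and whatever string it returns is substituted for that match:

```python
import re

def show_length(match):
    # match.group() is the full text matched by the pattern
    return str(len(match.group()))

# Each run of digits is replaced by the number of digits in it
result = re.sub(r'\d+', show_length, 'a1 bb22 ccc333')
print(result)
```

Because the function sees each match separately, the replacement can depend on the matched text, which a plain replacement string cannot do.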

Here's an example:

In [1]: import re

In [2]: def repl(m):
   ...:     return '#' * len(m.group())
   ...: 

In [3]: re.sub(r'<[^<>]*?>', repl,
   ...:     '<o:LastSaved>2013-01-21T21:15:00Z</o:LastSaved>')
Out[3]: '#############2013-01-21T21:15:00Z##############'

The pattern I used may need some polishing; I'm not sure what the canonical way to match XML tags is. But you get the idea.
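If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: mask_tags and its fill parameter are names I made up, and the pattern is the same simple one as above, so it will not cope with a '>' inside an attribute value, comments, or CDATA sections.

```python
import re

# Same simple tag pattern as above, compiled once; not a full XML parser
TAG_RE = re.compile(r'<[^<>]*?>')

def mask_tags(text, fill='#'):
    """Replace every tag-like match with `fill` repeated to the same length."""
    if len(fill) != 1:
        raise ValueError('fill must be a single character')
    # The lambda receives the match object and returns a same-length string
    return TAG_RE.sub(lambda m: fill * len(m.group()), text)

sample = '<o:LastSaved>2013-01-21T21:15:00Z</o:LastSaved>'
masked = mask_tags(sample)
print(masked)
```

Since each match is replaced by a string of the same length, the overall string length and the offsets of the remaining text stay unchanged.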

answered Oct 02 '22 01:10 by Lev Levitsky