Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to input a regex in string.replace?

I need some help on declaring a regex. My inputs are like the following:

this is a paragraph with<[1> in between</[1> and then there are cases ... where the<[99> number ranges from 1-100</[99>.  and there are many other lines in the txt files with<[3> such tags </[3> 

The required output is:

this is a paragraph with in between and then there are cases ... where the number ranges from 1-100.  and there are many other lines in the txt files with such tags 

I've tried this:

#!/usr/bin/python import os, sys, re, glob for infile in glob.glob(os.path.join(os.getcwd(), '*.txt')):     for line in reader:          line2 = line.replace('<[1> ', '')         line = line2.replace('</[1> ', '')         line2 = line.replace('<[1>', '')         line = line2.replace('</[1>', '')                  print line 

I've also tried this (but it seems like I'm using the wrong regex syntax):

        line2 = line.replace('<[*> ', '')         line = line2.replace('</[*> ', '')         line2 = line.replace('<[*>', '')         line = line2.replace('</[*>', '') 

I dont want to hard-code the replace from 1 to 99.

like image 550
alvas Avatar asked Apr 14 '11 03:04

alvas


People also ask

How do you replace all occurrences of a regex pattern in a string?

sub() method will replace all pattern occurrences in the target string. By setting the count=1 inside a re. sub() we can replace only the first occurrence of a pattern in the target string with another string.

Can I use regex in replace Python?

Regex can be used to perform various tasks in Python. It is used to do a search and replace operations, replace patterns in text, check if a string contains the specific pattern.

Which function is used to replacing pattern in string?

The REGEXREPLACE( ) function uses a regular expression to find matching patterns in data, and replaces any matching values with a new string.

What replaces one or many matches with a string?

subn() If you want to replace a string that matches a regular expression (regex) instead of perfect match, use the sub() of the re module. In re. sub() , specify a regex pattern in the first argument, a new string in the second, and a string to be processed in the third.


2 Answers

This tested snippet should do it:

import re line = re.sub(r"</?\[\d+>", "", line) 

Edit: Here's a commented version explaining how it works:

line = re.sub(r"""   (?x) # Use free-spacing mode.   <    # Match a literal '<'   /?   # Optionally match a '/'   \[   # Match a literal '['   \d+  # Match one or more digits   >    # Match a literal '>'   """, "", line) 

Regexes are fun! But I would strongly recommend spending an hour or two studying the basics. For starters, you need to learn which characters are special: "metacharacters" which need to be escaped (i.e. with a backslash placed in front - and the rules are different inside and outside character classes.) There is an excellent online tutorial at: www.regular-expressions.info. The time you spend there will pay for itself many times over. Happy regexing!

like image 100
ridgerunner Avatar answered Oct 11 '22 06:10

ridgerunner


str.replace() does fixed replacements. Use re.sub() instead.

like image 41
Ignacio Vazquez-Abrams Avatar answered Oct 11 '22 06:10

Ignacio Vazquez-Abrams