How to remove tags from a string in python using regular expressions? (NOT in HTML)

I need to remove tags from a string in Python, for example:

<FNT name="Century Schoolbook" size="22">Title</FNT>

What is the most efficient way to remove the entire tag on both ends, leaving only "Title"? I've only seen ways to do this with HTML tags, and those haven't worked for me in Python. I need this for ArcMap, a GIS program. It has its own tags for its layout elements, and I just need to remove the tags from two specific title text elements. I believe regular expressions should work fine for this, but I'm open to other suggestions.

asked Sep 07 '10 19:09

Tanner Semerad

2 Answers

Please avoid using regex. Even though a regex will work on your simple string, you will run into problems later if you get a more complex one.

You can use BeautifulSoup's get_text() method.

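from bs4 import BeautifulSoup

text = '<FNT name="Century Schoolbook" size="22">Title</FNT>'
# Name the parser explicitly; otherwise bs4 guesses one and emits a warning
soup = BeautifulSoup(text, 'html.parser')

# get_text() returns only the text content, without the tags
print(soup.get_text())  # Title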
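
As a rough illustration of the kind of complex input this answer warns about, the sketch below runs a simple tag-stripping pattern on two strings. The pattern and the second input, which has a '>' inside a quoted attribute value, are hypothetical; nothing in the question says ArcMap ever produces such a tag.

```python
import re

# Hypothetical pattern: "<", then anything that is not ">", then ">"
TAG_RE = re.compile(r'<[^>]*>')

# The string from the question
simple = '<FNT name="Century Schoolbook" size="22">Title</FNT>'
# Made-up input with a ">" inside a quoted attribute value
tricky = '<FNT name="a>b" size="22">Title</FNT>'

simple_result = TAG_RE.sub('', simple)
tricky_result = TAG_RE.sub('', tricky)

print(simple_result)  # Title
print(tricky_result)  # b" size="22">Title
```

The first string comes out clean. In the second, the pattern treats the '>' inside the attribute value as the end of the tag, so part of the tag is left in the output.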

answered Sep 20 '22 12:09

Aminah Nuraini

This should work:

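import re
mystring = '<FNT name="Century Schoolbook" size="22">Title</FNT>'
# Delete every "<...>" run, leaving only the text between the tags
print(re.sub(r'<[^>]*>', '', mystring))  # Title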

To everyone saying that regexes are not the correct tool for the job:

The context of the problem is such that the usual objections about regular versus context-free languages do not apply. The OP's language essentially consists of three entities: a = <, b = >, and c = [^><]+. He wants to remove every occurrence of acb. This fairly directly characterizes the problem as one involving a context-free grammar, and it is not much harder to characterize it as a regular one.

I know everyone likes the "you can't parse HTML with regular expressions" answer, but the OP doesn't want to parse it; he just wants to perform a simple transformation.
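
The a, c, b description above can be written out almost literally. This is only a sketch: the names tag_pattern and strip_tags are made up for illustration, and it assumes, as the answer does, that '<' and '>' never appear inside a tag.

```python
import re

# The three entities from the description above
a = '<'
b = '>'
c = '[^><]+'

# One tag is the sequence a c b, i.e. the regular expression <[^><]+>
tag_pattern = re.compile(a + c + b)


def strip_tags(text):
    """Remove every occurrence of the a c b sequence from text."""
    return tag_pattern.sub('', text)


print(strip_tags('<FNT name="Century Schoolbook" size="22">Title</FNT>'))  # Title
```

Compiling the pattern once keeps it convenient to reuse for both title text elements.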

answered Sep 20 '22 12:09

Domenic

Tags: python, strip, arcmap
