Convert string from xmlcharrefreplace back to utf-8

Tags:

I've next part of code:

In [8]: st = u"опа"

In [11]: st.encode("ascii", "xmlcharrefreplace")
Out[11]: '&#1086;&#1087;&#1072;'

In [14]: st1 = st.encode("ascii", "xmlcharrefreplace")

In [15]: st1.decode("ascii", "xmlcharrefreplace")
Out[15]: u'&#1086;&#1087;&#1072;'

In [16]: st1.decode("utf-8", "xmlcharrefreplace")
Out[16]: u'&#1086;&#1087;&#1072;'

Do you have any idea how to convert st1 back to u"опа"?

946

asked Jun 27 '13 11:06

Tural Gurbanov

1 Answers

Use the html.unescape() function (Python 3.4 and newer):

>>> import html
>>> html.unescape('&#1086;&#1087;&#1072;')
'опа'

On older versions (including Python 2), you’d have to use an instance of HTMLParser.HTMLParser():

>>> from HTMLParser import HTMLParser
>>> parser = HTMLParser()
>>> parser.unescape('&#1086;&#1087;&#1072;')
u'\u043e\u043f\u0430'
>>> print parser.unescape('&#1086;&#1087;&#1072;')
опа

100

answered Sep 30 '22 05:09

Martijn Pieters

Related questions
                            
                                Python - Group by and sum a list of tuples
                            
                                Data structure to implement a dictionary with multiple indexes?
                            
                                Change operator precedence in Python
                            
                                Start IDLE with python 3 on Linux (python 2.7 installed alongside)
                            
                                Django MySQL distinct query for getting multiple values
                            
                                Can't get sphinx to link under toctree to another document
                            
                                Quickly applying string operations in a pandas DataFrame
                            
                                How do I constrain my python script to only accepting one argument? (argparse)
                            
                                How to display html using QWebView. Python?
                            
                                Getting Github individual file contributors
                            
                                How to pass pointer back in ctypes?
                            
                                Why is Python's round so strange?
                            
                                Creating multiple objects with foreign key
                            
                                Encode MIMEText as quoted printables
                            
                                How to ignore hidden files in python functions? [duplicate]
                            
                                PhantomJS 1.8 with Selenium on python. How to block images?
                            
                                Speed up solving a triangular linear system with numpy?
                            
                                Simple Python UDP Server: trouble receiving packets from clients other than localhost
                            
                                pyqt - how to make a textarea to write messages to - kinda like printing to a console [closed]
                            
                                How to create a list of fields in django forms

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Convert string from xmlcharrefreplace back to utf-8

Tags:

python

encode

utf-8

unicode-string

Tural Gurbanov

People also ask

1 Answers

Martijn Pieters

Recent Activity

Donate For Us