Removing \u2018 and \u2019 characters

Tags:

python

I am using Beautiful Soup to parse webpages and print the names of the pages visited to the terminal. However, a page name often contains the left (\u2018) and right (\u2019) single quotation mark characters, which Python can't print: it raises a charmap encoding error. Is there any way to remove these characters?

867

asked Jun 23 '14 04:06

bhavesh

1 Answer

These escapes are the Unicode code points for the left and right single quotation marks. You can replace them with their ASCII equivalent, the apostrophe, which Python shouldn't have any problem printing on your system:

    >>> print u"\u2018Hi\u2019"
    ‘Hi’
    >>> print u"\u2018Hi\u2019".replace(u"\u2018", "'").replace(u"\u2019", "'")
    'Hi'

Alternatively, with a regex:

    >>> import re
    >>> s = u"\u2018Hi\u2019"
    >>> print re.sub(u"(\u2018|\u2019)", "'", s)
    'Hi'
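If more than a couple of characters need handling, a translation table may be easier to maintain than chained replace() calls. The following is a minimal sketch, assuming Python 3 (unlike the Python 2 snippets above); the names SMART_QUOTES and ascii_quotes are hypothetical and chosen only for illustration.

```python
# Hypothetical mapping and helper, not part of the original answer.
# Keys are Unicode code points; values are the replacement strings.
SMART_QUOTES = {
    0x2018: "'",  # LEFT SINGLE QUOTATION MARK
    0x2019: "'",  # RIGHT SINGLE QUOTATION MARK
}


def ascii_quotes(text):
    """Return text with curly single quotes replaced by ASCII apostrophes."""
    return text.translate(SMART_QUOTES)


print(ascii_quotes("\u2018Hi\u2019"))  # → 'Hi'
```

To remove the characters outright rather than replace them, map each code point to None instead of "'"; str.translate deletes any character mapped to None.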

However, Python shouldn't have any problem printing the Unicode versions of these characters either. It's possible that you are calling str() somewhere, which tries to convert your unicode object to ASCII and raises the exception.
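The word "charmap" in the error message may also point to the terminal's own encoding (for example a code page such as cp437, which has no curly quotes) rather than to a str() call. As a sketch under that assumption, and assuming Python 3, the text can be round-tripped through the console encoding so that unencodable characters become "?" instead of raising. The helper name console_safe and the ASCII fallback are illustrative choices, not something from the original answer.

```python
import sys


def console_safe(text, encoding=None):
    """Return text with characters the target encoding cannot represent replaced by '?'."""
    # Assumption: fall back to ASCII when stdout reports no encoding
    # (this can happen when output is redirected).
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    return text.encode(encoding, errors="replace").decode(encoding)


# Forcing ASCII here only to make the effect visible on any terminal.
print(console_safe("\u2018Hi\u2019", encoding="ascii"))  # → ?Hi?
```

Called without the encoding argument, the helper uses whatever sys.stdout reports, so a UTF-8 terminal would still show the original quotes unchanged.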

120

answered Sep 21 '22 22:09

14 revs, 12 users 16%
