
Writing to JSON - Converting \u00a3 to £

Tags:

python

json

utf-8

selenium

selenium-webdriver

I am using Selenium and Python to scrape a website. I am scraping some '£' characters, but when I write them to JSON I get \u00a3 instead (they appear as '£' when I print them to the terminal).

I understand they are Unicode and I need them in UTF8 (?). I've tried a few things I've found on SO and haven't had much success.

I have tried .replace('\u00a3', '£'), but that did not work either.

How do I get the characters to look like '£' instead of \u00a3?

This is the line that's printing incorrectly. Let me know if you want to see my entire code.

price = page.find_element_by_class_name('header_tags').text

asked Oct 21 '18 12:10

James5949

3 Answers

If you're using json.dump() or json.dumps(), try setting ensure_ascii=False
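A minimal sketch of what this setting changes, assuming Python 3. The `record` dictionary and the `prices.json` file name are hypothetical stand-ins for the scraped data and the output file; they are not from the question.

```python
import json
import os
import tempfile

# Hypothetical record standing in for the scraped price.
record = {"price": "£9.99"}

# Default behaviour (ensure_ascii=True): non-ASCII characters are escaped.
escaped = json.dumps(record)                       # {"price": "\u00a39.99"}

# With ensure_ascii=False the character is written as-is.
readable = json.dumps(record, ensure_ascii=False)  # {"price": "£9.99"}

# When writing to a file, open it with an explicit UTF-8 encoding so the
# result does not depend on the platform's default encoding.
path = os.path.join(tempfile.mkdtemp(), "prices.json")  # hypothetical file name
with open(path, "w", encoding="utf-8") as f:
    json.dump(record, f, ensure_ascii=False)

# Read the file back to confirm what was written.
with open(path, encoding="utf-8") as f:
    written = f.read()
```

Both forms are valid JSON and load back to the same string, so the escaped output is only a readability issue for anyone inspecting the file directly.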

answered Sep 28 '22 08:09

Vikrant Sharma

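You can encode the string as shown below. Note that encode() returns a new bytes object and leaves s unchanged; in Python 3 the \u00a3 escape in a string literal already is the '£' character.

s = 'This is a Pound sign \u00a3'
encoded = s.encode('utf8')  # b'This is a Pound sign \xc2\xa3'
print(s)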

Output

This is a Pound sign £
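To illustrate why neither .replace() nor .encode() changes what ends up in the JSON file, here is a small sketch assuming Python 3. The variable names are illustrative only.

```python
import json

pound = "\u00a3"

# In Python 3 source code the escape and the literal are the same
# one-character str, so this replace swaps the character for itself.
same = (pound == "£")                     # True
unchanged = "£5".replace("\u00a3", "£")   # still "£5"

# encode() returns a new bytes object; the original str is untouched.
as_bytes = pound.encode("utf-8")          # b'\xc2\xa3'

# The backslash sequence is produced by json.dumps at serialization time,
# and json.loads turns it back into the same character.
dumped = json.dumps(pound)                # 8 characters: "\u00a3" including quotes
round_trip = json.loads(dumped)           # "£"
```

Passing the encoded bytes to json.dumps() is not an alternative in Python 3, since bytes objects are not JSON serializable there and raise a TypeError.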

answered Sep 28 '22 06:09

ansu5555

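text is a property that returns a string, so it cannot be called as text("utf-8"). If you need UTF-8 bytes (for example in Python 2), encode the text when printing, as follows:

print(page.find_element_by_class_name('header_tags').text.encode("utf-8"))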

This issue can occur on other lines as well, so as a best practice, start the Python file with the following line. It declares the encoding of the source file itself (Python 3 already assumes UTF-8 by default):

# -*- coding: UTF-8 -*-

An example:

# -*- coding: UTF-8 -*-
from selenium import webdriver
# other lines of code
price = page.find_element_by_class_name('header_tags').text

answered Sep 28 '22 08:09

undetected Selenium
