I'm trying to scrape a website, but it gives me an error.
I'm using the following code:
    import urllib.request
    from bs4 import BeautifulSoup

    get = urllib.request.urlopen("https://www.website.com/")
    html = get.read()
    soup = BeautifulSoup(html)
    print(soup)
And I'm getting the following error:
      File "C:\Python34\lib\encodings\cp1252.py", line 19, in encode
        return codecs.charmap_encode(input,self.errors,encoding_table)[0]
    UnicodeEncodeError: 'charmap' codec can't encode characters in position 70924-70950: character maps to <undefined>
What can I do to fix this?
To fix UnicodeEncodeError: 'charmap' codec can't encode characters in Python, set the encoding argument when opening the file: call open with the file path fname and encoding set to utf-8, so that text written to the file is encoded as UTF-8.
The Python "UnicodeEncodeError: 'charmap' codec can't encode characters in position" error occurs when a string is encoded to bytes with a codec that cannot represent some of its characters. To solve the error, specify a suitable encoding, e.g. utf-8, when opening the file or encoding the string.
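A codec such as cp1252 maps only a limited number of Unicode characters to bytes. Any character that is not mapped causes the encoding to fail and raise UnicodeEncodeError. In the question, the traceback points at cp1252.py, which suggests print is encoding the output with the Windows console's cp1252 codec. To avoid this error, call encode("utf-8") and decode("utf-8") explicitly where your code converts between strings and bytes.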
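To see the difference between the two codecs, here is a minimal sketch. The sample string and the try_encode helper are made up for illustration; only str.encode and bytes.decode are standard Python.

```python
# Sample text: "é" exists in cp1252, but the arrow and the CJK characters do not.
text = "caf\u00e9 \u2192 \u4e2d\u6587"


def try_encode(s, codec):
    """Hypothetical helper: return the encoded bytes, or None if the codec cannot represent s."""
    try:
        return s.encode(codec)
    except UnicodeEncodeError:
        return None


cp1252_result = try_encode(text, "cp1252")  # fails, so None
utf8_result = try_encode(text, "utf-8")     # UTF-8 can encode any str like this one

print(cp1252_result)
print(utf8_result.decode("utf-8") == text)
```

The cp1252 attempt fails because of the unmapped characters, while the UTF-8 bytes decode back to the original string.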
I was getting the same UnicodeEncodeError when saving scraped web content to a file. To fix it I replaced this code:
    with open(fname, "w") as f:
        f.write(html)
with this:
    with open(fname, "w", encoding="utf-8") as f:
        f.write(html)
If you need to support Python 2, then use this:
    import io

    with io.open(fname, "w", encoding="utf-8") as f:
        f.write(html)
If your file is encoded in something other than UTF-8, specify whatever your actual encoding is for the encoding argument.
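Why the encoding has to match can be shown with a small round trip. This is only a sketch: the temporary directory, the page.html file name and the sample html string are assumptions for the demo, not part of the original code.

```python
import os
import tempfile

# Made-up sample content containing one non-ASCII character.
html = "<p>caf\u00e9</p>"

with tempfile.TemporaryDirectory() as tmp:
    fname = os.path.join(tmp, "page.html")  # hypothetical file name

    # Write the text as UTF-8.
    with open(fname, "w", encoding="utf-8") as f:
        f.write(html)

    # Reading with the same encoding gives the original text back.
    with open(fname, "r", encoding="utf-8") as f:
        round_trip = f.read()

    # Reading with a different encoding silently produces garbled text.
    with open(fname, "r", encoding="latin-1") as f:
        mismatched = f.read()

print(round_trip == html)
print(mismatched == html)
```

Reading the file back with the encoding it was written in returns the original string, while the mismatched read does not raise an error and just returns mangled characters.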
I fixed it by adding .encode("utf-8") to soup.
That means that print(soup) becomes print(soup.encode("utf-8")).
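One thing to be aware of: on Python 3, printing the encoded bytes shows a bytes literal (b'...') rather than readable text. If you would rather keep printing a string, a possible alternative is to replace the characters the console cannot show. The console_safe helper below is a hypothetical name, not part of BeautifulSoup, and this is a rough sketch rather than the one right way to do it.

```python
import sys


def console_safe(text, encoding=None):
    """Hypothetical helper: replace characters the target encoding cannot represent with '?'."""
    # Fall back to UTF-8 if stdout reports no encoding (e.g. when it is replaced by a stub).
    encoding = encoding or sys.stdout.encoding or "utf-8"
    return text.encode(encoding, errors="replace").decode(encoding)


# With a real page this would be print(console_safe(str(soup))).
print(console_safe("price: 5 \u2192 10"))
```

This trades accuracy for convenience: unmappable characters become question marks on a cp1252 console, so for saving the page it is still better to write to a file opened with encoding="utf-8".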