I am learning Beautiful Soup in Python.
I am trying to parse a simple webpage with a list of books.
For example:
<a href="https://www.nostarch.com/carhacking">The Car Hacker’s Handbook</a>
I use the code below.
import requests, bs4
res = requests.get('http://nostarch.com')
res.raise_for_status()
nSoup = bs4.BeautifulSoup(res.text,"html.parser")
elems = nSoup.select('.product-body a')
#elems[0] gives
<a href="https://www.nostarch.com/carhacking">The Car Hacker\u2019s Handbook</a>
And
#elems[0].getText() gives
u'The Car Hacker\u2019s Handbook'
But I want the properly rendered text, which is what print gives:
s = elems[0].getText()
print s
>>>The Car Hacker’s Handbook
How can I modify my code to output "The Car Hacker’s Handbook" instead of "u'The Car Hacker\u2019s Handbook'"?
Kindly help.
How do you decrypt a text with a Unicode cipher? To translate a Unicode message, map each numeric code back to its Unicode character. Example: the message 68,67,934,68,8364 is translated number by number: 68 => D, 67 => C, and so on, to obtain DCΦD€.
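The mapping described above can be sketched in a few lines. This is a minimal illustration in Python 3 (the question itself uses Python 2, where unichr would replace chr); the function name decode_code_points and the comma separator are assumptions made for the example.

```python
def decode_code_points(message, sep=","):
    """Convert a string of decimal code points into the text they spell.

    Hypothetical helper: the name and the default separator are
    assumptions for this example, not part of any library.
    """
    # Skip empty tokens so a trailing separator does not break int().
    tokens = [token for token in message.split(sep) if token.strip()]
    # chr() maps an integer code point to its one-character string.
    return "".join(chr(int(token)) for token in tokens)


decoded = decode_code_points("68,67,934,68,8364")
```

Here decoded holds the five characters D, C, Φ (U+03A6), D and € (U+20AC), matching the example message.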
You CAN'T convert from Unicode to ASCII. Almost every character in Unicode cannot be expressed in ASCII, and those that can be expressed are encoded identically in ASCII and in UTF-8, which is probably what you have.
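Both halves of that claim can be checked directly. The sketch below is Python 3 and uses the book title from the question as sample data; the "replace" fallback at the end is only one possible lossy workaround, shown as an illustration rather than a recommendation.

```python
text = "The Car Hacker\u2019s Handbook"

# Pure-ASCII text yields the same bytes under ASCII and UTF-8.
ascii_part = "Handbook"
same_bytes = ascii_part.encode("ascii") == ascii_part.encode("utf-8")

# U+2019 (right single quotation mark) has no ASCII form,
# so a strict ASCII encode raises UnicodeEncodeError.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Lossy fallback: every non-ASCII character becomes "?".
lossy = text.encode("ascii", errors="replace").decode("ascii")
```

After running this, same_bytes is True, ascii_ok is False, and lossy has a question mark where the apostrophe was.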
Have you tried using the encode method?
elems[0].getText().encode('utf-8')
More information about Unicode in Python can be found at https://docs.python.org/2/howto/unicode.html
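It may help to see that the \u2019 escape belongs only to the string's displayed representation, not to the string itself. The sketch below is Python 3 (in Python 2 the interactive prompt shows the u'...' form from the question); the variable names are illustrative, and the title is hard-coded rather than scraped so the example runs on its own.

```python
s = "The Car Hacker\u2019s Handbook"

# The string holds one real character at that position, not six.
apostrophe = s[14]

# ascii() produces an escaped representation, similar to what the
# Python 2 prompt echoes for a unicode object.
escaped = ascii(s)
print(escaped)

# encode() turns the text into UTF-8 bytes; decode() reverses it.
encoded = s.encode("utf-8")
round_trip = encoded.decode("utf-8")
```

The print call shows 'The Car Hacker\u2019s Handbook' with the escape, while print(s) on a UTF-8 terminal displays the apostrophe itself. The text was correct all along; only the way it was displayed differed.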
Moreover, to check whether your string is really UTF-8 encoded, you can use chardet and run the following:
>>> import chardet
>>> chardet.detect(elems[0].getText())
{'confidence': 0.5, 'encoding': 'utf-8'}
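Note that chardet only guesses, and recent versions expect bytes (for example res.content) rather than already-decoded text. As a different, stricter technique that needs only the standard library, you can test whether raw bytes decode cleanly as UTF-8. This is a Python 3 sketch; is_valid_utf8 is a hypothetical helper name and the sample byte strings are made up for illustration.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as UTF-8 without errors.

    Hypothetical helper: a strict validity check, not encoding
    detection. Valid UTF-8 may still have been meant as something else.
    """
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


title = "The Car Hacker\u2019s Handbook"
utf8_bytes = title.encode("utf-8")      # apostrophe becomes E2 80 99
cp1252_bytes = title.encode("cp1252")   # apostrophe becomes the single byte 92

utf8_result = is_valid_utf8(utf8_bytes)
cp1252_result = is_valid_utf8(cp1252_bytes)
```

Here utf8_result is True and cp1252_result is False, because a lone 0x92 byte is not a valid UTF-8 sequence.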