How to handle response encoding from urllib.request.urlopen() to avoid TypeError: can't use a string pattern on a bytes-like object

Tags:

python

regex

encoding

urllib

I'm trying to open a web page with urllib.request.urlopen() and then search it with regular expressions, but I get the following error:

TypeError: can't use a string pattern on a bytes-like object

I understand why: urllib.request.urlopen() returns a response whose read() gives bytes, and re cannot apply a string pattern to bytes because it doesn't know the encoding. What am I supposed to do in this situation? Is there a way to specify the encoding in the request, or do I need to decode the bytes myself? If so, I assume I should read the encoding from the response headers, or from the charset declared in the HTML if there is one, and decode with that?

931

asked Feb 13 '11 02:02

kryptobs2000

1 Answer

For me, the following works (Python 3):

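import urllib.request

resource = urllib.request.urlopen(an_url)
# get_content_charset() returns None when the Content-Type header names no charset;
# falling back to UTF-8 here is an assumption, not something the server guarantees
content = resource.read().decode(resource.headers.get_content_charset() or "utf-8")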
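
To tie this back to the regex error in the question, here is a minimal sketch of the whole flow: decode the bytes, then search the resulting str. The helper names (decode_body, fetch_text, find_titles, find_titles_bytes), the UTF-8 fallback, and the <title> pattern are my own assumptions for illustration; they are not part of urllib or re.

```python
import re
import urllib.request
from typing import List, Optional


def decode_body(raw: bytes, charset: Optional[str], default: str = "utf-8") -> str:
    """Decode response bytes, using `default` when no charset was declared."""
    # errors="replace" substitutes U+FFFD for invalid bytes instead of raising
    return raw.decode(charset or default, errors="replace")


def fetch_text(url: str) -> str:
    """Fetch `url` and return its body as str (hypothetical helper)."""
    with urllib.request.urlopen(url) as resource:
        # None when the Content-Type header carries no charset parameter
        charset = resource.headers.get_content_charset()
        return decode_body(resource.read(), charset)


def find_titles(html: str) -> List[str]:
    """Search decoded text with a str pattern (illustrative pattern only)."""
    return re.findall(r"<title>(.*?)</title>", html, flags=re.IGNORECASE | re.DOTALL)


def find_titles_bytes(raw: bytes) -> List[bytes]:
    """Alternative: keep the data as bytes and use a bytes pattern instead."""
    return re.findall(rb"<title>(.*?)</title>", raw, flags=re.IGNORECASE | re.DOTALL)
```

Usage would look like find_titles(fetch_text(an_url)). The bytes variant avoids decoding altogether, which may be enough when the pattern is plain ASCII, but the matches come back as bytes rather than str. This sketch only reads the charset from the HTTP header; it does not look at a charset declared inside the HTML itself.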

118

answered Sep 22 '22 06:09

Ivan Klass
