I'm currently getting this error and don't know what it means. It's a Scrapy Python project, and this is the error I'm seeing:
File "/bp_scraper/bp_scraper/httpmiddleware.py", line 22, in from_crawler
return cls(crawler.settings)
File "/bp_scraper/bp_scraper/httpmiddleware.py", line 12, in __init__
if parts[1]:
TypeError: '_sre.SRE_Match' object has no attribute '__getitem__'
The code:
import re
import random
import base64
from scrapy import log

class RandomProxy(object):
    def __init__(self, settings):
        self.proxy_list = settings.get('PROXY_LIST')
        f = open(self.proxy_list)
        self.proxies = {}
        for l in f.readlines():
            parts = re.match('(\w+://)(\w+:\w+@)?(.+)', l)
            if parts[1]:
                parts[1] = parts[1][:-1]
            self.proxies[parts[0] + parts[2]] = parts[1]
        f.close()

    @classmethod
    def from_crawler(cls, crawler):
        return cls(crawler.settings)
Thanks in advance for your help!
The result of a re.match call is an SRE_Match object, which does not support the [] operator (a.k.a. __getitem__); match objects only gained [] support in Python 3.6. I think you want
if parts is not None:
    if parts.group(1):
        <blah>
Unfortunately, parts.group(1) is not mutable, so you'll have to make another variable to hold the changes you want to make to it.
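A minimal sketch of that approach, assuming the intent was to map each proxy URL (scheme plus host) to its optional credentials with the trailing @ removed. The helper name parse_proxies and the sample lines are made up for illustration. Note that group numbers start at 1, so the list-style indices 0, 1 and 2 in the question correspond to group(1), group(2) and group(3).

```python
import re

PROXY_RE = re.compile(r'(\w+://)(\w+:\w+@)?(.+)')

def parse_proxies(lines):
    """Map 'scheme://host:port' to 'user:pass' ('' when there are no credentials).

    parse_proxies is a hypothetical helper, not part of Scrapy.
    """
    proxies = {}
    for line in lines:
        match = PROXY_RE.match(line.strip())
        if match is None:
            continue  # re.match returns None for blank or malformed lines
        # group(2) is the optional 'user:pass@' part. Copy it into a new
        # variable and cut the trailing '@' there, since the match object
        # itself cannot be modified.
        user_pass = match.group(2)[:-1] if match.group(2) else ''
        proxies[match.group(1) + match.group(3)] = user_pass
    return proxies

# Made-up sample input standing in for the lines of the PROXY_LIST file.
sample = ['http://user:secret@10.0.0.1:8080\n', 'http://10.0.0.2:3128\n', '\n']
result = parse_proxies(sample)
```

The same loop body could be dropped into RandomProxy.__init__ in place of the indexing code.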
You cannot access the match results like this:
if parts[1]:
    parts[1] = parts[1][:-1]
Instead, do this:
if parts:
    matched = parts.group(1)[:-1]
More on regex matched groups here
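As a quick illustration of how the groups are numbered, here is a small sketch using the pattern from the question on a made-up proxy URL. group(0) is the whole match, the parenthesised groups start at 1, and groups() returns an ordinary tuple that can be indexed from 0, which is close to what the original code was trying to do.

```python
import re

# Sample URL is invented for illustration.
m = re.match(r'(\w+://)(\w+:\w+@)?(.+)', 'http://user:secret@10.0.0.1:8080')

whole = m.group(0)     # the entire matched string
scheme = m.group(1)    # first parenthesised group
parts = m.groups()     # tuple of groups 1..3, indexable from 0
creds = parts[1][:-1]  # slicing builds a new string; the match is unchanged
```

Because groups() is a tuple, assigning to parts[1] would still fail, so the stripped value has to live in its own variable.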