I want to use Scrapy to send an email.
I read through the official documentation, and I found that I can do this:
    from scrapy.mail import MailSender
    from scrapy.utils.project import get_project_settings

    settings = get_project_settings()
    mailer = MailSender(mailfrom="[email protected]", smtphost="smtp.gmail.com", smtpport=465, smtppass="MySecretPassword")
    mailer.send(to=["[email protected]"], subject="Some subject", body="Some body")
The code doesn't throw any exception, but no mail is sent.
What did I miss?
I need to work with the Scrapy framework, not pure Python.
I don't want to apply the default settings via mailer = MailSender.from_settings(settings), because, as you can see, I have custom options. By the way, I also tried the default settings and got the same result: no exception, but no email sent.
I hope you can help me.
Are you actually using a Gmail address, as your code suggests? If so, Google typically blocks outgoing mail the first time you try this. I run into this problem all the time with PHPMailer. Try running your script first, then visit https://accounts.google.com/displayunlockcaptcha, which gives you a Continue button from Google. Click that button to verify it's you trying to send the mail, then run your script again and see if it works.
Two things come to mind with your code: first, check whether the mailer code is actually being executed; second, the smtpuser parameter should be populated.
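To make these two points concrete, here is a small pure-Python sketch (the helper function and its checks are my own, not part of Scrapy) that flags the settings in the question most likely to cause a silent failure with Gmail. Note that newer Scrapy versions also accept an smtpssl flag, which matters because port 465 is Gmail's SSL port:

```python
def check_mail_kwargs(kwargs):
    """Flag MailSender keyword arguments that commonly cause silent
    failures with Gmail. Pure Python; Scrapy is not required."""
    problems = []
    if not kwargs.get("smtpuser"):
        problems.append("smtpuser is missing; Gmail requires an authenticated user")
    if kwargs.get("smtpport") == 465 and not kwargs.get("smtpssl"):
        problems.append("port 465 expects SSL; set smtpssl=True or use port 587")
    return problems

# The kwargs from the question: no smtpuser, and port 465 without SSL,
# so both checks fire.
print(check_mail_kwargs({"mailfrom": "[email protected]",
                         "smtphost": "smtp.gmail.com",
                         "smtpport": 465,
                         "smtppass": "MySecretPassword"}))
```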
Here is working code to send email via Gmail using Scrapy. This answer has four sections: email code, a complete example, logging, and Gmail configuration. The complete example is included because a few things need to be coordinated for this to work.
Email Code
To have Scrapy send email, you can add the following to your Spider class (there is a complete example in the next section). These examples have Scrapy send email after crawling has completed.
There are two chunks of code to add, the first to import modules and the second to send the email.
Importing modules:
    from scrapy import signals
    from scrapy.mail import MailSender
Inside your Spider class definition:
    class MySpider(Spider):

        <SPIDER CODE>

        @classmethod
        def from_crawler(cls, crawler):
            spider = cls()
            crawler.signals.connect(spider.spider_closed, signals.spider_closed)
            return spider

        def spider_closed(self, spider):
            mailer = MailSender(mailfrom="[email protected]", smtphost="smtp.gmail.com", smtpport=587, smtpuser="[email protected]", smtppass="MySecretPassword")
            return mailer.send(to=["[email protected]"], subject="Some subject", body="Some body")
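If the from_crawler/signals.connect pattern is unfamiliar: Scrapy keeps a registry of callbacks per signal and calls each registered callback when that signal fires, which is how spider_closed ends up being invoked at the end of the crawl. Here is a minimal pure-Python sketch of that idea (an illustration only, not Scrapy's actual implementation):

```python
class SignalManager:
    """Toy signal dispatcher: connect() registers a callback for a
    signal, fire() invokes every callback registered for it."""
    def __init__(self):
        self._receivers = {}

    def connect(self, callback, signal):
        self._receivers.setdefault(signal, []).append(callback)

    def fire(self, signal, *args, **kwargs):
        return [cb(*args, **kwargs) for cb in self._receivers.get(signal, [])]

spider_closed = object()  # stand-in for scrapy.signals.spider_closed

signals_mgr = SignalManager()
events = []
# connect() plays the role of crawler.signals.connect(...)
signals_mgr.connect(lambda spider: events.append(f"mail for {spider}"), spider_closed)
# fire() plays the role of Scrapy firing spider_closed after the crawl
signals_mgr.fire(spider_closed, "dmoz")
print(events)
```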
Complete Example
Putting this together, this example uses the dirbot example located at:
https://github.com/scrapy/dirbot
Only one file needs to be edited:
./dirbot/spiders/dmoz.py
Here is the entire working file with the imports near the top and the email code at the end of the spider class:
    from scrapy.spider import Spider
    from scrapy.selector import Selector

    from dirbot.items import Website

    from scrapy import signals
    from scrapy.mail import MailSender


    class DmozSpider(Spider):
        name = "dmoz"
        allowed_domains = ["dmoz.org"]
        start_urls = [
            "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
            "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
        ]

        def parse(self, response):
            """
            The lines below is a spider contract. For more info see:
            http://doc.scrapy.org/en/latest/topics/contracts.html

            @url http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/
            @scrapes name
            """
            sel = Selector(response)
            sites = sel.xpath('//ul[@class="directory-url"]/li')
            items = []

            for site in sites:
                item = Website()
                item['name'] = site.xpath('a/text()').extract()
                item['url'] = site.xpath('a/@href').extract()
                item['description'] = site.xpath('text()').re('-\s[^\n]*\\r')
                items.append(item)

            return items

        @classmethod
        def from_crawler(cls, crawler):
            spider = cls()
            crawler.signals.connect(spider.spider_closed, signals.spider_closed)
            return spider

        def spider_closed(self, spider):
            mailer = MailSender(mailfrom="[email protected]", smtphost="smtp.gmail.com", smtpport=587, smtpuser="[email protected]", smtppass="MySecretPassword")
            return mailer.send(to=["[email protected]"], subject="Some subject", body="Some body")
Once this file is updated, run the standard crawl command from the project directory to crawl and send the email:
    $ scrapy crawl dmoz
Logging
By returning the output of the mailer.send method from the spider_closed method, Scrapy automatically adds the result to its log. Here are examples of success and failure:
Success Log Message:
    2015-03-22 23:24:30-0000 [scrapy] INFO: Mail sent OK: To=['[email protected]'] Cc=None Subject="Some subject" Attachs=0
Error Log Message - Unable to Connect:
    2015-03-22 23:39:45-0000 [scrapy] ERROR: Unable to send mail: To=['[email protected]'] Cc=None Subject="Some subject" Attachs=0- Unable to connect to server.
Error Log Message - Authentication Failure:
    2015-03-22 23:38:29-0000 [scrapy] ERROR: Unable to send mail: To=['[email protected]'] Cc=None Subject="Some subject" Attachs=0- 535 5.7.8 Username and Password not accepted. Learn more at 5.7.8 http://support.google.com/mail/bin/answer.py?answer=14257 sb4sm6116233pbb.5 - gsmtp
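Since these messages share a fixed prefix, you can pull mail results out of a long crawl log with a short regex. The pattern below is my own, based only on the log lines shown above:

```python
import re

# Matches the level and outcome portion of MailSender log lines.
MAIL_LINE = re.compile(r"\[scrapy\] (INFO: Mail sent OK|ERROR: Unable to send mail)")

def mail_outcome(line):
    """Return 'sent', 'failed', or None for a single log line."""
    m = MAIL_LINE.search(line)
    if not m:
        return None
    return "sent" if m.group(1).startswith("INFO") else "failed"

print(mail_outcome('2015-03-22 23:24:30-0000 [scrapy] INFO: Mail sent OK: '
                   'To=[\'[email protected]\'] Cc=None Subject="Some subject" Attachs=0'))
```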
Gmail Configuration
To configure Gmail to accept email this way, you need to enable "Access for less secure apps", which you can do at the URL below while logged in to the account:
https://www.google.com/settings/security/lesssecureapps