I am deploying a script (a Python Scrapy spider) on Heroku, and I want it to be launched 4 times in the morning.
I can definitely run it by connecting to my Heroku account (I have a free plan) and typing this on the Windows command line:
heroku run scrapy crawl sytadin
But I am having some issues when I try to run it through Heroku Scheduler. It asks me if I want to write something like $ rake.
I have never used rake before. Is it something to use before run or after run? Should I use the keyword heroku first?
I have no idea, and everything I tried failed, as I can see in the log:
2017-01-19T23:47:05.305039+00:00 heroku[scheduler.3450]: Starting process with command `python "sytadin" crawl`
2017-01-19T23:47:05.974030+00:00 heroku[scheduler.3450]: State changed from starting to up
2017-01-19T23:47:08.335845+00:00 heroku[scheduler.3450]: State changed from up to complete
2017-01-19T23:47:08.204289+00:00 app[scheduler.3450]: /app/.heroku/python/bin/python: can't find '__main__' module in 'sytadin'
2017-01-19T23:47:08.326081+00:00 heroku[scheduler.3450]: Process exited with status 1
2017-01-19T23:48:27.681890+00:00 app[api]: Starting process with command `python sytadin/sytadin.py crawl` by user [email protected]
2017-01-19T23:48:35.571615+00:00 heroku[scheduler.6352]: Starting process with command `python sytadin/sytadin.py crawl`
2017-01-19T23:48:36.156250+00:00 heroku[scheduler.6352]: State changed from starting to up
2017-01-19T23:48:37.424920+00:00 heroku[scheduler.6352]: Process exited with status 2
2017-01-19T23:48:37.360306+00:00 app[scheduler.6352]: python: can't open file 'sytadin/sytadin.py': [Errno 2] No such file or directory
2017-01-19T23:48:37.445476+00:00 heroku[scheduler.6352]: State changed from up to complete
As you can see, I tried different possibilities I found on the web, but none of them worked :(
Any ideas for my Python script? :)
The Heroku Scheduler basically just does heroku run + whatever command you type there.
So, in your case, since your Scrapy crawler runs successfully when you do heroku run scrapy crawl sytadin, you can create a scheduler rule to run:
scrapy crawl sytadin
And that will do the trick =)
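For the "4 times in the morning" part: as far as I know, Heroku Scheduler only offers fixed frequencies (every 10 minutes, hourly, or daily) and works in UTC, so one possible approach is to schedule an hourly job and let a small wrapper decide whether to crawl. Below is a minimal sketch of that idea, not a definitive implementation. The hours in RUN_HOURS_UTC and the helper names should_run and main are assumptions made up for illustration; only the scrapy crawl sytadin command comes from the question.

```python
import subprocess
from datetime import datetime, timezone

# Hypothetical choice: four morning hours, expressed in UTC.
# Adjust these to match your own timezone and schedule.
RUN_HOURS_UTC = {5, 6, 7, 8}


def should_run(now, run_hours=RUN_HOURS_UTC):
    """Return True when the given datetime falls in one of the chosen hours."""
    return now.hour in run_hours


def main():
    """Run the spider only during the chosen hours; return an exit code."""
    now = datetime.now(timezone.utc)
    if not should_run(now):
        print("Skipping crawl at", now.isoformat())
        return 0
    # Same command that works with `heroku run`, without the `heroku run` prefix.
    return subprocess.call(["scrapy", "crawl", "sytadin"])
```

If you save this in your repository under a name of your choosing (say run_crawl.py, a hypothetical name) and end the file with a call to main(), the hourly scheduler rule would be python run_crawl.py instead of scrapy crawl sytadin.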