I'm writing a site that is going to rely heavily on screen scraping. Because I know screen scraping is prone to breaking, I'd like to be notified somehow when there is a problem.
The solution that I think will work is to write an RSpec test for each site I want to support. The test will open a few remote pages from each site and compare my scraper's output with the output I expect. I'd also like to run the same tests on locally cached copies, so I can tell whether my code changes broke the scraper or the remote site changed. I'd like to run these tests once a day and be notified of any problems.
Eventually I'd like to make this a gem because it's a recurring problem for me. I tend to do a lot of scraping and it would be nice to know when things break.
So my problem is I'm relatively new to writing tests for my code and I have no clue what the best way to set this up is.
Take a look at the VCR gem, which lets you record local copies of the pages you want to test, refresh those copies every so often, and still run tests against the live pages.
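To make the "did my code break, or did the site change?" distinction concrete, here is a minimal sketch in plain Ruby. It does not use VCR itself: it assumes the cached HTML comes from somewhere like a VCR cassette or a fixture file, and the live HTML from a real request. The names `scrape_title` and `diagnose` and the sample HTML are hypothetical, made up for illustration.

```ruby
# Hypothetical scraper: pulls the <title> text out of an HTML string.
# Returns nil when no <title> element is found.
def scrape_title(html)
  match = html.match(%r{<title>(.*?)</title>}m)
  match && match[1].strip
end

# Classify one scraper check (hypothetical helper, not part of any gem).
#   :code_broken  - the scraper fails even on the cached copy,
#                   so a change in our own code is the likely cause
#   :site_changed - the cached copy still passes but the live page does not
#   :ok           - both copies produce the expected output
def diagnose(scraper, cached_html:, live_html:, expected:)
  return :code_broken unless scraper.call(cached_html) == expected
  return :site_changed unless scraper.call(live_html) == expected
  :ok
end

# Made-up sample pages standing in for a cached copy and a redesigned live page.
CACHED  = "<html><head><title>Example Shop</title></head></html>"
CHANGED = "<html><head><h1 class='title'>Example Shop</h1></head></html>"

scraper = method(:scrape_title)
puts diagnose(scraper, cached_html: CACHED, live_html: CHANGED, expected: "Example Shop")
```

In an RSpec suite, each supported site could get one example that calls something like `diagnose` and fails with the returned symbol in the message. A daily run (from cron, for instance) would then report which kind of breakage occurred.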