I am building a crawler. I know how to use Ruby Mechanize to fetch a page from the web with this code:
require 'mechanize'
agent = Mechanize.new
agent.get "http://google.com"
But can I use Mechanize to read an HTML file from the file system? How?
Just using the file:// protocol worked great for me:
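# Expand to an absolute path; File.dirname(__FILE__) alone can be relative (e.g. ".")
html_dir = File.expand_path(File.dirname(__FILE__))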
page = agent.get("file:///#{html_dir}/example-file.html")
As for the question of why anyone would use Mechanize to read local HTML files: I found it necessary for testing. Store an example file locally and run your RSpec examples against it.
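If you load several local fixtures this way, it may help to build the file:// URL in one place. Here is a minimal sketch, assuming a Unix-style filesystem and paths without characters that need percent-encoding. The `file_url` helper is a hypothetical name of my own, not part of Mechanize:

```ruby
# Hypothetical helper (not part of Mechanize): build a file:// URL from a local path.
# Assumes Unix-style paths with no characters that need percent-encoding.
def file_url(path, base_dir = Dir.pwd)
  # File.expand_path resolves a relative path against base_dir
  # and returns an absolute path such as "/srv/fixtures/example-file.html".
  absolute = File.expand_path(path, base_dir)
  "file://#{absolute}"
end
```

With that in place, the call becomes something like `agent.get(file_url("example-file.html", __dir__))`, and the spec no longer depends on which directory it is run from.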