I want to build a simple website that downloads a webpage (www.example.com/index.html) and stores a snapshot of it on the server when a client requests it. I'm thinking about using the wget command to download the page. Would Ruby on Rails be able to handle this task?
Yes.
You can run shell commands in Ruby via backticks, exec and system. Note that each one returns something slightly different:
backticks return the command's standard output as a String:
`wget http://www.yahoo.com`
exec replaces the current process with the command and never returns:
exec('wget http://www.yahoo.com')
system returns true on success, false on a non-zero exit status, and nil if the command could not be run:
system('wget http://www.yahoo.com')
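For instance, a quick sketch of how the return values differ (assuming wget is installed and on the PATH; the example URL is just an illustration):
output = `wget -q -O - http://www.example.com/index.html`      # page body as a String
ok     = system('wget', '-q', 'http://www.example.com/index.html')  # true, false or nil
puts output.bytesize
puts ok.inspect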
This blog post seems to be in the same vein as what you're trying to do.
Additionally, there are several terrific Ruby HTTP libraries for doing this. They provide a much cleaner Ruby interface for dealing with the data that comes back from the various requests.
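As one example, a minimal sketch using Net::HTTP from the standard library (the snapshot filename here is just an assumption for illustration):
require 'net/http'

# fetch the page body as a String, with no external process involved
body = Net::HTTP.get(URI('http://www.example.com/index.html'))
File.write('snapshot.html', body)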
The best way to test all of these options is to use the Rails console. Go to the root directory of your Rails app and type:
rails c
Once in the console, you can emulate the actual server calls.
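For example, you can check straight away whether wget is available and what system returns:
# inside the console
system('wget', '--version')  # => true if wget is installed, nil if it is missing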
Running wget in your console will drop the files in your Rails root directory, which is not what you want. tmp is a standard directory for such things. You can dynamically generate the path based on the URL like so:
require 'digest'

url = 'http://www.example.com/index.html' # the URL the client asked to snapshot
# tmp directory
path = Rails.root.join('tmp')
# create a sub-directory name as an MD5 hash of the URL
sub_dir = Digest::MD5.hexdigest(url)
# append sub_dir to the path
destination_path = path.join(sub_dir)
# pass the arguments separately so a malicious URL cannot inject shell commands
system('wget', '-P', destination_path.to_s, url)
Be sure to also include the options from this post.
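Tying it together, here is a minimal controller sketch for serving such requests. The SnapshotsController name and the url parameter are assumptions for illustration, not something from your app:
class SnapshotsController < ApplicationController
  def create
    url = params.require(:url)
    # same scheme as above: one tmp sub-directory per URL
    destination_path = Rails.root.join('tmp', Digest::MD5.hexdigest(url))

    if system('wget', '-P', destination_path.to_s, url)
      render plain: "Snapshot stored in #{destination_path}"
    else
      render plain: 'Download failed', status: :unprocessable_entity
    end
  end
end
Note that wget runs synchronously here, so a slow target site will tie up the request; for production you would likely push the download into a background job.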