How to let Ruby Mechanize get a page which lives in a string

Normally Mechanize fetches a web page from a URL, and the get method returns a Mechanize::Page object, which provides many useful methods.

If the page's HTML is already in a string, how can I get the same kind of Mechanize::Page object?

require 'mechanize'

html = <<END_OF_STRING
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<title>Page Title</title>
<style type="text/css">
</style>
</head>
<body>
<h1>This is a test</h1>
</body>
</html>
END_OF_STRING

agent = Mechanize.new

# How can I get the page result from the string html?
#page = ...
Just a learner asked Mar 03 '12 19:03

1 Answer

Mechanize uses Nokogiri to parse HTML. If you already have the HTML and do not need to fetch it over a network protocol, you do not need Mechanize. All you want to do is parse the input HTML, right?

The following will let you do this:

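require 'nokogiri'  # the library name is lowercase; 'Nokogiri' fails on case-sensitive filesystems
html = 'html here'
# Returns a Nokogiri HTML document, not a Mechanize::Page
page = Nokogiri::HTML(html)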

If you have the Mechanize gem installed, you already have Nokogiri, because Mechanize depends on it.

If you do need a Mechanize::Page object, you can still create one directly from the string:

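require 'mechanize'  # lowercase, as in the question
html = 'html here'
a = Mechanize.new
# Arguments: uri, response headers, body, HTTP status code, agent
page2 = Mechanize::Page.new(nil, {'content-type' => 'text/html'}, html, nil, a)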
Kassym Dorsel answered Sep 21 '22 12:09