
Reading HTML content from a UIWebView

Is it possible to read the raw HTML content of a web page that has been loaded into a UIWebView?

If not, is there another way to pull raw HTML content from a web page in the iPhone SDK (such as an equivalent of the .NET WebClient::openRead)?

Fuzzy Purple Monkey asked Jun 14 '09 07:06



2 Answers

The second question is actually easier to answer. Look at the stringWithContentsOfURL:encoding:error: method of NSString - it lets you pass in a URL as an instance of NSURL (which can easily be instantiated from NSString) and returns a string with the complete contents of the page at that URL. For example:

    NSString *googleString = @"http://www.google.com";
    NSURL *googleURL = [NSURL URLWithString:googleString];
    NSError *error = nil;
    // UTF-8 suits most pages; NSASCIIStringEncoding cannot represent non-ASCII content
    NSString *googlePage = [NSString stringWithContentsOfURL:googleURL
                                                    encoding:NSUTF8StringEncoding
                                                       error:&error];

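After running this code, googlePage will contain the HTML for www.google.com, or nil if the fetch failed, in which case error describes what went wrong. (Check the return value first, and only inspect error when it is nil.) Note that this call is synchronous, so it blocks the calling thread until the fetch completes.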
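As a rough sketch of that check, the fetch can be wrapped in a small helper that reports failure through its return value. The function name FetchPageSource is hypothetical (it is not part of any Apple API), and UTF-8 is an assumed encoding; a real page may declare a different one.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not an Apple API): returns the text at `url`,
// or nil on failure with *outError set, following the Cocoa convention.
static NSString *FetchPageSource(NSURL *url, NSError **outError) {
    NSError *error = nil;
    // Assumption: the content is UTF-8 encoded.
    NSString *page = [NSString stringWithContentsOfURL:url
                                              encoding:NSUTF8StringEncoding
                                                 error:&error];
    if (page == nil) {
        // Only look at the error once the nil return value signals failure.
        if (outError != NULL) {
            *outError = error;
        }
        return nil;
    }
    return page;
}
```

Because the underlying call is synchronous, a helper like this is probably best called from a background queue rather than the main thread.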

Going the other way (from a UIWebView) is a bit trickier, but is basically the same concept. You'll have to pull the request from the view, then do the fetch as before:

    NSURL *requestURL = [[yourWebView request] URL];
    NSError *error = nil;
    // UTF-8 suits most pages; NSASCIIStringEncoding cannot represent non-ASCII content
    NSString *page = [NSString stringWithContentsOfURL:requestURL
                                              encoding:NSUTF8StringEncoding
                                                 error:&error];

EDIT: Both of these methods take a performance hit, however, since they make the request twice. You can get around this by grabbing the content from a currently-loaded UIWebView using its stringByEvaluatingJavaScriptFromString: method, like so:

    NSString *html = [yourWebView stringByEvaluatingJavaScriptFromString:
                         @"document.body.innerHTML"];

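This evaluates the JavaScript expression in the context of the loaded page and returns the result as an NSString* of HTML. Note that document.body.innerHTML covers only the contents of the <body> element, and that it reflects the current state of the Document Object Model (including any changes made by scripts), which may differ from the source the server originally sent.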
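If the whole document is wanted rather than just the body, one option is to ask for the root element's outerHTML once the view reports that loading has finished. The following is only a sketch: the PageSourceViewController class name and the NSLog call are illustrative, and it assumes the controller has been set as the web view's delegate.

```objc
#import <UIKit/UIKit.h>

// Hypothetical view controller; assumed to be set as the web view's delegate.
@interface PageSourceViewController : UIViewController <UIWebViewDelegate>
@end

@implementation PageSourceViewController

// UIWebViewDelegate callback; it can fire more than once on pages with frames.
- (void)webViewDidFinishLoad:(UIWebView *)webView {
    // outerHTML of the root element includes <head> and <body>,
    // unlike document.body.innerHTML.
    NSString *html = [webView stringByEvaluatingJavaScriptFromString:
                         @"document.documentElement.outerHTML"];
    NSLog(@"Loaded %lu characters of HTML", (unsigned long)html.length);
}

@end
```

Reading the DOM in webViewDidFinishLoad: avoids querying the page before it has been parsed.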

Another way is to do your request programmatically first, then load the UIWebView from what you requested. Let's say you take the second example above, where you have NSString *page as the result of a call to stringWithContentsOfURL:encoding:error:. You can then push that string into the web view using loadHTMLString:baseURL:, assuming you also held on to the NSURL you requested:

    [yourWebView loadHTMLString:page baseURL:requestURL];

I'm not sure, however, if this will run JavaScript found in the page you load (the method name, loadHTMLString, is somewhat ambiguous, and the docs don't say much about it).

For more info:

  • UIWebView class reference
  • NSString class reference
  • NSURL class reference
Tim answered Nov 03 '22 13:11


If you want to extract the contents of an already-loaded UIWebView, use -stringByEvaluatingJavaScriptFromString:. For example:

    NSString *html = [webView stringByEvaluatingJavaScriptFromString:@"document.body.innerHTML"];
Ben Gottlieb answered Nov 03 '22 14:11