
Parsing HTML with C#.NET [duplicate]

I'm trying to parse the following HTML file and would like to get the value of key. This is being done in Silverlight for Windows Phone.

<HTML>
<link ref="shortcut icon" href="favicon.ico">
<BODY>
<script Language="JavaScript">
location.href="login.html?key=UEFu1EIsgGTgAV7guTRhsgrTQU28TImSZkYhPMLj7BChpBkvlCO11aJU2Alj4jc5"
</script>
<CENTER><a href="login.html?key=UEFu1EIsgGTgAV7guTRhsgrTQU28TImSZkYhPMLj7BChpBkvlCO11aJU2Alj4jc5">Welcome</a></CENTER>
</BODY>
</HTML>

Any ideas on where to go from here?

Thanks

Nathan asked May 19 '11 18:05


1 Answer

Give the HtmlAgilityPack a look. It's a pretty decent HTML parser.

http://html-agility-pack.net/?z=codeplex

Here's some code to get you started (requires error checking)

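using HtmlAgilityPack;

HtmlDocument document = new HtmlDocument();
string htmlString = "<html>blabla</html>";
document.LoadHtml(htmlString);

// SelectNodes returns null (not an empty collection) when nothing matches.
HtmlNodeCollection collection = document.DocumentNode.SelectNodes("//a");
if (collection != null)
{
    foreach (HtmlNode link in collection)
    {
        // GetAttributeValue returns the default (null here) if href is missing.
        string target = link.GetAttributeValue("href", null);
    }
}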
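Once you have the href, you still need to pull the key value out of it. Here is a minimal sketch of one way to do that by splitting the query string by hand, assuming the href looks like the one in the question (login.html?key=...). KeyExtractor and GetQueryValue are hypothetical names made up for this example; they are not part of HtmlAgilityPack.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class KeyExtractor
{
    // Returns the value of the named query parameter in href,
    // or null if the href has no query string or no such parameter.
    public static string GetQueryValue(string href, string name)
    {
        if (href == null) return null;

        int queryStart = href.IndexOf('?');
        if (queryStart < 0) return null;

        string query = href.Substring(queryStart + 1);

        // Drop a trailing #fragment, if any.
        int fragmentStart = query.IndexOf('#');
        if (fragmentStart >= 0) query = query.Substring(0, fragmentStart);

        foreach (string pair in query.Split('&'))
        {
            int eq = pair.IndexOf('=');
            if (eq < 0) continue; // parameter without a value

            if (pair.Substring(0, eq) == name)
            {
                // Decodes %XX escapes; note that '+' is left as-is.
                return Uri.UnescapeDataString(pair.Substring(eq + 1));
            }
        }
        return null;
    }
}
```

Inside the loop you could then call something like KeyExtractor.GetQueryValue(target, "key") and skip links where it returns null.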
Kurru answered Oct 04 '22 09:10
