
HTML Linq with HtmlAgilityPack, or alternative, in PCL

I have written a project on .NET 4 and am currently porting it to run on Windows Phone as well. I am using HtmlAgilityPack, a well-known library that allows Linq queries over HTML, and I use only the LoadHtml and Linq interfaces it provides.

Having converted the class libraries from .NET 4 to PCL (Portable Class Library) with support for .NET 4 and WP8, I can no longer use the HtmlAgilityPack library. Is there a way to make HtmlAgilityPack function correctly in a PCL project, or is there a viable alternative with a similar Linq interface that does work as intended?

EDIT: HtmlAgilityPack provides 9 different versions, none of which are compatible with PCL. None of them resolve their dependencies from the project references. For some versions it may appear that they do, but upon usage an error is thrown with the usual 'cannot load, unresolved dependencies'.

EDIT #2: Since it is easy to miss a small comment, I'll update this question with the solution I came up with. I extracted what was needed for basic functionality and implemented the missing components to make everything work. The result is here: https://github.com/Deathspike/HtmlAgilityPack-PCL

Deathspike asked Mar 11 '13 14:03



3 Answers

One option is to port the HTML Agility Pack source code to a PCL. You can run the PCL Compliance Analyzer over it to get an idea of how hard this will be.

Alternatively, use the abstraction pattern. Create a portable interface for the functionality you need (i.e. LoadHtml and Linq), and then implement that interface for each platform by calling into the HTML Agility Pack. Your portable code then depends only on the interface, and each platform supplies its own implementation at runtime.
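As a rough illustration of that pattern, here is a minimal sketch. The names `IHtmlElement`, `IHtmlLoader`, `LinkExtractor`, `AgilityElement` and `AgilityLoader` are made up for this example and are not part of any library. It assumes the platform project (.NET 4 or WP8) references a build of HtmlAgilityPack that works on that platform, while the portable project references nothing but the interfaces.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// ---- Portable project (PCL): no reference to HtmlAgilityPack ----

// Hypothetical portable view of a node; exposes only what shared code needs.
public interface IHtmlElement
{
    string Name { get; }
    string InnerText { get; }
    string GetAttribute(string name, string defaultValue);
    IEnumerable<IHtmlElement> Descendants();
}

// Hypothetical portable entry point, mirroring LoadHtml.
public interface IHtmlLoader
{
    IHtmlElement LoadHtml(string html);
}

// Shared code depends only on the interfaces, so Linq queries still work.
public static class LinkExtractor
{
    public static List<string> GetLinks(IHtmlLoader loader, string html)
    {
        return loader.LoadHtml(html)
            .Descendants()
            .Where(e => e.Name == "a")
            .Select(e => e.GetAttribute("href", ""))
            .Where(href => href.Length > 0)
            .ToList();
    }
}

// ---- Platform project (.NET 4 or WP8): references HtmlAgilityPack ----

// Thin wrapper that forwards each call to the underlying HtmlNode.
public sealed class AgilityElement : IHtmlElement
{
    private readonly HtmlAgilityPack.HtmlNode _node;

    public AgilityElement(HtmlAgilityPack.HtmlNode node) { _node = node; }

    public string Name { get { return _node.Name; } }
    public string InnerText { get { return _node.InnerText; } }

    public string GetAttribute(string name, string defaultValue)
    {
        return _node.GetAttributeValue(name, defaultValue);
    }

    public IEnumerable<IHtmlElement> Descendants()
    {
        return _node.Descendants().Select(n => (IHtmlElement)new AgilityElement(n));
    }
}

public sealed class AgilityLoader : IHtmlLoader
{
    public IHtmlElement LoadHtml(string html)
    {
        var doc = new HtmlAgilityPack.HtmlDocument();
        doc.LoadHtml(html);
        return new AgilityElement(doc.DocumentNode);
    }
}
```

The application on each platform would create the `AgilityLoader` and hand it to the portable code, for example `LinkExtractor.GetLinks(new AgilityLoader(), html)`. The interface is deliberately small; widening it to cover more of the HtmlNode surface is possible, but every added member has to be implemented once per platform.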

For more information, see this blog post: How to Make Portable Class Libraries Work for You

Daniel Plaisted answered Oct 05 '22 02:10



You've asked and answered your own question, haven't you?

The HtmlAgilityPack does not support use with Portable Class Libraries.

At best, you'll need to port or migrate the specific functionality you require in a way that works on the platforms you are targeting.

Matt Lacey answered Oct 05 '22 02:10



Look at HtmlParserSharp, a C# port of the validator.nu HTML5 parser. The project should be very easy to build as a PCL library since it's more or less a straight C++ port and uses only the most basic of .NET framework classes, with a few updates to improve performance in C#.

While most of the work I've done with HtmlParserSharp has been for CsQuery, which itself is a long way from being PCL compliant, there is no reason at all that HtmlParserSharp wouldn't work perfectly well on its own as a lean HTML parser for your purposes. The project includes an example of building a DOM based on an XmlElement, but the tree builder is an abstraction, so you could easily change this to use your own tree node objects instead.

Jamie Treworgy answered Oct 05 '22 02:10
