Is there any Objective-C library for parsing HTML, like Python's BeautifulSoup? Thanks.
On Apple's side there are NSXMLDocument and NSXMLParser, which support tidied HTML input. (Tree-Based XML Programming Guide)
On iOS (4.3) there's currently no NSXMLDocument available, so you'd have to use either NSXMLParser or libxml2.2.
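For illustration, here is a minimal sketch of the NSXMLParser route, assuming the HTML has already been tidied into well-formed XHTML. The class name `LinkCollector` and the function `CollectLinks` are made up for this example and are not part of any Apple API; only the NSXMLParser and NSXMLParserDelegate calls are real.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper class (not from the answers above): collects the
// href attribute of every <a> element that NSXMLParser reports.
@interface LinkCollector : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableArray *links;
@end

@implementation LinkCollector

- (instancetype)init {
    self = [super init];
    if (self) {
        _links = [NSMutableArray array];
    }
    return self;
}

// Called once for each opening tag the parser encounters.
- (void)parser:(NSXMLParser *)parser
didStartElement:(NSString *)elementName
  namespaceURI:(NSString *)namespaceURI
 qualifiedName:(NSString *)qName
    attributes:(NSDictionary *)attributeDict {
    if ([elementName caseInsensitiveCompare:@"a"] == NSOrderedSame) {
        NSString *href = [attributeDict objectForKey:@"href"];
        if (href != nil) {
            [self.links addObject:href];
        }
    }
}

@end

// Hypothetical helper: returns the collected hrefs, or nil if the input
// is not well-formed XML (which untidied HTML often is not).
static NSArray *CollectLinks(NSString *xhtml) {
    NSData *data = [xhtml dataUsingEncoding:NSUTF8StringEncoding];
    NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
    LinkCollector *collector = [[LinkCollector alloc] init];
    parser.delegate = collector;  // the parser does not retain its delegate
    BOOL ok = [parser parse];     // synchronous; NO on a parse error
    return ok ? [collector.links copy] : nil;
}
```

Calling `CollectLinks` on a tidied page should give you the link targets in document order. Because NSXMLParser is a strict XML parser, something as common as an unclosed `<br>` makes `parse` return NO, which is why the input needs to be tidied first.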
Some more information on potential problems with parsing malformed HTML:
What's the best approach for parsing XML/'screen scraping' in iOS? UIWebview or NSXMLParser?
The most reliable solution is to use an off-screen WebView, load the HTML source into it and then access its DOM tree.
The best way I have found is NSXMLParser + libtidy. However, many third-party libraries are available now that make parsing easier. (The last answer was written in 2011.)