libxml2 HTML parsing

Question

I'm parsing HTML with libxml2 and using XPath to find elements. Once I've found the element I'm looking for, how can I get its HTML as a string (keeping in mind that the element will have many child elements)? Given a document:

<html>
    <header>
        <title>Some document</title>
    </header>

    <body>
        <p id="faq">
            Some kind of text <a href="http://www.nowhere.com/">here</a>.
        </p>
    </body>
</html>

Say I retrieve the body element with XPath and then get its HTML; I'd like to end up with a string containing:

<body>
    <p id="faq">
        Some kind of text <a href="http://www.nowhere.com/">here</a>.
    </p>
</body>

How can I do this?

Matthew Flaschen · Accepted Answer

That is the purpose of xmlNodeDump.

EDIT:

When you have an xmlNodePtr node, do something like:

xmlBufferPtr nodeBuffer = xmlBufferCreate();
// Serialize node and all of its children into the buffer.
xmlNodeDump(nodeBuffer, doc, node, 0, 1);
// ... Do something with the string from xmlBufferContent(nodeBuffer)
// When done:
xmlBufferFree(nodeBuffer);

The 4th parameter (level) sets the starting indentation level, and the 5th (format), when nonzero, asks for pretty-printed output with added whitespace.

Tags: c, html, html-parsing, libxml2

johndoe
