Get all html between two elements

Question

Problem:
Extract all HTML between two headers, including the headers' own HTML. The header text is known, but the formatting, tag name, etc. are not. They are not within the same parent and will almost certainly have nested children of their own.

To clarify: the headers could be inside an <h1>, a <div>, or any other tag. They may also be wrapped in <b>, <i>, <font>, or additional <div> tags. The key point is that the only text within the element is the header text.

The tools I have available are C# 3.0 with a WebBrowser control, or jQuery/JavaScript.

I've taken the jQuery route, traversing the DOM, but I've run into the problem of handling children and adding them appropriately. Here is the code so far:

function getAllBetween(firstEl, lastEl) {
    var collection = []; // elements found between the two headers
    var fefound = false; // set once the first header has been seen
    $('body').find('*').each(function () {
        var curEl = $(this);
        if (curEl.text() == firstEl)
            fefound = true;
        if (curEl.text() == lastEl)
            return false; // stop iterating at the last header

        // Need something here to handle children of children,
        // otherwise we get <table></table><tbody></tbody><tr></tr> etc.
        if (fefound)
            collection.push(curEl);
    });
    var div = document.createElement("DIV");
    for (var i = 0, len = collection.length; i < len; i++) {
        $(div).append(collection[i]);
    }
    return $(div).html();
}

Should I continue down this road, with some sort of recursive function that checks and handles children, or would a whole new approach be better suited?

For the sake of testing, here is some sample markup:

<body>
<div>
<div>Start</div>
<table><tbody><tr><td>Oops</td></tr></tbody></table>
</div>
<div>
<div>End</div>
</div>
</body>

Any suggestions or thoughts are greatly appreciated!

dutt · Accepted Answer

My thought is a regex, something along the lines of

.*<(?<tag>.+)>Start</\1>(?<found_data>.+)<\1>End</\1>.*

This should get you everything between the Start and End div tags.
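As a rough illustration, the same idea can be sketched in JavaScript and run against an HTML string, such as the result of $('body').html(). The names getHtmlBetween and escapeRegExp are hypothetical and not part of the pattern above. The sketch assumes each header sits directly inside a single tag with no nested wrappers, lets the start and end headers use different tag names, and matches lazily so it stops at the first End header. Regexes over HTML are brittle, so treat this as a starting point rather than a robust solution.

```javascript
// Escape characters that have a special meaning in a regular expression,
// so the header text is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Return the slice of `html` from the element whose only text is `startText`
// through the element whose only text is `endText`, headers included.
// Returns null when either header cannot be found.
// Assumption: each header is wrapped by exactly one tag (no <b>, <i>, ... inside).
function getHtmlBetween(html, startText, endText) {
    var tagName = '[A-Za-z][A-Za-z0-9]*';
    var pattern = new RegExp(
        // Opening tag (attributes allowed), header text, matching closing tag.
        '<(?<tag>' + tagName + ')[^>]*>\\s*' + escapeRegExp(startText) + '\\s*</\\k<tag>>' +
        // Anything in between, including newlines, as little as possible.
        '[\\s\\S]*?' +
        // Same shape for the end header; its tag may differ from the start tag.
        '<(?<endTag>' + tagName + ')[^>]*>\\s*' + escapeRegExp(endText) + '\\s*</\\k<endTag>>'
    );
    var match = pattern.exec(html);
    return match ? match[0] : null;
}
```

With the sample markup from the question, getHtmlBetween(html, 'Start', 'End') returns the text from <div>Start</div> through <div>End</div>. Because this is a raw slice of the string, it also contains the stray </div> and <div> of the two parent elements, so the result is not balanced HTML on its own.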

Tags: javascript, jquery, c#-3.0

WSkid
