I asked this question in a CodePlex discussion, but I hope to get a quicker answer here on Stack Overflow.
I use HTML Agility Pack for HTML parsing in C#. I have the following HTML structure:
<body>
<p class="paragraph">text</p>
<p class="paragraph">text</p>
<p class="specific">text</p>
<p class="paragraph">text</p>
<p class="paragraph">text</p>
</body>
And I need to get all p elements with class "paragraph" that exist after the p element with class "specific".
Is there a way to do that?
Thanks.
A note on .Class: HtmlNode has no such property, so wherever a snippet needs the class, read it with GetAttributeValue( "class", "" ) (or substitute whatever is appropriate).
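Use SkipWhile to skip up to the marker element, then Skip(1) to drop the marker itself.
For example, in LINQPad this prints 5, 6, 7 (the first 6 is the marker, so SkipWhile skips nothing and Skip(1) removes it):
int[] a = { 6, 5, 6, 7 };
a.SkipWhile( x => x != 6 ).Skip(1).Dump();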
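In current HTML Agility Pack releases, SelectNodes returns an HtmlNodeCollection, which implements IList<HtmlNode>, so the first form should work directly; the others only matter if the collection is not generic. Note that "//p" matches p elements anywhere in the document, whereas "/p" only matches a p at the document root. So, depending on the type SelectNodes returns, either:
.SelectNodes( "//p" ).SkipWhile( p => p.GetAttributeValue( "class", "" ) != "specific" ).Skip(1)
or
.SelectNodes( "//p" ).Cast<HtmlNode>().SkipWhile( p => p.GetAttributeValue( "class", "" ) != "specific" ).Skip(1)
(or, the ugly version, if the elements are typed as object)
.SelectNodes( "//p" ).SkipWhile( p => ((HtmlNode)p).GetAttributeValue( "class", "" ) != "specific" ).Skip(1)
(or, in some cases, OfType, which also drops elements of other types; it is unnecessary if your expression is already filtering appropriately)
.SelectNodes( "//p" ).OfType<HtmlNode>().SkipWhile( p => p.GetAttributeValue( "class", "" ) != "specific" ).Skip(1)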
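EDIT: I'd probably create an extension method:
static class HapExtensions
{
    // Skips elements while the predicate holds, then skips one more
    // (the first element for which the predicate is false).
    public static IEnumerable<T> SkipUntilAfter<T>( this IEnumerable<T> sequence, Func<T, bool> predicate )
    {
        return sequence.SkipWhile( predicate ).Skip(1);
    }
}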
Anyone care to search up prior art for this? Any good name suggestions?
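A minimal, self-contained sketch of how such an extension method might be used. It assumes a hypothetical Node class with a Class property standing in for the HTML Agility Pack node type, so it runs without the library; with real nodes the class would be read from the attribute instead. The names Node, SequenceExtensions, Demo and ParagraphsAfterSpecific are illustrative only, and the final Where is an addition that keeps just the "paragraph" elements, as the question asks.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for an HTML node; not part of HTML Agility Pack.
public sealed class Node
{
    public Node(string cssClass, string text)
    {
        Class = cssClass;
        Text = text;
    }

    public string Class { get; }
    public string Text { get; }
}

public static class SequenceExtensions
{
    // Skips elements while the predicate holds, then skips one more
    // (the first element for which the predicate is false).
    public static IEnumerable<T> SkipUntilAfter<T>(
        this IEnumerable<T> sequence, Func<T, bool> predicate)
    {
        return sequence.SkipWhile(predicate).Skip(1);
    }
}

public static class Demo
{
    // Returns the "paragraph" nodes that come after the first "specific" node.
    public static List<Node> ParagraphsAfterSpecific(IEnumerable<Node> nodes)
    {
        return nodes
            .SkipUntilAfter(n => n.Class != "specific")
            .Where(n => n.Class == "paragraph")
            .ToList();
    }

    public static void Main()
    {
        var nodes = new[]
        {
            new Node("paragraph", "one"),
            new Node("paragraph", "two"),
            new Node("specific", "three"),
            new Node("paragraph", "four"),
            new Node("paragraph", "five"),
        };

        // Prints "four" and "five", one per line.
        foreach (var n in ParagraphsAfterSpecific(nodes))
        {
            Console.WriteLine(n.Text);
        }
    }
}
```

If no element has the "specific" class, SkipWhile consumes the whole sequence and the result is simply empty.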
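Try this:
bool latterDayParagraphs = false;
List<HtmlNode> nodes = new List<HtmlNode>();
// "//p" matches p elements anywhere in the document; SelectNodes returns null when nothing matches.
foreach (var pElement in doc.DocumentNode.SelectNodes("//p"))
{
    // HtmlNode has no Class property, so read the class attribute directly.
    string cssClass = pElement.GetAttributeValue("class", "");
    if (cssClass == "specific")
    {
        // Everything after this marker element is wanted.
        latterDayParagraphs = true;
        continue;
    }
    if (latterDayParagraphs && cssClass == "paragraph")
    {
        nodes.Add(pElement);
    }
}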