Example html:
<div>
    <p>p1</p>
    <p>p2</p>
    <p>p3<span id="target">starting from here</span></p>
    <p>p4</p>
</div>
<div>
    <p>p5</p>
    <p>p6</p>
</div>
<p>p7</p>
I want to search for <p> elements, but only those that come after span#target in the document.
It should return p4, p5, p6 and p7 in the above example.
I tried getting all the <p>s first and then filtering, but I don't know how to tell whether an element comes after span#target or not.
You can do this with the find_all_next method in Beautiful Soup.
from bs4 import BeautifulSoup
doc = "..."  # Replace with the HTML to parse, e.g. the example from the question
# Parse the HTML
soup = BeautifulSoup(doc, 'html.parser')
# Select the first element you want to use as the reference
span = soup.select_one("span#target")
# Find all elements after the `span` element that have the tag - p
print(span.find_all_next("p"))
The above snippet will result in
[<p>p4</p>, <p>p5</p>, <p>p6</p>, <p>p7</p>]
Edit: in response to the OP's request below about comparing positions:
If you want to compare the positions of two elements, you'll have to rely on the sourceline and sourcepos attributes, which are provided by the html.parser and html5lib parsers.
First off, store the sourceline and/or sourcepos of your reference element in a variable.
span_srcline = span.sourceline
span_srcpos = span.sourcepos
(You don't actually have to store them; you can read span.sourceline and span.sourcepos directly as long as you keep a reference to the span.)
Now iterate through the result of find_all_next and compare the values:
for tag in span.find_all_next("p"):
    print(f'line diff: {tag.sourceline - span_srcline}, pos diff: {tag.sourcepos - span_srcpos}, tag: {tag}')
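You're most likely interested in line numbers, since sourcepos only gives the position within a line. It matters only when two tags share a line, as p3 and the span do here.
However, sourceline and sourcepos mean slightly different things for each parser: with html.parser they mark the opening "<" of the start tag, while with html5lib they mark its closing ">". Check the Beautiful Soup docs for details.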
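The approach from the question (collect every <p>, then filter by position) can be sketched with these attributes. This is only a minimal sketch, assuming html.parser, where sourceline and sourcepos record where each start tag begins. The helper name is_after is made up for illustration; it compares the (line, offset) pair so that tags sharing a line, like p3 and the span, are still ordered correctly.

```python
from bs4 import BeautifulSoup

html_doc = """
<div>
    <p>p1</p>
    <p>p2</p>
    <p>p3<span id="target">starting from here</span></p>
    <p>p4</p>
</div>
<div>
    <p>p5</p>
    <p>p6</p>
</div>
<p>p7</p>
"""

# html.parser is assumed here because it records sourceline/sourcepos
soup = BeautifulSoup(html_doc, "html.parser")
span = soup.find(id="target")


def is_after(tag, reference):
    """Hypothetical helper: True if tag's start tag begins after reference's."""
    return (tag.sourceline, tag.sourcepos) > (reference.sourceline, reference.sourcepos)


# Get every <p>, then keep only those that start after the span
paragraphs_after = [p for p in soup.find_all("p") if is_after(p, span)]
print(paragraphs_after)
```

For plain "everything after this element" queries, find_all_next is simpler; the comparison is mainly useful when you already hold two arbitrary elements and need to know their order.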
To get only the first <p> after the target, try this:
html_doc = """
<div>
    <p>p1</p>
    <p>p2</p>
    <p>p3<span id="target">starting from here</span></p>
    <p>p4</p>
</div>
<div>
    <p>p5</p>
    <p>p6</p>
</div>
<p>p7</p>
"""
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_doc, 'html.parser')
print(soup.find(id="target").find_next('p').contents[0])
Result
p4
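Note that this returns only the first match, while the question asks for every <p> after the target. A small side-by-side comparison, assuming the same example markup and html.parser, shows the difference between find_next and find_all_next (the variable names are illustrative):

```python
from bs4 import BeautifulSoup

html_doc = """
<div>
    <p>p1</p>
    <p>p2</p>
    <p>p3<span id="target">starting from here</span></p>
    <p>p4</p>
</div>
<div>
    <p>p5</p>
    <p>p6</p>
</div>
<p>p7</p>
"""

soup = BeautifulSoup(html_doc, "html.parser")
target = soup.find(id="target")

# find_next returns the first matching element after the target
first_after = target.find_next("p")

# find_all_next returns every matching element after the target
all_after = target.find_all_next("p")

print(first_after.get_text())
print([p.get_text() for p in all_after])
```

The first print shows p4; the second shows ['p4', 'p5', 'p6', 'p7'].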