I have a list in jsoup like this:
Elements tbody = new Elements();
tbody might look like this (---- separates the elements in the tbody list):
<td>
<div data-emission="56b2140adb6da7bf3cbf6228" class="mainCell">
<a href="/tv/weather-country-12457/"> <span class="left">16:00</span>
<div>
<p>Weather - country</p>
</div> </a>
</div>
<div data-emission="56b2140adb6da7bf3cbf6237" class="mainCell shows pending">
<a href="/shows/that's-70-show-550347/epi1201/"> <span class="left">16:10</span>
<div>
<p>That's 70 show</p>
<span class="info">epi. 1201, Show</span>
</div> <p class="onAir"> <span>Pending</span> <u></u> <u style="width: 5%"></u> </p> </a>
</div> </td>
---------------------------------------------------------------------------
<td>
<div data-emission="56b23876db6da7bf3cbf6588" class="mainCell pending">
<a href="/tv/weather-563806/"> <span class="left">16:10</span>
<div>
<p>Weather</p>
</div> <p class="onAir"> <span>Pending</span> <u></u> <u style="width: 51%"></u> </p> </a>
</div>
<div data-emission="56b23876db6da7bf3cbf6589" class="mainCell">
<a href="/tv/animal-cops-2615/"> <span class="left">16:15</span>
<div>
<p>Animal Cops</p>
<span class="info">epi. 3079, Show</span>
</div> </a>
</div>
<div data-emission="56b23876db6da7bf3cbf658a" class="mainCell shows">
<a href="/show/house-md-1601/odc137/"> <span class="left">16:30</span>
<div>
<p>House MD</p>
<span class="info">epi. 137, Show</span>
</div> </a>
</div> </td>
---------------------------------------------------------------------------
<td>
<div data-emission="56b213b3db6da7bf3cbf61a1" class="mainCell movies pending">
<a href="/movie/star-trek-564170/"> <span class="left">16:00</span>
<div>
<p>Star Trek</p>
<span class="info">Movie</span>
<span class="szh prem">| Premiere</span>
</div> <p class="onAir"> <span>Pending</span> <u></u> <u style="width: 21%"></u> </p> </a>
</div> </td>
My goal is to remove every movie/show that is pending (on air). In this example I would like to get rid of the whole div for each of:
That's 70 show
Weather
Star Trek
For example, I tried:
for (int i = 0; i < tbody.size(); i++) {
    tbody.get(i).select("div").select("p").select(".onAir").remove();
}
It removes only the p element itself, not the whole div. I have tried many approaches without success. I would appreciate any help.
It seems that the pending shows also carry the pending CSS class. If this is true for all cases, you can do it very simply with:
doc.select("td>div.pending").remove();
This will remove all div elements with the pending class from the document doc, provided they are direct children of a td element.
Alternatively, you can stay closer to your approach and filter for the p element with the onAir class and the matching inner text:
doc.select("td>div:has(p.onAir:contains(Pending))").remove();
See the CSS selector syntax to understand the power of Jsoup.
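To make the two selectors above easier to try out, here is a minimal sketch. It assumes jsoup is on the classpath and uses a trimmed-down copy of the markup from the question; the class name RemovePendingDemo, the HTML constant, and the two helper methods are made up for illustration. Note that the sample is wrapped in a table, because the HTML parser discards td tags that appear outside of one.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

class RemovePendingDemo {
    // Hypothetical, shortened sample modeled on the markup in the question.
    static final String HTML =
        "<table><tr><td>"
      + "<div class=\"mainCell\"><a href=\"#\"><div><p>Weather - country</p></div></a></div>"
      + "<div class=\"mainCell shows pending\"><a href=\"#\"><div><p>That's 70 show</p></div>"
      + "<p class=\"onAir\"><span>Pending</span></p></a></div>"
      + "</td></tr></table>";

    // Variant 1: rely on the "pending" CSS class of the outer div.
    static Document removeByClass() {
        Document doc = Jsoup.parse(HTML);
        doc.select("td>div.pending").remove();
        return doc;
    }

    // Variant 2: rely on the nested <p class="onAir"> that contains the text "Pending".
    static Document removeByOnAirText() {
        Document doc = Jsoup.parse(HTML);
        doc.select("td>div:has(p.onAir:contains(Pending))").remove();
        return doc;
    }

    public static void main(String[] args) {
        // Print the titles that are left after each variant.
        System.out.println(removeByClass().select("div.mainCell p").text());
        System.out.println(removeByOnAirText().select("div.mainCell p").text());
    }
}
```

Both variants remove the whole outer div, not just the p inside it, because remove() is called on the elements matched by the selector, and the selector targets the div.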
Try the following code snippet.
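Elements mainCells = tbody.select("div.mainCell");
for (int i = 0; i < mainCells.size(); i++) {
    // Collect every <p> inside the cell's link.
    Elements mainCellsP = mainCells.get(i).select("div").select("a").select("p");
    // In the sample markup, a pending cell has two <p> elements:
    // the title and the extra <p class="onAir">.
    if (mainCellsP.size() == 2) {
        // Remove this node from the DOM tree
        mainCells.get(i).remove();
    }
}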
First select the node you want to delete, then call the remove() method on that node.
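Counting p elements depends on the exact shape of the markup, so a slightly more explicit variant of the same "select the node, then remove it" idea may be easier to maintain. The sketch below assumes that every on-air cell contains a p with the onAir class, as in the sample from the question; the class name RemoveOnAirCells and the methods sample() and removeOnAir() are hypothetical names used only for this illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

class RemoveOnAirCells {
    // Builds a list of td elements similar to the tbody list in the question.
    // The markup is a shortened, hypothetical sample.
    static Elements sample() {
        String html =
            "<table><tr>"
          + "<td><div class=\"mainCell\"><a href=\"#\"><div><p>Weather - country</p></div></a></div>"
          + "<div class=\"mainCell shows pending\"><a href=\"#\"><div><p>That's 70 show</p></div>"
          + "<p class=\"onAir\"><span>Pending</span></p></a></div></td>"
          + "<td><div class=\"mainCell movies pending\"><a href=\"#\"><div><p>Star Trek</p></div>"
          + "<p class=\"onAir\"><span>Pending</span></p></a></div></td>"
          + "</tr></table>";
        Document doc = Jsoup.parse(html);
        return doc.select("td");
    }

    // Removes every div.mainCell that contains a p.onAir marker
    // and returns how many cells were removed.
    static int removeOnAir(Elements tbody) {
        int removed = 0;
        for (Element cell : tbody.select("div.mainCell")) {
            // Assumption: an on-air cell always has a nested <p class="onAir">.
            if (!cell.select("p.onAir").isEmpty()) {
                cell.remove(); // detaches the whole div from the DOM
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        Elements tbody = sample();
        System.out.println("removed: " + removeOnAir(tbody));
        System.out.println("left: " + tbody.select("div.mainCell p").text());
    }
}
```

Calling remove() inside the loop is safe here because it changes the DOM tree, not the list returned by select(), so the iteration itself is not disturbed.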