I have a list in jsoup like this:
Elements tbody = new Elements();
tbody might look like this (---- separates elements in tbody list):
<td> 
 <div data-emission="56b2140adb6da7bf3cbf6228" class="mainCell"> 
  <a href="/tv/weather-country-12457/"> <span class="left">16:00</span> 
   <div> 
    <p>Weather - country</p> 
   </div> </a> 
 </div> 
 <div data-emission="56b2140adb6da7bf3cbf6237" class="mainCell shows pending"> 
  <a href="/shows/that's-70-show-550347/epi1201/"> <span class="left">16:10</span> 
   <div> 
    <p>That's 70 show</p> 
    <span class="info">epi. 1201, Show</span> 
   </div> <p class="onAir"> <span>Pending</span> <u></u> <u style="width: 5%"></u> </p> </a> 
 </div> </td>
 ---------------------------------------------------------------------------
 <td> 
 <div data-emission="56b23876db6da7bf3cbf6588" class="mainCell pending"> 
  <a href="/tv/weather-563806/"> <span class="left">16:10</span> 
   <div> 
    <p>Weather</p> 
   </div> <p class="onAir"> <span>Pending</span> <u></u> <u style="width: 51%"></u> </p> </a> 
 </div> 
 <div data-emission="56b23876db6da7bf3cbf6589" class="mainCell"> 
  <a href="/tv/animal-cops-2615/"> <span class="left">16:15</span> 
   <div> 
    <p>Animal Cops</p> 
    <span class="info">epi. 3079, Show</span> 
   </div> </a> 
 </div> 
 <div data-emission="56b23876db6da7bf3cbf658a" class="mainCell shows"> 
  <a href="/show/house-md-1601/odc137/"> <span class="left">16:30</span> 
   <div> 
    <p>House MD</p> 
    <span class="info">epi. 137, Show</span> 
   </div> </a> 
 </div> </td>
 ---------------------------------------------------------------------------
 <td> 
 <div data-emission="56b213b3db6da7bf3cbf61a1" class="mainCell movies pending"> 
  <a href="/movie/star-trek-564170/"> <span class="left">16:00</span> 
   <div> 
    <p>Star Trek</p> 
    <span class="info">Movie</span> 
    <span class="szh prem">| Premiere</span> 
   </div> <p class="onAir"> <span>Pending</span> <u></u> <u style="width: 21%"></u> </p> </a> 
 </div> </td>
My goal is to remove every movie/show that is pending/onAir. So in this example I would like to get rid of the whole div for each of: That's 70 show, Weather, and Star Trek. For example, I tried:
for(int i = 0; i < tbody.size(); i++){
            tbody.get(i).select("div").select("p").select(".onAir").remove();
        }
It removes only the p element itself, not the whole div. I have tried many approaches without success. I would appreciate any help.
It seems that the pending shows also carry the pending CSS class. If this is true for all cases, you can do it very simply with:
doc.select("td>div.pending").remove();
This will remove all div elements with the pending class from the document doc, provided they are direct children of a td element.
Alternatively, you can use your approach and filter for the p element with the correct onAir class and inner text:
doc.select("td>div:has(p.onAir:contains(Pending))").remove();
See the CSS selector syntax to understand the power of Jsoup.
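As a rough end-to-end check of the :has variant, here is a minimal sketch. The class name RemovePendingDemo, the helper methods, and the trimmed-down sample markup are assumptions made for illustration; only the jsoup calls (Jsoup.parse, select, remove, eachText) come from the library, which is assumed to be on the classpath. The cells are wrapped in a table because the HTML parser discards td tags that appear outside one.

```java
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class RemovePendingDemo {

    // Hypothetical helper: builds one cell shaped like the markup in the question.
    static String cell(String title, boolean pending) {
        return "<div class=\"mainCell" + (pending ? " pending" : "") + "\">"
                + "<a href=\"#\"><span class=\"left\">16:00</span>"
                + "<div><p>" + title + "</p></div>"
                + (pending ? "<p class=\"onAir\"><span>Pending</span></p>" : "")
                + "</a></div>";
    }

    // Hypothetical sample: two td cells, each holding one pending and one normal entry.
    static String sampleHtml() {
        return "<table><tr>"
                + "<td>" + cell("Weather - country", false) + cell("That's 70 show", true) + "</td>"
                + "<td>" + cell("Star Trek", true) + cell("Animal Cops", false) + "</td>"
                + "</tr></table>";
    }

    // Removes every div under a td that contains a p.onAir, then returns the remaining titles.
    static List<String> titlesAfterRemoval(String html) {
        Document doc = Jsoup.parse(html);
        doc.select("td > div:has(p.onAir)").remove();
        return doc.select("div.mainCell a > div > p").eachText();
    }

    public static void main(String[] args) {
        System.out.println(titlesAfterRemoval(sampleHtml()));
    }
}
```

If you already hold the cells in an Elements list such as tbody, the same selector without the "td >" prefix can be applied to that list instead of the document.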
Try the following code snippet.
Elements mainCells = tbody.select("div.mainCell");
for(int i = 0; i < mainCells.size(); i++){
    // All p elements inside the link of this cell
    Elements mainCellsP = mainCells.get(i).select("div").select("a").select("p");
    // In the sample markup a pending cell has two p elements: the title and p.onAir
    if (mainCellsP.size() == 2) {
        // Remove this node from DOM tree
        mainCells.get(i).remove();
    }
}
First select the node you want to delete, and then call the remove() method of that node.
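The same "select the node, then remove it" idea can also start from the p.onAir marker and walk up to the enclosing cell, which avoids counting p elements. This is only a sketch: the class name RemoveByMarker and the method removePendingCells are hypothetical, and it assumes every cell div carries the mainCell class as in the question's markup.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class RemoveByMarker {

    // Removes each div.mainCell that contains a p.onAir; returns how many were removed.
    static int removePendingCells(Document doc) {
        int removed = 0;
        for (Element marker : doc.select("p.onAir")) {
            // Climb from the marker until we reach the enclosing mainCell (or run out of parents).
            Element cell = marker;
            while (cell != null && !cell.hasClass("mainCell")) {
                cell = cell.parent();
            }
            if (cell != null) {
                cell.remove(); // detaches the whole cell, not just the marker
                removed++;
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        // Assumed sample: one pending cell and one normal cell.
        Document doc = Jsoup.parse(
                "<div class=\"mainCell pending\"><a href=\"#\"><div><p>Star Trek</p></div>"
                + "<p class=\"onAir\"><span>Pending</span></p></a></div>"
                + "<div class=\"mainCell\"><a href=\"#\"><div><p>Animal Cops</p></div></a></div>");
        System.out.println(removePendingCells(doc));
        System.out.println(doc.select("div.mainCell").size());
    }
}
```

This is what the loop in the question was missing: calling remove() on the p.onAir selection detaches only those p elements, so the enclosing div has to be located first.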