Beautiful Soup - Class contains 'a' and does not contain 'b'

Using bs4, I need to find elements that match class_=re.compile("viewLicense") but do not have class_="viewLicenseDetails".

Here is the snippet:

<tr class="viewLicense inactive"></tr>
<tr class="viewLicense"></tr>
<tr id="licenseDetails_552738" class="viewLicenseDetails"></tr>

I want the first two tr elements but not the last one.

Could someone please help? Thanks.

Md. Mohsin asked Oct 12 '14 03:10


2 Answers

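The following finds every tr tag that has the class viewLicense. A string passed to class_ is compared against each whole class name, so viewLicenseDetails does not match: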

soup.find_all("tr", class_="viewLicense")

So it works for the markup given in the question:

>>> soup.find_all("tr", class_="viewLicense")
[<tr class="viewLicense inactive"></tr>, <tr class="viewLicense"></tr>]
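
To see why the plain string works where the regex from the question does not, here is a small self-contained sketch. It assumes the question's three rows wrapped in a table (the wrapper and the html.parser choice are my additions). re.compile("viewLicense") is applied as a search, so it also hits viewLicenseDetails, while the string has to equal a class name.

```python
import re
from bs4 import BeautifulSoup

# The question's rows, wrapped in a <table> for this example only.
html = """
<table>
  <tr class="viewLicense inactive"></tr>
  <tr class="viewLicense"></tr>
  <tr id="licenseDetails_552738" class="viewLicenseDetails"></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# Regex: searched inside each class name, so "viewLicenseDetails" matches too.
regex_matches = soup.find_all("tr", class_=re.compile("viewLicense"))

# String: must equal one of the tag's class names.
exact_matches = soup.find_all("tr", class_="viewLicense")

print(len(regex_matches), len(exact_matches))
```

If a regex is really needed, anchoring it as re.compile(r"^viewLicense$") should behave like the plain string here.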

However, if a tr tag can have both the viewLicense and viewLicenseDetails classes, the following finds all tr tags with viewLicense and then skips the ones that also have viewLicenseDetails:

>>> both_tags = soup.find_all("tr", class_="viewLicense")
>>> for tag in both_tags:
...     if 'viewLicenseDetails' not in tag.attrs['class']:
...         print(tag)
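
The same filter as a runnable script, in case it helps. This is only a sketch: the row carrying both classes is hypothetical (it is not in the question's markup), and the table wrapper and html.parser are my assumptions.

```python
from bs4 import BeautifulSoup

# Third row is a made-up case with both classes, added to exercise the filter.
html = """
<table>
  <tr class="viewLicense inactive"></tr>
  <tr class="viewLicense"></tr>
  <tr class="viewLicense viewLicenseDetails"></tr>
  <tr id="licenseDetails_552738" class="viewLicenseDetails"></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# Step 1: every tr that has the class viewLicense (first three rows).
candidates = soup.find_all("tr", class_="viewLicense")

# Step 2: drop the ones that also have viewLicenseDetails.
wanted = [
    tag for tag in candidates
    if "viewLicenseDetails" not in tag.get("class", [])
]

for tag in wanted:
    print(tag)
```

Collecting the matches in a list instead of printing inside the loop makes it easier to reuse them afterwards.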
avi answered Oct 21 '22 07:10


Use CSS selectors?

results = soup.select('tr.viewLicense')
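
If rows can carry both classes, the selector can probably be tightened with :not(). A minimal sketch, assuming a Beautiful Soup version whose select() supports :not() (recent releases do, via the soupsieve package); the row with both classes is hypothetical and only there to show the exclusion.

```python
from bs4 import BeautifulSoup

# Third row is a made-up case with both classes.
html = """
<table>
  <tr class="viewLicense inactive"></tr>
  <tr class="viewLicense"></tr>
  <tr class="viewLicense viewLicenseDetails"></tr>
  <tr id="licenseDetails_552738" class="viewLicenseDetails"></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# tr elements that have viewLicense but do not have viewLicenseDetails.
results = soup.select("tr.viewLicense:not(.viewLicenseDetails)")

for tag in results:
    print(tag)
```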
DivinusVox answered Oct 21 '22 06:10