
Beautiful Soup if Class "Contains" or Regex?

Suppose my class names vary constantly, for example:

listing-col-line-3-11 dpt 41
listing-col-block-1-22 dpt 41
listing-col-line-4-13 CWK 12

Normally I could do:

for EachPart in soup.find_all("div", {"class": "ClassNamesHere"}):
    print(EachPart.get_text())

There are far too many class names to list them all here, so matching on exact names is out.

I know Python doesn't have the ".contains" I would normally use, but it does have "in". I haven't been able to work out how to incorporate that, though.

I'm hoping there's a way to do this with regex, though again my Python syntax is letting me down. I've been trying variations on:

regex = re.compile('.*listing-col-.*')
for EachPart in soup.find_all(regex):

But that doesn't seem to be doing the trick.

PoweredByCoffee asked Jan 07 '16 16:01



1 Answer

BeautifulSoup supports CSS selectors, which let you select elements based on the content of particular attributes. This includes the attribute selector *=, which matches when the attribute value contains a given substring.

The following will return all div elements with a class attribute containing the text 'listing-col-':

for EachPart in soup.select('div[class*="listing-col-"]'):
    print(EachPart.get_text())
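
For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample HTML is made up for illustration (it only reuses the class names from the question), and it assumes the bs4 package is installed. It also shows a regex alternative: passing a compiled pattern as class_ to find_all, which matches against the class attribute. The attempt in the question fails because a pattern passed as the first argument is matched against tag names instead.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical sample markup, reusing the class names from the question.
html = """
<div class="listing-col-line-3-11 dpt 41">first</div>
<div class="listing-col-block-1-22 dpt 41">second</div>
<div class="listing-col-line-4-13 CWK 12">third</div>
<div class="sidebar">ignored</div>
"""

soup = BeautifulSoup(html, "html.parser")

# CSS attribute selector: class attribute contains "listing-col-".
css_matches = [div.get_text() for div in soup.select('div[class*="listing-col-"]')]

# Regex alternative: the pattern is tested against the class attribute,
# not the tag name, because it is passed as class_.
pattern = re.compile(r"^listing-col-")
regex_matches = [div.get_text() for div in soup.find_all("div", class_=pattern)]

print(css_matches)
print(regex_matches)
```

Both approaches should select the same three divs here. The CSS selector is shorter, while the regex form may be handier when the pattern is more involved than a plain substring.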
mfitzp answered Sep 21 '22 15:09