How to use CSS selectors to retrieve specific links lying in some class using BeautifulSoup?

I am new to Python and am learning it for web scraping. I am using BeautifulSoup to collect links (i.e. the href of 'a' tags). I am trying to collect the links under the "UPCOMING EVENTS" tab of the site http://allevents.in/lahore/. I used Firebug to inspect the element and get its CSS path, but the code below returns nothing. I am looking for a fix, and also for suggestions on how to choose proper CSS selectors to retrieve the desired links from any site. I wrote this piece of code:

    from bs4 import BeautifulSoup
    import requests

    url = "http://allevents.in/lahore/"
    r = requests.get(url)
    data = r.text
    soup = BeautifulSoup(data)

    for link in soup.select('html body div.non-overlay.gray-trans-back div.container div.row div.span8 div#eh-1748056798.events-horizontal div.eh-container.row ul.eh-slider li.h-item div.h-meta div.title a[href]'):
        print link.get('href')

asked Jul 17 '14 10:07

Flecha

1 Answer

The page is not the most friendly in the use of classes and markup, but even so your CSS selector is too specific to be useful here.

If you want the Upcoming Events, you only need the first <div class="events-horizontal">. Within it, grab the <div class="title"><a href="..."></a></div> elements, i.e. the links on the event titles:

    upcoming_events_div = soup.select_one('div.events-horizontal')
    for link in upcoming_events_div.select('div.title a[href]'):
        print(link['href'])
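To try the idea without hitting the network, here is a minimal sketch run against inline HTML. The markup, the ids `eh-1` and `eh-2`, and the `/e/...` URLs are made up for illustration and only loosely follow the structure described above; the real page may differ. It shows that a long selector matches nothing as soon as a single component (here the id) fails to match, while scoping to the first `div.events-horizontal` and then selecting title links works:

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; not copied from the real page.
html = """
<div class="container">
  <div id="eh-1" class="events-horizontal">
    <ul class="eh-slider">
      <li class="h-item"><div class="title"><a href="/e/upcoming-1">Upcoming 1</a></div></li>
      <li class="h-item"><div class="title"><a href="/e/upcoming-2">Upcoming 2</a></div></li>
    </ul>
  </div>
  <div id="eh-2" class="events-horizontal">
    <ul class="eh-slider">
      <li class="h-item"><div class="title"><a href="/e/other-1">Other 1</a></div></li>
    </ul>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# Over-specific: no element in this markup carries this id,
# so the whole selector matches nothing.
brittle = soup.select("div#eh-1748056798.events-horizontal div.title a[href]")

# select_one() returns the first match in document order, or None.
first_block = soup.select_one("div.events-horizontal")
links = []
if first_block is not None:
    # Search only inside the first block, so the second block is ignored.
    links = [a["href"] for a in first_block.select("div.title a[href]")]

print(brittle)
print(links)
```

The `None` check is worth keeping in real code: if the site changes its class names, `select_one()` returns `None` and calling `.select()` on it would raise an `AttributeError`.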

Note that you should not use r.text; pass r.content (the raw bytes) instead and leave decoding to Unicode to BeautifulSoup. See "Encoding issue of a character in utf-8".
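As a rough illustration of handing bytes to BeautifulSoup, the sketch below uses a hard-coded byte string as a stand-in for `r.content`. The markup and the `/café` URL are invented for the example:

```python
from bs4 import BeautifulSoup

# Stand-in for r.content: undecoded bytes, as a server would send them.
# The document declares its own encoding in a <meta> tag.
raw = (
    '<html><head><meta charset="utf-8"></head>'
    '<body><a href="/café">Café</a></body></html>'
).encode("utf-8")

# BeautifulSoup accepts bytes and works out the encoding itself,
# taking the declared charset into account.
soup = BeautifulSoup(raw, "html.parser")

link = soup.select_one("a[href]")
print(link["href"], link.get_text())
```

With requests, the equivalent call would be `BeautifulSoup(r.content, "html.parser")`.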

answered Sep 19 '22 13:09

Martijn Pieters

Tags: python, css, css-selectors, beautifulsoup, firebug