
Python - Getting all links from a div having a class

Tags: python

I am using BeautifulSoup to get all the links to mobile phones from this URL: http://www.gsmarena.com/samsung-phones-f-9-0-p2.php

My code is:

import urllib2
from BeautifulSoup import BeautifulSoup

url = "http://www.gsmarena.com/samsung-phones-f-9-0-p2.php"
text = urllib2.urlopen(url).read();
soup = BeautifulSoup(text);

data = soup.findAll('div',attrs={'class':'makers'});
for i in data:
    print "http://www.gsmarena.com/" + i.ul.li.a['href'];

But the returned list of URLs is shorter than expected. When I checked, this code outputs 3 values, but the result should contain well over 10.

Akshay Patil asked Dec 23 '11 14:12


1 Answer

There are only three <div> elements on that page with a class of 'makers'. Your code prints only the first link from each div (i.ul.li.a returns the first match at each step), so you get three links in all.

This is likely closer to what you want:

import urllib2
from BeautifulSoup import BeautifulSoup

url = "http://www.gsmarena.com/samsung-phones-f-9-0-p2.php"
text = urllib2.urlopen(url).read()
soup = BeautifulSoup(text)

# Find every <div class="makers"> on the page
data = soup.findAll('div', attrs={'class': 'makers'})
for div in data:
    # Collect all <a> tags inside this div, not just the first one
    links = div.findAll('a')
    for a in links:
        print "http://www.gsmarena.com/" + a['href']
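
The code above is Python 2 with BeautifulSoup 3 (urllib2 and the print statement do not exist in Python 3). If you want to see the "first link per div" versus "every link per div" difference without hitting the network, here is a rough sketch that uses only the Python 3 standard library. Note that this swaps BeautifulSoup for html.parser, so it is not the same API. SAMPLE_HTML is a made-up stand-in for the real page, and MakersLinkCollector and collect_makers_links are hypothetical names used only for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "http://www.gsmarena.com/"

# Hypothetical stand-in for the real page (assumed structure, not copied from the site)
SAMPLE_HTML = """
<div class="makers">
  <ul>
    <li><a href="phone_a-1.php">Phone A</a></li>
    <li><a href="phone_b-2.php">Phone B</a></li>
  </ul>
</div>
<div class="other"><a href="ignored.php">Ignored</a></div>
<div class="makers">
  <ul>
    <li><a href="phone_c-3.php">Phone C</a></li>
  </ul>
</div>
"""


class MakersLinkCollector(HTMLParser):
    """Collect the href of every <a> nested inside <div class="makers">."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links_per_div = []  # one list of absolute URLs per matching div
        self._div_stack = []     # True for each open div that has class "makers"

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            classes = (attrs.get("class") or "").split()
            is_makers = "makers" in classes
            # Start a new bucket only for an outermost "makers" div
            if is_makers and not any(self._div_stack):
                self.links_per_div.append([])
            self._div_stack.append(is_makers)
        elif tag == "a" and any(self._div_stack) and attrs.get("href"):
            # urljoin builds the absolute URL instead of string concatenation
            self.links_per_div[-1].append(urljoin(self.base_url, attrs["href"]))

    def handle_endtag(self, tag):
        if tag == "div" and self._div_stack:
            self._div_stack.pop()


def collect_makers_links(html, base_url=BASE_URL):
    """Return a list of link lists, one per <div class="makers">."""
    parser = MakersLinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links_per_div


links_per_div = collect_makers_links(SAMPLE_HTML)

# What the question's code effectively does: the first link of each div
first_only = [links[0] for links in links_per_div if links]

# What the answer's code does: every link of each div
all_links = [url for links in links_per_div for url in links]

print(len(first_only), "links when taking only the first per div")
print(len(all_links), "links when taking every <a> per div")
for url in all_links:
    print(url)
```

To run it against the live page you would fetch the HTML yourself (for example with urllib.request.urlopen) and pass the decoded text to collect_makers_links. Whether the page still uses the 'makers' class is an assumption carried over from the question.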
Simon Whitaker answered Oct 23 '22 12:10