I am using BeautifulSoup to get the links to all mobile phones from this URL: http://www.gsmarena.com/samsung-phones-f-9-0-p2.php
My code is as follows:
import urllib2
from BeautifulSoup import BeautifulSoup

url = "http://www.gsmarena.com/samsung-phones-f-9-0-p2.php"
text = urllib2.urlopen(url).read()
soup = BeautifulSoup(text)
data = soup.findAll('div', attrs={'class': 'makers'})
for i in data:
    print "http://www.gsmarena.com/" + i.ul.li.a['href']
However, the returned list of URLs is shorter than expected: this code outputs 3 values, but the result should contain well over 10.
There are only three <div> elements on that page with a class of 'makers', and i.ul.li.a selects only the first link inside each div, so the code prints three links in all.
This is likely closer to what you desire:
import urllib2
from BeautifulSoup import BeautifulSoup

url = "http://www.gsmarena.com/samsung-phones-f-9-0-p2.php"
text = urllib2.urlopen(url).read()
soup = BeautifulSoup(text)
data = soup.findAll('div', attrs={'class': 'makers'})
for div in data:
    # Collect every <a> inside this div, not just the first one
    links = div.findAll('a')
    for a in links:
        print "http://www.gsmarena.com/" + a['href']
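
For readers on Python 3, where urllib2 is not available, the same idea (collect every link inside each 'makers' div instead of only the first) can be sketched with the standard library alone. Note that this swaps BeautifulSoup for html.parser, and that SAMPLE_HTML, MakersLinkParser and extract_links are hypothetical names made up for this illustration; the sample markup only stands in for the real page, whose structure may differ.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "http://www.gsmarena.com/"

# Hypothetical markup standing in for the real page (assumption, not the actual HTML).
SAMPLE_HTML = """
<div class="makers"><ul>
  <li><a href="phone_a-1.php">A</a></li>
  <li><a href="phone_b-2.php">B</a></li>
</ul></div>
<div class="other"><a href="ignored.php">x</a></div>
<div class="makers"><ul>
  <li><a href="phone_c-3.php">C</a></li>
</ul></div>
"""


class MakersLinkParser(HTMLParser):
    """Collects the href of every <a> nested inside a <div class="makers">."""

    def __init__(self):
        super().__init__()
        self.depth = 0   # <div> nesting depth inside a 'makers' div; 0 means outside
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.depth > 0:
                # A nested div inside a 'makers' div: track it so we know when we leave
                self.depth += 1
            elif "makers" in (attrs.get("class") or "").split():
                self.depth = 1
        elif tag == "a" and self.depth > 0 and attrs.get("href"):
            # urljoin resolves the relative href against the site root
            self.links.append(urljoin(BASE_URL, attrs["href"]))

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1


def extract_links(html):
    parser = MakersLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    for link in extract_links(SAMPLE_HTML):
        print(link)
```

To run this against the live page, the HTML could be fetched with urllib.request.urlopen(url).read().decode() and passed to extract_links; the right text encoding for that page is an assumption you would need to check.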