I am trying to read some data from a web page using a Python module.
I manage to read the page, but I am having some difficulty parsing the data and extracting the required information.
My code is below. Any help is appreciated.
#!/usr/bin/python2.7 -tt
import urllib
import urllib2

def Connect2Web():
    aResp = urllib2.urlopen("https://uniservices1.uobgroup.com/secure/online_rates/gold_and_silver_prices.jsp")
    web_pg = aResp.read()
    print web_pg

# Define a main() function that fetches and prints the page.
def main():
    Connect2Web()

# This is the standard boilerplate that calls the main function.
if __name__ == '__main__':
    main()
When I print web_pg, I get the whole web page.
I want to extract specific information from it (e.g. find "SILVER PASSBOOK ACCOUNT"
and get its rate), but I am having difficulty parsing this HTML document.
Using regular expressions to match XML/HTML is not recommended, although it can sometimes work. It's better to use an HTML parser and a DOM API. Here's an example:
import html5lib
import urllib2
aResp = urllib2.urlopen("https://uniservices1.uobgroup.com/secure/online_rates/gold_and_silver_prices.jsp")
t = aResp.read()
dom = html5lib.parse(t, treebuilder="dom")
trlist = dom.getElementsByTagName("tr")
print trlist[-3].childNodes[1].firstChild.childNodes[0].nodeValue
You could iterate over trlist to find the data you are interested in.
Added from comment: html5lib is a third-party module. See the html5lib site. The easy_install or pip program should be able to install it.
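As a rough sketch of what iterating over the rows could look like: the snippet below uses the standard library's xml.dom.minidom on a small, well-formed stand-in table instead of html5lib and the live page, so it runs without network access or third-party modules. The sample markup, the rate values, and the helper names cell_text and find_row are all made up for illustration; the real page's structure may differ. A DOM built with html5lib's "dom" treebuilder should support the same getElementsByTagName calls.

```python
from __future__ import print_function
from xml.dom import minidom

# Hypothetical stand-in for the rates table; values are invented for the example.
SAMPLE = """<table>
<tr><td><b>GOLD SAVINGS ACCOUNT</b></td><td>SGD</td><td>1 GM</td><td>50.00</td><td>49.00</td></tr>
<tr><td><b>SILVER PASSBOOK ACCOUNT</b></td><td>SGD</td><td>1 OZ</td><td>30.00</td><td>29.00</td></tr>
</table>"""


def cell_text(node):
    """Concatenate all text nodes found below a DOM node."""
    if node.nodeType == node.TEXT_NODE:
        return node.nodeValue
    return "".join(cell_text(child) for child in node.childNodes)


def find_row(dom, label):
    """Return the cell texts of the first <tr> whose first cell equals label, else None."""
    for tr in dom.getElementsByTagName("tr"):
        cells = [cell_text(td).strip() for td in tr.getElementsByTagName("td")]
        if cells and cells[0] == label:
            return cells
    return None


dom = minidom.parseString(SAMPLE)
row = find_row(dom, "SILVER PASSBOOK ACCOUNT")
print(row)
```

Matching on the label text rather than a fixed index such as trlist[-3] keeps the lookup working if rows are added or reordered.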
It's possible to use regexps to get the required data:
import urllib2
import re

def Connect2Web():
    aResp = urllib2.urlopen("https://uniservices1.uobgroup.com/secure/online_rates/gold_and_silver_prices.jsp")
    web_pg = aResp.read()
    # One literal label cell followed by four captured data cells.
    pattern = "<td><b>SILVER PASSBOOK ACCOUNT</b></td>" + "<td>(.*)</td>" * 4
    m = re.search(pattern, web_pg)
    if m:
        print "SILVER PASSBOOK ACCOUNT:"
        print "\tCurrency:", m.group(1)
        print "\tUnit:", m.group(2)
        print "\tBank Sells:", m.group(3)
        print "\tBank Buys:", m.group(4)
    else:
        print "Nothing found"

Connect2Web()
Don't forget to re.compile the pattern if you are doing your matches in a loop.
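A minimal sketch of compiling once and matching in a loop, assuming each table row sits on one line as in the pattern above. The sample lines, the rate values, and the names ROW and parse_rates are invented for the example. It also swaps the greedy (.*) for [^<]*, which stops at the next tag and so cannot run past a cell boundary when several cells share a line.

```python
from __future__ import print_function
import re

# Made-up rows standing in for the downloaded page, one table row per line.
SAMPLE_LINES = [
    "<tr><td><b>GOLD SAVINGS ACCOUNT</b></td><td>SGD</td><td>1 GM</td><td>50.00</td><td>49.00</td></tr>",
    "<tr><th>DESCRIPTION</th></tr>",
    "<tr><td><b>SILVER PASSBOOK ACCOUNT</b></td><td>SGD</td><td>1 OZ</td><td>30.00</td><td>29.00</td></tr>",
]

# Compiled once, outside the loop: one captured label cell, then four data cells.
ROW = re.compile(r"<td><b>([^<]+)</b></td>" + r"<td>([^<]*)</td>" * 4)


def parse_rates(lines):
    """Map each account name to its currency, unit, and sell/buy rates."""
    rates = {}
    for line in lines:
        m = ROW.search(line)  # reuses the compiled pattern on every iteration
        if m:
            name, currency, unit, sells, buys = m.groups()
            rates[name] = {"currency": currency, "unit": unit,
                           "sells": sells, "buys": buys}
    return rates


rates = parse_rates(SAMPLE_LINES)
print(rates["SILVER PASSBOOK ACCOUNT"]["sells"])
```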