How to extract certain parts of a web page in Python

Question

Target web page: http://www.immi.gov.au/skilled/general-skilled-migration/estimated-allocation-times.htm

The section I want to extract:

  <tr>
  <td>Skilled &ndash; Independent (Residence) subclass 885<br />online</td>
  <td>N/A</td>
  <td>N/A</td>
  <td>N/A</td>
  <td>15 May 2011</td>
  <td>N/A</td>
  </tr>

Once the code finds this row by searching for the keyword "subclass 885 online", it should print the date in the fifth <td> cell, which is "15 May 2011" in the example above.

This is just a small monitor for my own use, to keep an eye on the progress of my immigration application.

Johnsyweb · Accepted Answer

"Beau--ootiful Soo--oop!

Beau--ootiful Soo--oop!

Soo--oop of the e--e--evening,

Beautiful, beauti--FUL SOUP!"

--Lewis Carroll, Alice's Adventures in Wonderland

I think this is exactly what he had in mind!

The Mock Turtle would probably do something like this (Python 2 with BeautifulSoup 3):

>>> from BeautifulSoup import BeautifulSoup
>>> import urllib2
>>> url = 'http://www.immi.gov.au/skilled/general-skilled-migration/estimated-allocation-times.htm'
>>> page = urllib2.urlopen(url)
>>> soup = BeautifulSoup(page)
>>> for row in soup.html.body.findAll('tr'):
...     data = row.findAll('td')
...     if data and 'subclass 885online' in data[0].text:
...         print data[4].text
... 
15 May 2011

But I'm not sure it would help, since that date has already passed!
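
For anyone on Python 3 without BeautifulSoup installed, here is a minimal sketch of the same row-then-cell lookup using only the standard library's html.parser. It is not the code above: RowCollector, find_date and SAMPLE_HTML are names made up for illustration, and it parses the snippet from the question rather than fetching the live page, whose layout may have changed. The keyword has no space between "885" and "online" because the <br /> tag contributes no text when the pieces of the cell are joined.

```python
from html.parser import HTMLParser

# The row from the question, wrapped in a <table> (the wrapper is an assumption).
SAMPLE_HTML = """
<table>
<tr>
<td>Skilled &ndash; Independent (Residence) subclass 885<br />online</td>
<td>N/A</td>
<td>N/A</td>
<td>N/A</td>
<td>15 May 2011</td>
<td>N/A</td>
</tr>
</table>
"""


class RowCollector(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being parsed, or None
        self._cell = None  # text pieces of the cell being parsed, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            # Join the pieces with no separator, so "885<br />online"
            # becomes "885online".
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None

    def handle_data(self, data):
        # Only keep text that sits inside a <td>.
        if self._cell is not None:
            self._cell.append(data)


def find_date(html, keyword="subclass 885online", column=4):
    """Return the text of cell `column` in the first row whose first
    cell contains `keyword`, or None if no row matches."""
    parser = RowCollector()
    parser.feed(html)
    parser.close()
    for cells in parser.rows:
        if len(cells) > column and keyword in cells[0]:
            return cells[column]
    return None


print(find_date(SAMPLE_HTML))  # prints: 15 May 2011
```

To point it at the real page, the HTML could be fetched with urllib.request.urlopen, decoded to text and passed to find_date, assuming the page is still served and still uses this table layout.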

Good luck with the application!
