 

Python fetching <title>

Tags: python, urllib2

I want to fetch the title of a webpage that I open using urllib2. What is the best way to parse the HTML and find what I need? For now that is only the <title> tag, but I might need more in the future.

Is there a good parsing library for this purpose?

xintron asked Nov 02 '09 09:11



2 Answers

Yes, I would recommend BeautifulSoup.

If you only need the title, it's simply:

soup = BeautifulSoup(html)
myTitle = soup.html.head.title   # the <title> tag itself

or

myTitle = soup('title')          # a list of every <title> tag found

Taken from the documentation.

It's very tolerant of malformed markup and will generally parse the HTML however messy it is.

RobbR answered Nov 15 '22 04:11


Try Beautiful Soup:

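import urllib2
# Beautiful Soup 4; with the older version 3 use: from BeautifulSoup import BeautifulSoup
from bs4 import BeautifulSoup

url = 'http://www.example.com'
response = urllib2.urlopen(url)
html = response.read()

soup = BeautifulSoup(html)
title = soup.html.head.title
# .contents is a list of the tag's children; title.string gives just the text
print title.contents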
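
For readers on Python 3, where urllib2 was split into urllib.request and urllib.error, the same fetch-and-parse idea can be sketched with only the standard library. This is a minimal sketch and not part of the original answers: TitleParser, extract_title and fetch_title are hypothetical names chosen for illustration, and falling back to UTF-8 when the server declares no charset is an assumption. It swaps Beautiful Soup for html.parser, which is less forgiving of broken markup.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collects the text of the first closed <title> element."""

    def __init__(self):
        super().__init__()
        self.title = None        # stays None until a </title> is seen
        self._in_title = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lower-cased; ignore later titles (e.g. inside SVG).
        if tag == "title" and self.title is None:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.title = "".join(self._parts).strip()

    def handle_data(self, data):
        # Text may arrive in several chunks, so collect and join later.
        if self._in_title:
            self._parts.append(data)


def extract_title(html):
    """Return the page title from an HTML string, or None if there is none."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title


def fetch_title(url, timeout=10):
    """Download url and return its title (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        # Assumption: fall back to UTF-8 when no charset is declared.
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_title(html)
```

Calling fetch_title('http://www.example.com') would download the page and return its title text, while extract_title can be used directly on HTML that is already in hand. If more than the title is needed later, a full parser such as Beautiful Soup is likely the better fit.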
Dominic Rodger answered Nov 15 '22 04:11