How to get page title in requests

What would be the simplest way to get the title of a page in Requests?

r = requests.get('http://www.imdb.com/title/tt0108778/')
# ? r.title
Friends (TV Series 1994–2004) - IMDb

asked Nov 08 '14 00:11

David542

1 Answer

You need an HTML parser to parse the HTML response and get the title tag's text:

Example using lxml.html:

>>> import requests
>>> from lxml.html import fromstring
>>> r = requests.get('http://www.imdb.com/title/tt0108778/')
>>> tree = fromstring(r.content)
>>> tree.findtext('.//title')
u'Friends (TV Series 1994\u20132004) - IMDb'
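
If installing lxml is not an option, the same idea can be sketched with only the standard library's html.parser module. This is a minimal sketch, not part of Requests or lxml: the names TitleParser and extract_title are hypothetical helpers made up for illustration, and the sample HTML string stands in for a real response body.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Hypothetical helper: collects the text of the first <title> element."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._done = False
        self._parts = []

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lowercased; only the first <title> is tracked.
        if tag == "title" and not self._done:
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self._done = True

    def handle_data(self, data):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._parts.append(data)

    @property
    def title(self):
        # None when the document has no (non-empty) title.
        return "".join(self._parts).strip() or None


def extract_title(html_text):
    """Return the page title from an HTML string, or None if absent."""
    parser = TitleParser()
    parser.feed(html_text)
    parser.close()
    return parser.title


# Stand-in for a downloaded page (assumed markup, not the real IMDb HTML).
sample = (
    "<html><head><title>Friends (TV Series 1994\u20132004) - IMDb</title>"
    "</head><body></body></html>"
)
print(extract_title(sample))
```

With Requests, the decoded body is available as r.text, so the call would look like extract_title(r.text). A dedicated parser such as lxml is likely to cope better with badly broken markup.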

There are other options, for example the mechanize library:

>>> import mechanize
>>> br = mechanize.Browser()
>>> br.open('http://www.imdb.com/title/tt0108778/')
>>> br.title()
'Friends (TV Series 1994\xe2\x80\x932004) - IMDb'
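
The lxml and mechanize results look different, but they appear to differ only in representation: the first is a unicode string, the second a byte string. Assuming the bytes are UTF-8 (which the \xe2\x80\x93 sequence suggests), decoding them gives the same text. A small check, with both literals copied from the outputs above:

```python
# Byte string as shown in the mechanize example.
raw_title = b'Friends (TV Series 1994\xe2\x80\x932004) - IMDb'

# Unicode string as shown in the lxml example.
unicode_title = u'Friends (TV Series 1994\u20132004) - IMDb'

# Assumption: the bytes are UTF-8 encoded. \u2013 is the en dash.
decoded = raw_title.decode('utf-8')
print(decoded == unicode_title)  # → True
```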

Which option to choose depends on what you are going to do next: parse the page to get more data, or interact with it (click buttons, submit forms, follow links, etc.).

Alternatively, you may want to use an API provided by IMDb instead of parsing HTML; see:

Does IMDB provide an API?
IMDbPY

Example usage of the IMDbPY package:

>>> from imdb import IMDb
>>> ia = IMDb()
>>> movie = ia.get_movie('0108778')
>>> movie['title']
u'Friends'
>>> movie['series years']
u'1994-2004'

answered Oct 07 '22 06:10

alecxe
