JSoup UserAgent, how to set it right?

Tags:

jsoup

I'm trying to parse the front page of Facebook with JSoup, but I always get the HTML for mobile devices rather than the version served to desktop browsers (in my case, Firefox 5.0).

I'm setting my User Agent like this:

doc = Jsoup.connect(url)
      .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20100101 Firefox/5.0")
      .get();

Am I doing something wrong?

EDIT:

I just parsed http://whatsmyuseragent.com/ and it looks like the user agent is being sent correctly. Now it's even more confusing to me why http://www.facebook.com/ returns a different version to JSoup than to my browser, since both use the same user agent.

I've now noticed this behaviour on some other sites too. If you could explain what the issue is, I would be more than happy.

asked Jul 05 '11 11:07

Markus

2 Answers

You might try setting the referrer header as well:

doc = Jsoup.connect("https://www.facebook.com/")
      .userAgent("Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
      .referrer("http://www.google.com")
      .get();

answered Sep 23 '22 13:09

Denaitre Roux

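import org.jsoup.Connection.Response;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Fetch the page with browser-like request settings.
Response response = Jsoup.connect(location)
        .ignoreContentType(true)
        .userAgent("Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:25.0) Gecko/20100101 Firefox/25.0")
        .referrer("http://www.google.com")
        .timeout(12000)          // milliseconds
        .followRedirects(true)
        .execute();

// Parse the response body into a Document.
Document doc = response.parse();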

User Agent

Use a recent browser user agent string. A list is available at http://www.useragentstring.com/pages/useragentstring.php.

Timeout

Also don't forget to set a timeout (in milliseconds, so 12000 above is 12 seconds), since a page can sometimes take longer to download than the default timeout allows.

Referrer

Set the referrer to Google.

Follow redirects

Follow redirects so that you end up at the final page.

execute() instead of get()

Use execute() to get the Response object, which lets you check the content type and status code in case of an error.

Later you can parse the Response object to obtain the Document.
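The check described above could be sketched roughly as follows. This is only an illustration, not part of jsoup: the class name ResponseCheck and the method isParseableHtml are hypothetical, and the rule it applies (2xx status plus an HTML-like content type) is an assumption about what "safe to parse" should mean for your use case. It is written against plain values so it can be fed response.statusCode() and response.contentType() from the Response object.

```java
import java.util.Locale;

// Hypothetical helper, not part of jsoup: decides whether a fetched
// response looks like an HTML page that is worth parsing.
final class ResponseCheck {
    private ResponseCheck() {}

    // Returns true when the status code is 2xx and the content type
    // starts with an HTML or XHTML media type (parameters such as
    // "; charset=UTF-8" are allowed after it).
    static boolean isParseableHtml(int statusCode, String contentType) {
        if (statusCode < 200 || statusCode >= 300) {
            return false;
        }
        if (contentType == null) {
            return false;
        }
        String type = contentType.trim().toLowerCase(Locale.ROOT);
        return type.startsWith("text/html")
                || type.startsWith("application/xhtml+xml");
    }
}

// Possible use with the Response from the answer above:
//
//   if (ResponseCheck.isParseableHtml(response.statusCode(), response.contentType())) {
//       Document doc = response.parse();
//   }
```

Note that execute() throws an exception for 4xx/5xx responses unless ignoreHttpErrors(true) is set on the connection, so a status check like this is mainly useful when that option is enabled.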

answered Sep 23 '22 13:09

Sorter
