
Jsoup 1.7.1 bug?

Tags: java, jsoup

Execution of the following code:

Jsoup.connect(baseURL + dataJSSrc).execute();

throws an Exception:

org.jsoup.UnsupportedMimeTypeException: Unhandled content type. Must be text/*, application/xml, or application/xhtml+xml. Mimetype=application/x-javascript, URL=http://www.abc.com/playdata/206/8910.js?44613.77

However, opening the same URL with a plain URLConnection works:

URLConnection conn = new URL(baseURL + dataJSSrc).openConnection();

and the connection reports the same content type:

System.out.println(conn.getContentType()); // prints "application/x-javascript"

Can Jsoup only be used to download HTML or XML?

user1707683 asked Mar 17 '26 09:03



1 Answer

Whilst I don't disagree with BalusC's answer, you can use Jsoup to download anything you like. By default, Jsoup throws an exception if it retrieves content with a MIME type that it cannot parse as HTML, to avoid parsing e.g. images. However, you can disable that check with connection.ignoreContentType(true) if you just want the response as a string or as raw bytes:

String script = Jsoup.connect(jsUrl).ignoreContentType(true).execute().body();

or

byte[] bytes = Jsoup.connect(imageUrl).ignoreContentType(true).execute().bodyAsBytes();

You will get more control with a full-fledged HTTP client, but this method can be useful in a pinch.
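
For comparison, the plain URLConnection route from the question can be sketched with only the JDK, since URLConnection does not check the content type at all. This is a minimal sketch, not a full HTTP client: the class name RawDownload and the helpers downloadBytes and downloadText are hypothetical names chosen for illustration, the 10-second timeouts are arbitrary, and the text helper assumes the script is UTF-8 encoded (the Content-Type charset is not inspected). It needs Java 9 or later for readAllBytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class RawDownload {

    // Hypothetical helper: returns the raw response body, whatever its content type.
    public static byte[] downloadBytes(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        conn.setConnectTimeout(10_000); // arbitrary timeouts, in milliseconds
        conn.setReadTimeout(10_000);
        try (InputStream in = conn.getInputStream()) {
            return in.readAllBytes(); // Java 9+
        }
    }

    // Hypothetical helper: decodes the body as text.
    // Assumption: the content is UTF-8; the declared charset is not checked.
    public static String downloadText(URL url) throws IOException {
        return new String(downloadBytes(url), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java RawDownload <url>");
            return;
        }
        URL url = URI.create(args[0]).toURL();
        System.out.println(downloadText(url));
    }
}
```

Compared with the Jsoup one-liners above, this gives up Jsoup's conveniences (cookies, request headers, charset detection), so it is mainly useful when Jsoup is not otherwise needed.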

Jonathan Hedley answered Mar 18 '26 23:03



