I would like to download a large PDF file with jsoup. I have tried changing timeout and maxBodySize, but the largest file I could download was about 11 MB. Is there a way to do something like buffering? Below is my code.
public class Download extends Activity {
    static public String nextPage;
    static public Response file;
    static public Connection.Response res;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        // TODO Auto-generated method stub
        super.onCreate(savedInstanceState);
        Bundle b = new Bundle();
        b = getIntent().getExtras();
        nextPage = b.getString("key");
        new Login().execute();
        finish();
    }

    private class Login extends AsyncTask<Void, Void, Void> {
        @Override
        protected void onPreExecute() {
            super.onPreExecute();
        }

        @Override
        protected Void doInBackground(Void... params) {
            try {
                res = Jsoup.connect("http://www.eclass.teikal.gr/eclass2/")
                        .ignoreContentType(true).userAgent("Mozilla/5.0")
                        .execute();
                SharedPreferences pref = getSharedPreferences(
                        MainActivity.PREFS_NAME, MODE_PRIVATE);
                String username1 = pref.getString(MainActivity.PREF_USERNAME,
                        null);
                String password1 = pref.getString(MainActivity.PREF_PASSWORD,
                        null);
                file = (Response) Jsoup
                        .connect("http://www.eclass.teikal.gr/eclass2/")
                        .ignoreContentType(true).userAgent("Mozilla/5.0")
                        .maxBodySize(1024*1024*10*2)
                        .timeout(70000*10)
                        .cookies(res.cookies()).data("uname", username1)
                        .data("pass", password1).data("next", nextPage)
                        .data("submit", "").method(Method.POST).execute();
            } catch (IOException e) {
                e.printStackTrace();
            }
            return null;
        }

        @Override
        protected void onPostExecute(Void result) {
            String PATH = Environment.getExternalStorageDirectory()
                    + "/download/";
            String name = "eclassTest.pdf";
            FileOutputStream out;
            try {
                int len = file.bodyAsBytes().length;
                out = new FileOutputStream(new File(PATH + name));
                out.write(file.bodyAsBytes(),0,len);
                out.close();
            } catch (FileNotFoundException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
}
I hope somebody can help me!
jsoup is a Java library for working with real-world HTML. It provides a very convenient API for fetching URLs and extracting and manipulating data, using the best of HTML5 DOM methods and CSS selectors. jsoup implements the WHATWG HTML5 specification, and parses HTML to the same DOM as modern browsers do.
jsoup can parse HTML files, input streams, URLs, or even strings. It eases data extraction from HTML by offering Document Object Model (DOM) traversal methods and CSS and jQuery-like selectors. jsoup can manipulate the content: the HTML element itself, its attributes, or its text.
I think it's better to download any binary file via HttpURLConnection:
InputStream input = null;
OutputStream output = null;
HttpURLConnection connection = null;
try {
    URL url = new URL("http://example.com/file.pdf");
    connection = (HttpURLConnection) url.openConnection();
    connection.connect();

    // expect HTTP 200 OK, so we don't mistakenly save error report
    // instead of the file
    if (connection.getResponseCode() != HttpURLConnection.HTTP_OK) {
        return "Server returned HTTP " + connection.getResponseCode()
                + " " + connection.getResponseMessage();
    }

    // this will be useful to display download percentage
    // might be -1: server did not report the length
    int fileLength = connection.getContentLength();

    // download the file
    input = connection.getInputStream();
    output = new FileOutputStream("/sdcard/file_name.extension");

    byte[] data = new byte[4096];
    int count;
    while ((count = input.read(data)) != -1) {
        output.write(data, 0, count);
    }
} catch (Exception e) {
    return e.toString();
} finally {
    try {
        if (output != null)
            output.close();
        if (input != null)
            input.close();
    } catch (IOException ignored) {
    }
    if (connection != null)
        connection.disconnect();
}
Jsoup is for parsing and loading HTML pages, not binary files.
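To show the buffering idea from the snippet above in isolation, here is a minimal sketch of the same read/write loop as a standalone helper. The class name StreamCopy, the copy method, and the in-memory streams are my own stand-ins for illustration, not part of the original code. In a real app the input would be connection.getInputStream() and the output a FileOutputStream. Only one small buffer is held in memory at a time, so the file size is not limited by the heap the way a single bodyAsBytes() array is.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;

class StreamCopy {

    /**
     * Copies everything from in to out through a fixed-size buffer,
     * so memory use stays constant no matter how large the source is.
     * Returns the number of bytes copied.
     */
    static long copy(InputStream in, OutputStream out, int bufferSize)
            throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        int count;
        // read() returns -1 once the stream is exhausted
        while ((count = in.read(buffer)) != -1) {
            out.write(buffer, 0, count);
            total += count;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical stand-in for a downloaded file: 50,000 bytes in memory.
        byte[] source = new byte[50_000];
        for (int i = 0; i < source.length; i++) {
            source[i] = (byte) (i % 251);
        }

        ByteArrayInputStream in = new ByteArrayInputStream(source);
        ByteArrayOutputStream out = new ByteArrayOutputStream();

        long copied = copy(in, out, 4096);
        boolean identical = Arrays.equals(source, out.toByteArray());
        System.out.println("copied " + copied + " bytes, identical: " + identical);
    }
}
```

On Android this loop should still run off the main thread (for example inside doInBackground), and the streams should be closed in a finally block as in the snippet above.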