
Cannot fetch PDF file as binary data

I'm trying to fetch a PDF file from:

URL : https://domain_name/xyz/_id/download/

The URL does not point directly to a PDF file; instead, the server uses the <_id> field to decide which file to send.

When I paste this link into the browser's address bar, the PDF downloads instantly. When I fetch it with HttpsURLConnection, however, the response Content-Type is 'text/html' when it should be 'application/pdf'.

I also tried calling 'setRequestProperty' with 'application/pdf' before connecting, but the response still comes back as 'text/html'.

The request method I'm using is 'GET'.

1) Do I need to use HttpClient instead of HttpsURLConnection?

2) Are these types of links used to increase security?

3) Please point out my mistakes.

4) How can I find out the filename used on the server?

Below is the main code I've implemented:

    URL url = new URL(sb.toString());

    //created new connection
    HttpsURLConnection urlConnection = (HttpsURLConnection) url.openConnection();

    //have set the request method and property
    urlConnection.setRequestMethod("GET");
    urlConnection.setDoOutput(true);
    urlConnection.setRequestProperty("Content-Type", "application/pdf");

    Log.e("Content Type--->", urlConnection.getContentType()+"   "+ urlConnection.getResponseCode()+"  "+ urlConnection.getResponseMessage()+"              "+urlConnection.getHeaderField("Content-Type"));

    //and connecting!
    urlConnection.connect();

    //setting the path where we want to save the file
    //in this case, going to save it on the root directory of the
    //sd card.
    File SDCardRoot = Environment.getExternalStorageDirectory();

    //created a new file, specifying the path, and the filename

    File file = new File(SDCardRoot,"example.pdf");

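    //make sure external storage is mounted and writable before opening the file
    if (!Environment.MEDIA_MOUNTED.equals(Environment.getExternalStorageState())) {
        throw new IOException("External storage is not writable");
    }

    //writing the downloaded data into the file we created
    FileOutputStream fileOutput = new FileOutputStream(file);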

    //this will be used in reading the data from the internet
    InputStream inputStream = urlConnection.getInputStream();

    //this is the total size of the file
    int totalSize = urlConnection.getContentLength();
    Log.e("Total File Size ---->", ""+totalSize);

    //variable to store total downloaded bytes
    int downloadedSize = 0;

    //create a buffer...
    byte[] buffer = new byte[1024];
    int bufferLength = 0; //used to store a temporary size of the buffer

    //Reading through the input buffer and write the contents to the file
    while ( (bufferLength = inputStream.read(buffer)) > 0 ) {

        //write the bytes just read to the file on the sd card
        fileOutput.write(buffer, 0, bufferLength);


        //adding up the size
        downloadedSize += bufferLength;

        //reporting the progress:
        Log.e("This much downloaded---->",""+ downloadedSize);

    }
    //closed the output stream
    fileOutput.close();

I have searched a lot without finding an answer. If possible, please explain my mistake in detail, as I'm implementing this for the first time.

**I tried fetching direct PDF links such as http://labs.google.com/papers/bigtable-osdi06.pdf, and they download without problems; their 'Content-Type' was also 'application/pdf'.**

Thanks.

abhy asked Mar 10 '11 06:03


2 Answers

This thread led me to the solution for my problem! When you download a streamed PDF that you reached through a WebView and you use an HttpURLConnection, you also need to pass along the cookies from the WebView.

    String cookie = CookieManager.getInstance().getCookie(url.toString());
    if (cookie != null) connection.setRequestProperty("Cookie", cookie);

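
For context, here is a rough sketch of how the whole request could look in plain Java once the cookie string is available. The class and method names (`CookieDownload`, `requestHeaders`, `download`) are made up for illustration; on Android the `cookie` argument would come from `CookieManager.getInstance().getCookie(url.toString())`. The sketch sends an `Accept` header rather than `Content-Type`, on the assumption that the intent is to tell the server which type is wanted back, and it does not call `setDoOutput(true)`, which is meant for requests that send a body.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, not part of any library.
public class CookieDownload {

    /** Builds the request headers; cookie may be null when there is no session to forward. */
    public static Map<String, String> requestHeaders(String cookie) {
        Map<String, String> headers = new LinkedHashMap<>();
        // Accept says what the client wants back. Content-Type describes a
        // request body, which a GET does not have.
        headers.put("Accept", "application/pdf");
        if (cookie != null && !cookie.isEmpty()) {
            headers.put("Cookie", cookie);
        }
        return headers;
    }

    /** Streams the response body of url into out and returns the server's Content-Type. */
    public static String download(URL url, String cookie, OutputStream out) throws IOException {
        // HttpsURLConnection extends HttpURLConnection, so this cast covers both schemes.
        HttpURLConnection connection = (HttpURLConnection) url.openConnection();
        try {
            connection.setRequestMethod("GET");
            for (Map.Entry<String, String> h : requestHeaders(cookie).entrySet()) {
                connection.setRequestProperty(h.getKey(), h.getValue());
            }
            int status = connection.getResponseCode();
            if (status != HttpURLConnection.HTTP_OK) {
                throw new IOException("Unexpected HTTP status " + status);
            }
            try (InputStream in = connection.getInputStream()) {
                byte[] buffer = new byte[8192];
                int n;
                // read() returns -1 at end of stream
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
            }
            return connection.getContentType();
        } finally {
            connection.disconnect();
        }
    }
}
```

If the returned Content-Type is still 'text/html', the server most likely answered with a login or error page, so the cookie (or another header the browser sends) is probably still missing.
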
Predders answered Oct 03 '22 03:10


Theory 1: The server is sending an incorrect Content-Type in its response. If you wrote and deployed the server code yourself, check that.

Theory 2: The URL returns an HTML page containing JavaScript that redirects to the URL of the actual PDF file.
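
One way to tell the two theories apart from the client side is to look at the first bytes of the body instead of trusting the header: a PDF file normally begins with the ASCII signature `%PDF-`, while an HTML page begins with markup. The sketch below is only an illustration; `ResponseSniffer` and its methods are hypothetical names, and the HTML check is a rough heuristic.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper class, not part of any library.
public class ResponseSniffer {

    // A PDF file normally starts with these ASCII bytes.
    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the leading bytes of the body start with the PDF signature. */
    public static boolean looksLikePdf(byte[] head) {
        if (head == null || head.length < PDF_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < PDF_MAGIC.length; i++) {
            if (head[i] != PDF_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Rough heuristic: after leading whitespace, the body starts with an HTML doctype or tag. */
    public static boolean looksLikeHtml(byte[] head) {
        if (head == null) {
            return false;
        }
        String text = new String(head, StandardCharsets.ISO_8859_1)
                .trim()
                .toLowerCase(Locale.ROOT);
        return text.startsWith("<!doctype html") || text.startsWith("<html");
    }
}
```

If `looksLikePdf` is true while the header says 'text/html', Theory 1 fits. If `looksLikeHtml` is true, the body is a page (a redirect script, a login form, or an error message) and reading it as text should show which.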

Nishan answered Oct 03 '22 02:10
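
On question 4 (the filename on the server): when a server wants the client to use a particular name, it usually sends it in the `Content-Disposition` response header, for example `attachment; filename="report.pdf"`. Whether this particular server sends that header is an assumption. The sketch below uses a hypothetical `DispositionParser` class, handles only the plain `filename=` form (not the encoded `filename*=` form), and strips any directory part because the value comes from the server and should not be trusted as a path.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any library.
public class DispositionParser {

    // Matches filename="quoted name" or filename=token.
    // The encoded form filename*=... is deliberately not handled here.
    private static final Pattern FILENAME = Pattern.compile(
            "(?:^|;)\\s*filename\\s*=\\s*(?:\"([^\"]*)\"|([^;\\s]+))",
            Pattern.CASE_INSENSITIVE);

    /** Returns the suggested filename, or null if the header is absent or carries none. */
    public static String filenameFrom(String contentDisposition) {
        if (contentDisposition == null) {
            return null;
        }
        Matcher m = FILENAME.matcher(contentDisposition);
        if (!m.find()) {
            return null;
        }
        String name = m.group(1) != null ? m.group(1) : m.group(2);
        // Keep only the last path segment so the server cannot choose a directory.
        int slash = Math.max(name.lastIndexOf('/'), name.lastIndexOf('\\'));
        name = name.substring(slash + 1);
        return name.isEmpty() ? null : name;
    }
}
```

With the connection from the question, this could be called as `DispositionParser.filenameFrom(urlConnection.getHeaderField("Content-Disposition"))`, falling back to a fixed name such as "example.pdf" when it returns null.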