
How to check in Java whether a file is corrupted (unreadable) or not?

I have a web application where a person can upload any PDF via FTP. After the PDF file is uploaded, I perform certain operations on it.

The problem is that the connection sometimes breaks while the PDF is being uploaded via FTP, so the uploaded PDF is incomplete (it behaves like a corrupted file). When I try to open that document in Acrobat Reader, it shows the message 'There was an error opening the document. The file is damaged and could not be repaired'.

Before I start processing the PDF, I want to check whether the uploaded file is readable, that is, not corrupted.

Does Java provide an API for that, or is there another way to check whether a file is corrupted?

Dhruv Bansal asked May 07 '12 05:05




1 Answer

Java itself has no built-in PDF validation, but the iText library provides a Java API for working with PDF files.

To check whether a PDF file can be loaded and read, open it with com.itextpdf.text.pdf.PdfReader.
If the file is corrupted, an exception such as com.itextpdf.text.exceptions.InvalidPdfException is thrown.

Sample code snippet:

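...  
import com.itextpdf.text.pdf.PdfReader;  
import com.itextpdf.text.pdf.parser.PdfTextExtractor;  
...  
PdfReader pdfReader = null;  
try {  
    // Opening the file makes iText parse the PDF structure; a truncated file is expected to fail here  
    pdfReader = new PdfReader( pathToUploadedPdfFile );  

    // Optionally read page one to confirm that page content is accessible  
    String textFromPdfFilePageOne = PdfTextExtractor.getTextFromPage( pdfReader, 1 ); 
    System.out.println( textFromPdfFilePageOne );
}  
catch ( Exception e ) {  
    // corrupted or unreadable PDF: handle exception, e.g. reject the upload  
}  
finally {  
    // release the file handle whether or not reading succeeded  
    if ( pdfReader != null ) pdfReader.close();  
}  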

For a file that was uploaded but is corrupted, you may see an error like the following:

com.itextpdf.text.exceptions.InvalidPdfException: Rebuild failed:   
  trailer not found.; Original message: PDF startxref not found.  

Note: To reproduce such an exception, start downloading a PDF file from the net and abort the download partway through.
Then load the partial file with the code snippet above and check whether it is reported as corrupted.
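
The "PDF startxref not found" message above suggests a cheaper pre-check that can run before handing the file to iText: a complete PDF normally begins with "%PDF-" and ends with a "startxref" entry followed by "%%EOF", and a truncated upload usually lacks that tail. The following is only a rough sketch using the standard library. The class name PdfQuickCheck, the method looksComplete, and the 1024-byte scan window are assumptions made for illustration, not part of iText. Passing this check does not prove the file is valid, and it is deliberately strict (it rejects files with extra bytes before the header), so treat it as a fast filter and still rely on PdfReader for the real verdict.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of iText): a cheap structural pre-check for truncated PDFs.
final class PdfQuickCheck {

    // Assumed scan window: how many bytes at the end of the file are searched for the trailer markers.
    private static final int TAIL_BYTES = 1024;

    /** Returns true if the file starts with "%PDF-" and has "startxref" followed by "%%EOF" near its end. */
    static boolean looksComplete(String path) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            long length = file.length();
            if (length < 5) {
                return false; // too short to even hold the header
            }

            // Check the header marker at the very start of the file.
            byte[] head = new byte[5];
            file.readFully(head);
            if (!"%PDF-".equals(new String(head, StandardCharsets.ISO_8859_1))) {
                return false;
            }

            // Read the last TAIL_BYTES bytes (or the whole file if it is smaller).
            int tailSize = (int) Math.min(TAIL_BYTES, length);
            byte[] tail = new byte[tailSize];
            file.seek(length - tailSize);
            file.readFully(tail);
            String tailText = new String(tail, StandardCharsets.ISO_8859_1);

            // A truncated upload typically has neither marker in its tail.
            int xref = tailText.lastIndexOf("startxref");
            return xref >= 0 && tailText.indexOf("%%EOF", xref) > xref;
        }
    }
}
```

A caller could run PdfQuickCheck.looksComplete( pathToUploadedPdfFile ) first and open the file with PdfReader only when it returns true, which avoids parsing uploads that are obviously cut off.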

You can find detailed examples of the iText API at: Use Case Examples of iText PDF | iText.

Ravinder Reddy answered Oct 03 '22 14:10