
Detect if PDF file is correct (header PDF) [closed]


I have a Windows .NET application that manages many PDF files. Some of the files are corrupt.

I have two questions. I'll try to explain them in my imperfect English, sorry.

1.)

How can I detect whether a PDF file is valid?

I want to read the header of the PDF and check that it is correct.

var okPDF = PDFCorrect(@"C:\temp\pdfile1.pdf");

2.)

How can I tell whether a byte[] (byte array) holding a file's contents is a PDF file or not?

For example, for ZIP files, you could examine the first four bytes and see if they match the local header signature, i.e. in hex

50 4b 03 04

if (buffer[0] == 0x50 && buffer[1] == 0x4b && buffer[2] == 0x03 && buffer[3] == 0x04)

If you load these four bytes into a little-endian 32-bit integer, the value is 0x04034b50 (by David Pierson).

I want the same for PDF files.

byte[] dataPDF = ...

var okPDF = PDFCorrect(dataPDF);

Any sample source code in .NET?

Kiquenet asked Jun 24 '10 08:06




1 Answer

I check the PDF header like this:

    // Requires: using System.IO;
    public bool IsPDFHeader(string fileName)
    {
        byte[] buffer;

        // Read only the first five bytes; the rest of the file is not needed.
        // The using blocks close the stream even if reading throws.
        using (var fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
        using (var br = new BinaryReader(fs))
        {
            buffer = br.ReadBytes(5);
        }

        // ReadBytes returns fewer bytes when the file is shorter than five bytes.
        if (buffer.Length < 5)
        {
            return false;
        }

        // A PDF starts with "%PDF-" (for example %PDF-1.0),
        // which is 0x25 0x50 0x44 0x46 0x2D in ASCII.
        return buffer[0] == 0x25 && buffer[1] == 0x50
            && buffer[2] == 0x44 && buffer[3] == 0x46
            && buffer[4] == 0x2D;
    }
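The second part of the question (checking a byte[] rather than a file) can be handled with the same "%PDF-" comparison. What follows is only a minimal sketch: the class name PdfSignature and the field name Marker are hypothetical, chosen for illustration, and a matching header only shows that the data starts like a PDF. It does not detect corruption further into the file.

```csharp
using System.IO;

public static class PdfSignature
{
    // ASCII bytes of "%PDF-", the marker that begins a PDF header.
    // (Hypothetical helper name; not part of any library.)
    private static readonly byte[] Marker = { 0x25, 0x50, 0x44, 0x46, 0x2D };

    // Checks an in-memory buffer, e.g. file contents already loaded as byte[].
    public static bool IsPdfHeader(byte[] data)
    {
        // Null or too-short input cannot contain the full marker.
        if (data == null || data.Length < Marker.Length)
        {
            return false;
        }

        for (int i = 0; i < Marker.Length; i++)
        {
            if (data[i] != Marker[i])
            {
                return false;
            }
        }
        return true;
    }

    // Checks a file on disk by reading only its first five bytes
    // and delegating to the byte[] overload.
    public static bool IsPdfHeader(string fileName)
    {
        using (var fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
        using (var br = new BinaryReader(fs))
        {
            return IsPdfHeader(br.ReadBytes(Marker.Length));
        }
    }
}
```

With this in place, the call from the question could be written as var okPDF = PdfSignature.IsPdfHeader(dataPDF); and the file variant as PdfSignature.IsPdfHeader(@"C:\temp\pdfile1.pdf").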
Kiquenet answered Sep 22 '22 02:09
