 

PDF specifications for coders: Adobe or ISO?

I want to write an application that can read and decode a PDF document. Where am I supposed to get the specification for this file format? PDF has been standardized by ISO, but it's not clear to me which source is the most reliable for this kind of information.

What is a good source to start with for this file format?

user1824407 asked Jan 01 '13 15:01




1 Answer

You can actually use both sources you mentioned; the confusion is historical.

Adobe invented PDF, along with the Acrobat product family to be used with it. The different PDF versions were released together with major Acrobat versions (PDF 1.3, for example, was released together with Acrobat 4).

Because of the wide adoption of the PDF format, and because a number of ISO standards had been written that depended on the proprietary PDF file format (not an easy thing for an ISO standard), Adobe decided to hand the PDF format over to ISO.

Since then, an ISO committee has been responsible for editing the PDF specification and producing new versions. The ISO standard for PDF is ISO 32000.

Also, keep in mind that, depending on where you want to use PDF, a number of other ISO standards might be very useful or indispensable. Amongst the most commonly used are PDF/X (for exchange of PDF files in the publishing community) and PDF/A (for the creation of PDF files that need to be archived in long-term storage). These specifications reference a specific version of the PDF standard and add additional requirements and restrictions.
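
Because each of these standards builds on a particular PDF version, a reader often starts by checking which version a file declares. Here is a minimal Python sketch, assuming the usual `%PDF-x.y` header near the start of the file. The function name `declared_pdf_version` is made up for illustration, and searching the first 1024 bytes is a leniency some viewers apply, not something this answer prescribes. Also note that the document catalog may carry a /Version entry that overrides the header, which this sketch ignores.

```python
import re

# Matches a header such as "%PDF-1.7" or "%PDF-2.0".
_HEADER = re.compile(rb"%PDF-(\d)\.(\d)")


def declared_pdf_version(data: bytes):
    """Return (major, minor) from the PDF header, or None if no header is found."""
    # Assumption: look only in the first 1024 bytes, tolerating leading junk.
    match = _HEADER.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

To use it, you could read the start of a file with `open(path, "rb").read(1024)` and pass the bytes in. Whether a file really conforms to PDF/A or PDF/X is a separate question that the header alone does not answer.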

As far as the specification is concerned, you can get all documents from ISO directly. However, the specification for PDF itself is also available from Adobe, and that document is identical in content. Refer to the Adobe DevNet site on Acrobat:

http://www.adobe.com/devnet/acrobat.html

Just download the Acrobat SDK and that will give you the documentation as part of it.

Let me add a word of caution on "targeting the PDF specification" in code. I really, really, really advise you to more clearly specify exactly what your needs are for PDF (editing, generating, quality control (preflight)) and then look for or ask about an existing library that meets those needs or can be extended to meet your needs.

Writing something that supports "PDF" in general will be a daunting task. The PDF specification is large, intricate and full of... well... niceties. There be dragons!


Update:

A direct link to Adobe's PDF 1.7 specification document (first edition, free to download) is here:

  • Document management — Portable document format — Part 1: PDF 1.7

The content of this document was later officially adopted as the ISO standard for general PDF, ISO 32000-1.

Note, however, that there are a few differences from the PDF file available from ISO:

  • The page layout changed, compared to Adobe's version.
  • ISO documents are not available for free (this one costs 238 Swiss francs, CHF 238, to download).

If you are starting to develop PDF software, the free PDF from the Adobe link above is sufficient.


Update: 2021

It's worth noting that ISO has since released a new version of the PDF specification, called ISO 32000-2. Information about it is available on the ISO site. This new version was published in 2017 and received an update in December 2020.

While the document does not dramatically alter PDF, and most of the general information about PDF from, for example, the free Adobe version of the specification will still be correct, there are definitely changes:

  • Many things, especially deeply technical things such as everything on transparency, received an update, mostly to clarify existing language (and add information that was up to now more or less implicit). These updates may have an effect on how to implement or use those parts of the standard.
  • New features have been included in the standard.

If you're writing PDF files, especially simpler ones, the Adobe specification should still be OK to get you going. If you want to support everything in the PDF standard, you'll need to pay for the latest ISO version (but that is a tall order anyway).
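
To give an idea of what "simple" means when writing, here is a rough Python sketch that assembles a one-page PDF by hand: a header, five objects, a cross-reference table and a trailer. It is an illustration under assumptions, not a recommended approach. The function name `build_minimal_pdf` is made up, the text is assumed to be plain Latin-1, the page is fixed at US Letter size, and the font is the built-in Helvetica. Anything beyond this (compression, embedded fonts, Unicode text) is where an existing library earns its keep.

```python
def build_minimal_pdf(text: str = "Hello, PDF") -> bytes:
    """Assemble a one-page PDF showing a single line of text."""
    # Backslash and parentheses are escaped inside a PDF literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    # Content stream: select font F1 at 24 pt, move to (72, 720), show the text.
    content = f"BT /F1 24 Tf 72 720 Td ({escaped}) Tj ET".encode("latin-1")

    # Object bodies, numbered 1..5 in this order.
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
    ]

    out = bytearray(b"%PDF-1.7\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of "N 0 obj" for the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    # Cross-reference table: every entry is exactly 20 bytes long.
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset

    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)
```

Writing the result with `open("hello.pdf", "wb").write(build_minimal_pdf())` should give a file that common viewers can open. Notice how much bookkeeping (byte offsets, stream lengths) is needed even for this trivial case.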

David van Driessche answered Oct 11 '22 17:10