So you’ve probably heard the term PDF/A tossed around when discussing digitization output. In fact, you may have seen it mentioned right here on our website. But what is PDF/A? Should you use it in your projects or recommend it to your customers?
Well, here is a crash course in PDF/A. Think of this as PDF/A 101. All the basics to get you started and maybe even inspire you to dig a bit deeper.
PDF/A is a version of the PDF format, standardized by the ISO (International Organization for Standardization), that is designed specifically for the digital preservation of electronic documents.
First, here’s a little history. In 2002, specialists from libraries and archives, administrative bodies, industry and the judicial system came together to develop a purpose-built file format for standardized archiving.
Within the ISO, a group was formed to take on the task. It brought together representatives from a wide range of US-based associations and federal authorities, including AIIM (Association for Information and Image Management), NPES (Association for Suppliers of Printing, Publishing and Converting Technologies) and NARA (National Archives and Records Administration). They met with experts from the library sector (Library of Congress and Harvard University Libraries), industry developers (including Adobe Systems and Kodak) and the judicial system (Administrative Office of the United States Courts).
As a result of these meetings, the ISO published the PDF/A specification on October 1, 2005 under the designation ISO 19005-1:2005. This was the world’s first standard file format for long-term digital archiving.
And now a quick look at the first guidelines. PDF/A should be:
- Device-/Software-/Version-independent: identical reproduction of content and documents
- Self-Contained: a PDF/A file contains everything that is needed for safe reproduction and presentation
- Self-Documented: a PDF/A file describes and documents itself (metadata)
- Transparent: a PDF/A file can be analyzed easily
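To make the "Self-Documented" point a little more concrete: a PDF/A file declares which part of the standard it claims to follow inside its embedded XMP metadata, using the PDF/A identification schema (namespace http://www.aiim.org/pdfa/ns/id/, usually written with the prefix pdfaid). Neither XMP nor this schema is covered in the article above, so treat the following Python as a rough sketch rather than a reference implementation. The function name read_pdfa_claim and the sample packet are made up for illustration, and the sketch assumes the XMP packet has already been extracted from the PDF as a string.

```python
import xml.etree.ElementTree as ET

# Namespace URIs: RDF (used by XMP) and the PDF/A identification schema.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDFA_ID_NS = "http://www.aiim.org/pdfa/ns/id/"


def read_pdfa_claim(xmp_xml):
    """Return (part, conformance) as claimed in an XMP packet, or None.

    Hypothetical helper for illustration. It only reads the claim;
    it does not check whether the file really conforms to PDF/A.
    """
    root = ET.fromstring(xmp_xml)
    for desc in root.iter(f"{{{RDF_NS}}}Description"):
        # XMP allows simple properties as attributes or as child elements.
        part = desc.get(f"{{{PDFA_ID_NS}}}part")
        if part is None:
            part = desc.findtext(f"{{{PDFA_ID_NS}}}part")
        if part is None:
            continue
        conformance = desc.get(f"{{{PDFA_ID_NS}}}conformance")
        if conformance is None:
            conformance = desc.findtext(f"{{{PDFA_ID_NS}}}conformance")
        return part.strip(), conformance.strip() if conformance else None
    return None


# Made-up sample packet that claims PDF/A-1, conformance level B.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>1</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""

if __name__ == "__main__":
    print(read_pdfa_claim(SAMPLE_XMP))
```

Keep in mind that this only tells you what a file says about itself. Confirming that a file actually meets the standard is a job for a dedicated PDF/A validator.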
What makes up the PDF/A standard? There are three main versions of PDF/A.
First is PDF/A-1, which is defined by ISO 19005-1:2005. It is based on the PDF Reference Version 1.4 and it aims to ensure reliable reproduction of the visual appearance of the document as well as guarantee that document content can be searched and repurposed.
Next, PDF/A-2 was introduced in July of 2011. It is based on ISO 32000-1, the ISO-standardized version of PDF 1.7. It added JPEG 2000 image compression, support for transparency effects and layers, embedding of OpenType fonts, provisions for digital signatures and the option of embedding PDF/A files, which makes it possible to archive a set of documents in a single file.
Lastly, the PDF/A-3 specification was published on October 17, 2012 as ISO 19005-3. The only change from PDF/A-2 is that it allows files of arbitrary formats (e.g. video, XML, CSV, CAD, Word documents, spreadsheets) to be embedded in PDF/A-conforming documents.
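If you need to decide which version a project calls for, the three descriptions above boil down to a small lookup table. Here is one way to sketch that in Python. The names PdfaPart, PARTS, minimum_part_for and features_available are invented for this example, the feature labels are just shorthand for the items listed above, and the designation ISO 19005-2 for PDF/A-2 is filled in from the ISO numbering rather than from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PdfaPart:
    name: str
    iso_standard: str
    based_on: str
    added_features: tuple  # features this part adds over the previous one


# Summary of the three parts as described in this article.
PARTS = {
    1: PdfaPart("PDF/A-1", "ISO 19005-1", "PDF Reference 1.4", ()),
    2: PdfaPart(
        "PDF/A-2",
        "ISO 19005-2",  # assumption: designation not stated in the article
        "ISO 32000-1 (PDF 1.7)",
        (
            "JPEG 2000 compression",
            "transparency",
            "layers",
            "OpenType fonts",
            "digital signatures",
            "embedded PDF/A files",
        ),
    ),
    3: PdfaPart(
        "PDF/A-3",
        "ISO 19005-3",
        "ISO 32000-1 (PDF 1.7)",
        ("embedded files of any format",),
    ),
}


def minimum_part_for(feature):
    """Return the lowest PDF/A part number that adds `feature`, or None."""
    for number in sorted(PARTS):
        if feature in PARTS[number].added_features:
            return number
    return None


def features_available(part_number):
    """Return every listed feature available up to and including a part."""
    features = []
    for number in sorted(PARTS):
        if number <= part_number:
            features.extend(PARTS[number].added_features)
    return features


if __name__ == "__main__":
    needed = "embedded files of any format"
    part = PARTS[minimum_part_for(needed)]
    print(f"{needed!r} needs at least {part.name} ({part.iso_standard})")
```

So a project that only needs reliable visual reproduction can stay on PDF/A-1, while anything that relies on transparency or on attaching source files such as spreadsheets points to PDF/A-2 or PDF/A-3 respectively.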
So, that’s PDF/A in a nutshell. There is still much more to discover about this “digital paper” and the benefits of using it as your long-term archiving solution. Check out www.pdfa.org for more articles, videos, and even events surrounding PDF/A.