PDF/A is an ISO-standardized version of PDF designed specifically for long-term archiving and preservation of electronic documents.
The goal of PDF/A is simple:
A PDF/A document created today should be viewable and reproducible many years in the future without depending on external software, fonts, or resources.
PDF/A is widely used by governments, courts, healthcare organizations, financial institutions, libraries, and records-management systems.
Why PDF/A Exists
A normal PDF may contain:
External font references
Audio/video content
JavaScript
Encryption
External hyperlinks to required content
Dynamic forms
Embedded executables
These features may stop working in the future.
PDF/A eliminates or restricts such features to ensure the document remains self-contained.
ISO Standards
PDF/A is defined by several ISO standards:
| Standard | Based On | Description |
|---|---|---|
| PDF/A-1 | PDF 1.4 | First archival standard |
| PDF/A-2 | PDF 1.7 | Adds transparency, JPEG2000, and layers |
| PDF/A-3 | PDF 1.7 | Allows arbitrary file attachments |
| PDF/A-4 | PDF 2.0 | Latest generation |
| PDF/A-4e | PDF 2.0 | Engineering documents |
| PDF/A-4f | PDF 2.0 | File attachments |
Core Requirements
A valid PDF/A document must be:
Self-Contained
Everything required for rendering must be embedded.
Fonts
All fonts must be embedded.
Not allowed:
Missing fonts
System-dependent fonts
Allowed:
Embedded TrueType
Embedded OpenType
Device Independent
Colors must be reproducible.
Requires:
ICC color profile
Defined color spaces
Example:
sRGB IEC61966-2.1
Metadata Required
PDF/A requires XMP metadata.
Example:
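<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:dc="http://purl.org/dc/elements/1.1/"
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <dc:title><rdf:Alt><rdf:li xml:lang="x-default">Invoice</rdf:li></rdf:Alt></dc:title>
      <dc:creator><rdf:Seq><rdf:li>Company</rdf:li></rdf:Seq></dc:creator>
      <pdfaid:part>3</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>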
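A PDF/A file states which part and conformance level it claims through the pdfaid properties in its XMP packet. The following is a minimal sketch of reading that claim with Python's standard library. The function name `claimed_conformance` and the `SAMPLE_XMP` packet are illustrative assumptions, and a claim read this way says nothing about whether the file actually conforms.

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple

# Namespace URIs in ElementTree's "{uri}" prefix form.
RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
PDFAID = "{http://www.aiim.org/pdfa/ns/id/}"

# Hypothetical packet used only for illustration.
SAMPLE_XMP = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>3</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""


def _prop(description: ET.Element, name: str) -> Optional[str]:
    """Read a pdfaid property written as a child element or as an attribute."""
    child = description.find(PDFAID + name)
    if child is not None and child.text:
        return child.text.strip()
    return description.get(PDFAID + name)


def claimed_conformance(xmp: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (part, conformance) as claimed in the XMP, or None where absent."""
    root = ET.fromstring(xmp)
    part = level = None
    for description in root.iter(RDF + "Description"):
        part = part or _prop(description, "part")
        level = level or _prop(description, "conformance")
    return part, level


print(claimed_conformance(SAMPLE_XMP))
```

XMP allows simple properties to be written either as child elements or as attributes on rdf:Description, which is why the helper checks both forms.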
What PDF/A Prohibits
Encryption
Forbidden.
Not allowed:
Password protection
Owner password
AES encryption
Reason:
Future users may not know the password.
JavaScript
Forbidden.
Not allowed:
app.alert("Hello");
Reason:
Execution environments may disappear.
Audio and Video
Forbidden in PDF/A-1, PDF/A-2, and PDF/A-3. PDF/A-4e relaxes this for 3D and rich media content.
Examples:
MP3
MP4
AVI
Executable Content
Forbidden.
Examples:
EXE files
Scripts
Launch actions
External Dependencies
Forbidden.
Examples:
External fonts
External images
External color profiles
Everything must be embedded.
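As a rough pre-check for some of the prohibited features above, the raw bytes of a file can be scanned for the PDF names that usually accompany them. This is a heuristic sketch, not a validator: it cannot see inside compressed object streams, it can misfire on text that merely looks like a PDF name, and the function name `find_forbidden` and the marker list are assumptions chosen for illustration.

```python
import re
from typing import List

# PDF names that typically signal a feature PDF/A prohibits.
FORBIDDEN_NAMES = {
    b"/Encrypt": "encryption dictionary in the trailer",
    b"/JavaScript": "JavaScript action or name tree",
    b"/JS": "JavaScript source entry",
    b"/Launch": "launch action",
}

# A PDF name ends at whitespace, a delimiter, or end of data.
_NAME_END = rb"(?=[\s()<>\[\]{}/%]|$)"


def find_forbidden(pdf_bytes: bytes) -> List[str]:
    """Return the forbidden names found in the raw bytes, in table order."""
    hits = []
    for name in FORBIDDEN_NAMES:
        if re.search(re.escape(name) + _NAME_END, pdf_bytes):
            hits.append(name.decode("ascii"))
    return hits


sample = b"<< /S /JavaScript /JS (app.alert\\(1\\)) >>\ntrailer\n<< /Encrypt 5 0 R >>"
for hit in find_forbidden(sample):
    print(hit, "->", FORBIDDEN_NAMES[hit.encode("ascii")])
```

An empty result only means none of these markers were visible in the uncompressed bytes; a full validator is still needed to confirm compliance.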
PDF/A Conformance Levels
Level A
Accessible
Requirements:
Tagged PDF
Logical reading order
Unicode mapping
Best for:
Government
Accessibility compliance
Example:
PDF/A-2A
PDF/A-3A
Level B
Basic Visual Preservation
Only guarantees:
Visual appearance preservation
No accessibility requirements.
Most common level.
Example:
PDF/A-1B
PDF/A-2B
PDF/A-3B
Level U
Unicode Mapping
Adds:
Unicode text extraction
Example:
PDF/A-2U
PDF/A-3U
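Not every part and level combination exists: level U was introduced with PDF/A-2, and PDF/A-4 replaces the A/B/U levels with the optional e and f variants. The sketch below parses a designation such as PDF/A-2U and rejects undefined combinations. The names `parse_pdfa`, `Conformance`, and `VALID_LEVELS` are illustrative, not part of any library.

```python
import re
from typing import NamedTuple, Optional


class Conformance(NamedTuple):
    part: int
    level: Optional[str]  # None for plain PDF/A-4


# Levels defined for each part of ISO 19005.
VALID_LEVELS = {
    1: {"A", "B"},
    2: {"A", "B", "U"},
    3: {"A", "B", "U"},
    4: {None, "E", "F"},
}

_PATTERN = re.compile(r"^PDF/A-(\d)([A-Za-z]?)$")


def parse_pdfa(designation: str) -> Conformance:
    """Parse e.g. 'PDF/A-2U' and raise ValueError for undefined combinations."""
    match = _PATTERN.match(designation.strip())
    if match is None:
        raise ValueError(f"not a PDF/A designation: {designation!r}")
    part = int(match.group(1))
    level = match.group(2).upper() or None
    if part not in VALID_LEVELS or level not in VALID_LEVELS[part]:
        raise ValueError(f"undefined combination: {designation!r}")
    return Conformance(part, level)


print(parse_pdfa("PDF/A-2U"))
print(parse_pdfa("PDF/A-4e"))
```

Levels are normalized to upper case, so PDF/A-1b and PDF/A-1B parse to the same value.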
PDF/A-1
ISO 19005-1
Based on:
PDF 1.4
Advantages:
Maximum compatibility
Restrictions:
No transparency
No JPEG2000
No embedded files
Typical conformance:
PDF/A-1B
PDF/A-2
ISO 19005-2
Based on:
PDF 1.7
Adds support for:
Transparency
JPEG2000
Layers (Optional Content Groups)
Digital signatures
Compressed object streams
Conformance:
PDF/A-2A
PDF/A-2B
PDF/A-2U
PDF/A-3
ISO 19005-3
Most discussed version.
Adds:
Embedded Files
Any file type may be attached.
Examples:
XML
XLSX
DOCX
CSV
ZIP
CAD files
The PDF itself remains archival compliant.
Example
Electronic invoice:
Invoice.pdf
Contains attached:
invoice.xml
This is common for:
e-Invoicing
ZUGFeRD
Factur-X
PEPPOL workflows
PDF/A-3 Attachment Rules
Attachments must:
Be declared in metadata
Have relationship information
Be properly referenced
Examples of relationships:
| Relationship | Meaning |
|---|---|
| Source | Original source |
| Data | Supporting data |
| Alternative | Alternative representation |
| Supplement | Additional information |
| Unspecified | Unknown |
Example:
/AFRelationship /Data
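Inside the PDF, the relationship is stored as the AFRelationship entry of the attachment's file specification dictionary. The sketch below renders a simplified dictionary of that kind as text. It assumes ASCII file names, and the helper names (`pdf_string`, `filespec`) and the object reference `12 0 R` are hypothetical placeholders.

```python
# Relationship values listed in the table above.
AF_RELATIONSHIPS = {"Source", "Data", "Alternative", "Supplement", "Unspecified"}


def pdf_string(text: str) -> str:
    """Write an ASCII string as a PDF literal string, escaping \\ ( and )."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def filespec(filename: str, relationship: str, stream_ref: str) -> str:
    """Render a simplified file specification dictionary for an attachment.

    stream_ref is an indirect reference to the embedded file stream.
    """
    if relationship not in AF_RELATIONSHIPS:
        raise ValueError(f"unknown AFRelationship: {relationship!r}")
    name = pdf_string(filename)
    return (
        f"<< /Type /Filespec /F {name} /UF {name} "
        f"/AFRelationship /{relationship} /EF << /F {stream_ref} >> >>"
    )


print(filespec("invoice.xml", "Data", "12 0 R"))
```

A real writer would also register the file specification in the document's embedded files name tree and set a MIME type on the stream. Those steps are omitted here.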
PDF/A-4
Newest standard.
Based on:
PDF 2.0
Advantages:
Simpler requirements
Better modern PDF support
Improved signatures
Better metadata model
PDF/A-4 Variants
PDF/A-4
Base profile for documents that do not need arbitrary file attachments.
PDF/A-4e
Engineering documents.
Supports:
3D content
CAD workflows
Used in manufacturing.
PDF/A-4f
Supports file attachments.
Successor to many PDF/A-3 attachment use cases.
Digital Signatures
Allowed beginning with PDF/A-2.
Common workflow:
Create PDF/A document
Validate PDF/A compliance
Apply digital signature
The signature must not break compliance.
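The ordering of that workflow can be sketched as a small function that validates before signing and again afterwards, so a signature that breaks compliance is caught. The `validate` and `sign` callables are hypothetical stand-ins for whatever validator and signing tool are actually in use.

```python
from typing import Callable


def archive_and_sign(
    pdf: bytes,
    validate: Callable[[bytes], bool],
    sign: Callable[[bytes], bytes],
) -> bytes:
    """Validate, sign, then validate again; raise if either check fails."""
    if not validate(pdf):
        raise ValueError("document is not PDF/A compliant; fix it before signing")
    signed = sign(pdf)
    if not validate(signed):
        raise ValueError("signing broke PDF/A compliance")
    return signed


# Stub callables for demonstration only; they do not inspect real PDF structure.
def demo_validate(data: bytes) -> bool:
    return data.startswith(b"%PDF") and b"/JavaScript" not in data


def demo_sign(data: bytes) -> bytes:
    return data + b"\n% signed"


result = archive_and_sign(b"%PDF-1.7 demo", demo_validate, demo_sign)
print(result)
```

Passing the validator and signer in as parameters keeps the ordering logic independent of any particular tool.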
Metadata in PDF/A
Required metadata fields often include:
| Field | Purpose |
|---|---|
| Title | Document title |
| Author | Creator |
| Subject | Description |
| Keywords | Searchability |
| CreationDate | Creation timestamp |
| ModifyDate | Modification timestamp |
Stored as:
XMP metadata
Embedded XML
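These fields have counterparts in both the classic document information dictionary and XMP, and PDF/A-1 in particular expects the two to agree. The sketch below lists the usual counterparts and reports fields whose values differ. It assumes both sides have already been read into plain dictionaries with comparable string values (the two date formats differ in a real file), and the name `mismatched_fields` is illustrative.

```python
from typing import Dict, List

# Document information dictionary key -> corresponding XMP property.
INFO_TO_XMP = {
    "Title": "dc:title",
    "Author": "dc:creator",
    "Subject": "dc:description",
    "Keywords": "pdf:Keywords",
    "CreationDate": "xmp:CreateDate",
    "ModDate": "xmp:ModifyDate",
}


def mismatched_fields(info: Dict[str, str], xmp: Dict[str, str]) -> List[str]:
    """Return info keys whose value is missing from or different in the XMP."""
    problems = []
    for info_key, xmp_key in INFO_TO_XMP.items():
        if info_key in info and info[info_key] != xmp.get(xmp_key):
            problems.append(info_key)
    return problems


info = {"Title": "Invoice", "Author": "Company", "Keywords": "billing"}
xmp = {"dc:title": "Invoice", "dc:creator": "Company Ltd"}
print(mismatched_fields(info, xmp))
```

Fields that are absent from the information dictionary are skipped, since only entries that exist there need an XMP equivalent.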
File Attachments in PDF/A-3
Example structure:
Invoice.pdf
├── Visible PDF
└── invoice.xml
The XML can contain:
Machine-readable invoice data
Audit information
Accounting records
This approach allows:
Humans to read the PDF
Software to process the XML
Validation
A file is not PDF/A merely because it has a ".pdf" extension.
Validation is required.
Validation checks:
Fonts embedded
Metadata present
ICC profiles
Structure compliance
Forbidden features absent
Common Conversion Problems
Missing Fonts
Error:
Font not embedded
Fix:
Embed all fonts.
RGB Without ICC Profile
Error:
Color space not defined
Fix:
Embed sRGB ICC profile.
JavaScript Present
Error:
Forbidden action detected
Fix:
Remove JavaScript.
Encryption Enabled
Error:
Encrypted PDF cannot be PDF/A
Fix:
Remove security settings.
PDF/A and Attachments
A common misconception:
PDF/A does not allow attachments.
This is only true for PDF/A-1.
| Version | Attachments |
|---|---|
| PDF/A-1 | No |
| PDF/A-2 | Only attachments that are themselves PDF/A files |
| PDF/A-3 | Any file type |
| PDF/A-4f | Any file type |
Therefore:
PDF/A-1 + attachment = Invalid
PDF/A-3 + XML attachment = Valid
PDF/A-3 + DOCX attachment = Valid
PDF/A-4f + ZIP attachment = Valid
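The table can be expressed as a small lookup. In this sketch the PDF/A-2 restriction is taken to mean that an attachment must itself be a PDF/A file. The names `ATTACHMENT_RULES` and `attachment_allowed` are illustrative, and versions not listed in the table are deliberately left out rather than guessed.

```python
# Attachment policy per version, following the table above.
ATTACHMENT_RULES = {
    "PDF/A-1": "none",
    "PDF/A-2": "pdfa-only",
    "PDF/A-3": "any",
    "PDF/A-4f": "any",
}


def attachment_allowed(version: str, attachment_is_pdfa: bool = False) -> bool:
    """Return True if the given version permits the attachment.

    Raises KeyError for versions that the table does not cover.
    """
    rule = ATTACHMENT_RULES[version]
    if rule == "any":
        return True
    if rule == "pdfa-only":
        return attachment_is_pdfa
    return False


for version in ATTACHMENT_RULES:
    print(version, attachment_allowed(version))
```

With the default `attachment_is_pdfa=False`, the call models attaching an arbitrary file such as XML or ZIP.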
PDF/A vs Standard PDF
| Feature | Standard PDF | PDF/A |
|---|---|---|
| Long-term archiving | No guarantee | Yes |
| Encryption | Yes | No |
| JavaScript | Yes | No |
| Audio/Video | Yes | Restricted |
| Embedded fonts | Optional | Required |
| ICC profile | Optional | Required |
| Metadata | Optional | Required |
| File attachments | Yes | Depends on version |
| Digital signatures | Yes | Yes (A-2+) |
Recommended Usage
| Scenario | Recommendation |
|---|---|
| Government archives | PDF/A-2U or PDF/A-4 |
| Scanned documents | PDF/A-2B |
| Legal records | PDF/A-2U |
| Electronic invoices | PDF/A-3B |
| Factur-X/ZUGFeRD | PDF/A-3B |
| Engineering drawings | PDF/A-4e |
| Documents with attachments | PDF/A-3 or PDF/A-4f |
| Long-term enterprise storage | PDF/A-4 |
Summary
PDF/A is the archival version of PDF intended for decades-long preservation. It requires embedded fonts, color profiles, and metadata while prohibiting features that depend on external resources or future software support. The most important distinction among versions is attachment support:
PDF/A-1: No file attachments.
PDF/A-2: Modern PDF features such as transparency and JPEG2000.
PDF/A-3: Any file type can be embedded as an attachment.
PDF/A-4 / PDF/A-4f: Based on PDF 2.0; these represent the current generation of archival PDF standards.
For modern document workflows that require attached XML, Excel, or other source files, PDF/A-3 and PDF/A-4f are typically the preferred choices.