The PDF Accessibility Checker: Strengths and Weaknesses
From invoices for online shopping to official notices: PDF documents are ubiquitous today. Numerous companies and organizations use the file format to display and exchange information. But are PDFs also accessible? The short answer: It depends!
Photo: © ANTONI SHKRABA production / pexels.com
The PDF/UA standard was published in 2012. It defines how a PDF document can meet accessibility requirements. Unfortunately, even today many documents do not comply with this standard.
To prevent this from happening to you, you should check your PDFs for accessibility. The best place to start is the PDF Accessibility Checker, or PAC for short. Version 2024 of this free standard tool was released recently: https://pac.pdf-accessibility.org/en
I have been using the tool for several years now and appreciate its strengths. However, the automated tool does not replace a manual check, for example with a screen reader and keyboard. I will therefore give you a brief overview of the strengths and weaknesses of the PDF Accessibility Checker.
The tool's strengths
The PAC is a tool for automatically checking PDF documents for accessibility in accordance with the PDF/UA standard (DIN/ISO 14289-1). In addition, relevant points of the Web Content Accessibility Guidelines (WCAG) and additional quality features are checked.
The PAC can automatically detect the following barriers, among others:
- No title is set for the PDF document.
- The language of the document is not defined.
- The contents of the PDF document are not tagged.
- An image does not have an alternative text.
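To illustrate why barriers like these lend themselves to automation, here is a minimal Python sketch. It does not show how PAC works internally and does not read real PDF files; the `Document` and `Element` classes are a simplified, hypothetical model invented for this example. Each check only asks whether a value is present, never whether it makes sense.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Element:
    """One piece of content in a simplified, hypothetical document model."""
    kind: str                       # "text" or "image"
    tag: Optional[str] = None       # structure tag such as "P", "H1", "Figure"
    alt_text: Optional[str] = None  # only relevant for images


@dataclass
class Document:
    title: Optional[str] = None
    language: Optional[str] = None
    elements: List[Element] = field(default_factory=list)


def find_barriers(doc: Document) -> List[str]:
    """Return the barriers from the list above that a presence check can find."""
    barriers = []
    if not doc.title:
        barriers.append("No title is set for the PDF document.")
    if not doc.language:
        barriers.append("The language of the document is not defined.")
    if any(el.tag is None for el in doc.elements):
        barriers.append("The contents of the PDF document are not tagged.")
    if any(el.kind == "image" and not el.alt_text for el in doc.elements):
        barriers.append("An image does not have an alternative text.")
    return barriers


# Every value is present, so no barrier is reported, whatever the values say.
test_doc = Document(
    title="Some title",
    language="en",
    elements=[
        Element(kind="text", tag="P"),
        Element(kind="image", tag="Figure", alt_text="Some alternative text"),
    ],
)
print(find_barriers(test_doc))  # prints []
```

An empty `Document()` would instead be reported with a missing title and a missing language.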
In its WCAG tab, the tool also shows transparently which guidelines it checks and which it does not. Guidelines such as “2.1 Keyboard Accessible” and “3.2 Predictable” can only be checked manually.
The tool's weaknesses
The PDF Accessibility Checker can check the code of a PDF document according to certain rules. However, it cannot understand and interpret the content of the document. Automated checking tools generally cannot do this.
To show the tool's weaknesses, I conducted an experiment: How many barriers can I include in a PDF document without the PAC sounding the alarm? My test document consists of several headings, body text and an image. You can download the document and audit it yourself.
Let's now take a closer look at the individual barriers.
No descriptive title
My test document has the title “A Fabulous Title”. PAC is satisfied with this. However, when screen reader users hear this title, they won't be very happy about it.
The title of a website or document is the first piece of information that users come across. The title should describe the content and provide orientation for users. The relevant WCAG success criterion 2.4.2 states:
Web pages have titles that describe topic or purpose.
Therefore, during a manual check, a human must assess whether the set title of the PDF document is meaningful enough.
Incorrectly tagged headings
As PAC correctly states, the entire content of my test document is tagged. This means that for each element in the PDF, the type of content it represents is defined. This information is essential for screen reader users, among others, in order to correctly understand the content and quickly navigate through the document.
However, I have fooled PAC: All headings, such as “A fabulous heading”, are marked as body text with the <P> tag. While sighted people can visually recognize the headings as such, this information is not captured semantically. The WCAG success criterion 1.3.1 requires:
Information, structure, and relationships conveyed through presentation can be programmatically determined or are available in text.
Automated tools such as the PAC cannot assess what should be tagged as a heading and what should not. Therefore, a person should have the PDF document read aloud by a screen reader. In addition, PAC itself also offers the useful screen reader preview function, which visualizes the tagging structure.
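A script can at most point a reviewer to suspicious places. The following Python sketch is one possible heuristic, not a PAC feature: it flags text that is tagged <P> but set in a clearly larger font than the body text. The `TextBlock` class, the font sizes and the threshold `ratio` are assumptions made up for this example, and every hit still needs a human decision.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class TextBlock:
    text: str
    tag: str          # structure tag, e.g. "P" or "H1"
    font_size: float  # in points; assumed to be known from the layout


def heading_candidates(blocks: List[TextBlock], ratio: float = 1.2) -> List[TextBlock]:
    """Return <P> blocks whose font is clearly larger than the body text.

    The body size is taken to be the most common font size; `ratio` is an
    arbitrary threshold chosen for this sketch.
    """
    if not blocks:
        return []
    body_size = Counter(b.font_size for b in blocks).most_common(1)[0][0]
    return [b for b in blocks if b.tag == "P" and b.font_size >= body_size * ratio]


# Invented example values: one large line and two lines of body text.
blocks = [
    TextBlock("A fabulous heading", tag="P", font_size=18.0),
    TextBlock("First paragraph of body text.", tag="P", font_size=11.0),
    TextBlock("Second paragraph of body text.", tag="P", font_size=11.0),
]
for block in heading_candidates(blocks):
    print(f"Review manually: {block.text!r} is tagged <P> but looks like a heading")
```

Such a heuristic misses headings that are set in the body font size and flags large text that is not a heading, which is why it can only support the manual check.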
No meaningful sequence
Take another close look at the screenshot above. Do you notice anything? Exactly, the image is visually in a different position than in the screen reader preview.
In a tagged PDF, the reading order is stored separately from the visual layout: it follows the order of the tags in the structure tree. This allows a logical reading order to be defined, especially for documents with complex layouts. For example, when blind users navigate through the document with a screen reader, the virtual cursor of the screen reader follows this sequence.
Therefore, WCAG success criterion 1.3.2 defines the following accessibility requirement:
When the sequence in which content is presented affects its meaning, a correct reading sequence can be programmatically determined.
During a manual check, a person with an activated screen reader should therefore navigate sequentially through the document and consider whether the sequence of content makes sense. The visual presentation serves as a guideline here. However, a different reading order may also be meaningful.
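The comparison between tag order and visual order can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not a real PDF analysis: the `Item` class and the positions are hypothetical, and the visual order is reduced to a naive top-to-bottom sort of a single-column page.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Item:
    name: str
    top: float  # distance from the top of the page; smaller means higher up


def order_mismatches(tag_order: List[Item]) -> List[Tuple[str, str]]:
    """Pair up the tag order with a naive top-to-bottom visual order.

    Returns (read_by_screen_reader, seen_visually) pairs for every position
    where the two sequences disagree.
    """
    visual_order = sorted(tag_order, key=lambda item: item.top)
    return [
        (tagged.name, visual.name)
        for tagged, visual in zip(tag_order, visual_order)
        if tagged is not visual
    ]


# Hypothetical page: visually the image sits between heading and body text,
# but it comes last in the tag order.
tag_order = [
    Item("heading", top=50),
    Item("body text", top=400),
    Item("image", top=120),
]
for read, seen in order_mismatches(tag_order):
    print(f"Screen reader reaches {read!r} where the layout shows {seen!r}")
```

A reported mismatch is only a prompt to take a closer look. As noted above, a reading order that differs from the layout can still be meaningful, so the final judgment remains with the person testing.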
Alternative text does not describe the image
The PDF Accessibility Checker only checks images to see whether they are tagged correctly and whether an alternative text is defined. However, the WCAG success criterion 1.1.1 requires more:
All non-text content that is presented to the user has a text alternative that serves the equivalent purpose, except for the situations listed below. [...]
The picture in my test document shows the blue, open sea off the coast of Mallorca. The alternative text of the image, however, reads “A dust-dry desert”. An automated tool cannot recognize that this is nonsense.
The PDF Accessibility Checker is a great tool for auditing a PDF document for major violations. However, it does not replace manual testing. A comprehensive accessibility audit should include both automated and manual steps.
PS: Also test the corrected version of my test document with a screen reader to hear the difference for yourself.