What to expect
The evaluation and subsequent remediation of a PDF document is a multi-step process of automated and manual checks. This guide provides a high-level overview of metadata, tags, bookmarks, and general structure, as well as basic semantics and other best practices.
Within the Adobe Acrobat family, viewing and modifying PDF tags is a feature exclusive to Acrobat Pro. Third-party alternatives are available and likely offer similar functionality, but I'm assuming you're using Acrobat Pro given its widespread popularity.
Automated tools such as PAC 2021 will also be used but can be skipped entirely if desired.
When reviewing a document, the first thing I check is the presence of appropriate metadata.
What is metadata?
Metadata is a set of data points that describes a document, such as its title, author, composition date, and language, as well as the document's settings and options. Metadata is also used by search engine indexing services and should therefore not be omitted.
In Adobe Acrobat, the metadata is found in the Document Properties panel, where there are several data points to validate or populate. To view the panel, open the File menu and select Properties from the dropdown list.
The Description tab holds the document's title, author, subject, and keywords. The title should be unique and descriptive of the document, and is typically the same as the top-level heading. Should the document not contain a heading, a short descriptive text should be used.
The author is the name of the document's original author or organization, not the person who converted the document to PDF. While optional, the Subject and Keywords fields should be populated as they are used by search engine indexing services.
Initial view tab
In the Window options section is the option to either display the document's file name or title in the title bar. This should be set to display the title that was previously set in the Description tab.
If the document contains bookmarks, the Navigation Tab option should be set to Bookmarks Panel and Page. Otherwise, it can be left at Page Only.
Critical to screen readers, the document's language is set in the Advanced tab. Ensure this field is populated, either by choosing one of the preset languages or by entering a language tag built from IANA subtags, such as en-CA or fr-CA.
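As a rough illustration of what a well-formed language value looks like, here is a minimal Python sketch. It only checks the common shape of a tag (language, optional script, optional region) and does not look anything up in the IANA registry. The function name and pattern are my own assumptions, not part of Acrobat or any standard library.

```python
import re

# Simplified shape only (an assumption for illustration):
#   2-3 letter language, optional 4-letter script, optional 2-letter or 3-digit region.
_TAG_SHAPE = re.compile(r"[A-Za-z]{2,3}(-[A-Za-z]{4})?(-([A-Za-z]{2}|[0-9]{3}))?")


def looks_like_language_tag(value: str) -> bool:
    """Return True when the value has the general shape of a language tag."""
    return _TAG_SHAPE.fullmatch(value.strip()) is not None


# A language name typed in full does not have the shape of a tag.
for candidate in ("en-CA", "fr-CA", "English", ""):
    print(repr(candidate), looks_like_language_tag(candidate))
```

A value that passes this check can still be an unregistered tag, so treat it as a quick sanity check rather than validation.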
Often overlooked, bookmarks are an invaluable navigation aid and should not be omitted.
The general rule of thumb is that every heading should have a matching bookmark. If your document contains more than a handful of headings, is divided into sections, or otherwise spans multiple pages, it would likely benefit from the addition of bookmarks.
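The rule of thumb above can be sketched as a simple comparison. This Python example assumes you have already collected the heading text and the bookmark titles into two plain lists (how you extract them is up to you). The function name is hypothetical.

```python
def headings_without_bookmarks(headings: list[str], bookmarks: list[str]) -> list[str]:
    """Return the headings that have no bookmark with a matching title.

    Matching ignores letter case and differences in spacing.
    """
    def normalize(text: str) -> str:
        return " ".join(text.split()).casefold()

    bookmarked = {normalize(title) for title in bookmarks}
    return [heading for heading in headings if normalize(heading) not in bookmarked]


# Hypothetical sample data for illustration.
headings = ["What to expect", "Use of colour", "Other common issues"]
bookmarks = ["what to expect", "Other  common issues"]
print(headings_without_bookmarks(headings, bookmarks))
```

Any heading the function returns is a candidate for a new bookmark.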
Open the bookmarks panel and select individual bookmarks. Ensure they lead to the correct section in the document. If not, highlight the appropriate destination, right-click the bookmark and select Set Destination from the context menu.
Also ensure that the bookmarks are nested correctly. If not, individual entries can be dragged to the correct location in the tree.
To add a missing bookmark, highlight the desired text, right-click, and select Add Bookmark from the context menu. A new entry will be created with the selected text as the bookmark's title.
Tags are semantic markers that provide the non-visual structure of a document and are arguably the most important part of an accessible PDF. Every element is assigned a tag that provides information about the type of content enclosed within it.
In addition to providing structure, tags also serve as a reading order for a document. The tag tree is a sequential list of elements that screen readers follow.
The tag panel is not available by default and must be enabled in the Navigation Panes menu.
The process of navigating through tags is referred to as walking the tag tree. Select the first element in the list and press the down arrow to move to the next element. As you move down the tree, ensure every element is tagged and that each tag accurately represents the content found within it.
In some instances, it may be necessary to add, modify, or shuffle tags in the tag tree. Tags are managed with the Reading Order tool, which is used to select and tag elements.
If an element is tagged incorrectly, it can be corrected by typing in the correct tag manually or by right-clicking the tag and selecting Properties from the context menu. In the Properties panel, the correct tag can be selected from a list.
Using the Reading Order tool, click and drag around an element to create a selection. Assign the selection a tag by pressing the appropriate button in the Reading Order panel.
When a document is not tagged or the tag tree was generated incorrectly, the easiest way to populate the tag tree is with the auto-tagging tool in the Accessibility panel.
Page numbers found in footers must match the PDF reader's page numbering scheme. If the document displays "Page 8" in the footer, the expectation is that this matches with "Page 8" in the PDF reader's navigation as well.
Images, graphs, charts, and figures should have alternative text for assistive technologies. To verify whether an element has alternative text, find the element's tag in the tree, right-click, and select Properties from the context menu.
Use of colour
Ensure your document does not rely on colour alone to convey information, and that the colours used provide enough contrast.
At Level AA, WCAG specifies a minimum contrast ratio of 4.5:1 for normal text and 3:1 for large text (at least 18 pt, or 14 pt bold), which typically includes headings.
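To make the contrast thresholds concrete, here is a minimal Python sketch of the contrast ratio calculation as defined in WCAG 2.x (relative luminance of the lighter colour plus 0.05, divided by that of the darker colour plus 0.05). The function names are my own. Colours are assumed to be 8-bit sRGB triples.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light, per the WCAG 2.x definition."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple[int, int, int]) -> float:
    """Relative luminance of an sRGB colour, from 0.0 (black) to 1.0 (white)."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(fg: tuple[int, int, int], bg: tuple[int, int, int]) -> float:
    """Contrast ratio between two colours, from 1.0 to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg, bg, large_text: bool = False) -> bool:
    """Check a colour pair against the 4.5:1 (normal) or 3:1 (large text) minimum."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)


# Black text on a white page gives the maximum possible ratio.
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 2))  # 21.0
```

Colour values can be read from the document with any colour picker. The calculation itself does not depend on Acrobat.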
Tables should be simple, linear, and contain header cells and a caption where appropriate.
A table is considered linear when its data can be read from left to right, top to bottom, with no split, merged, combined, or empty cells. Columns and rows have the same layout throughout and the data's formatting is regular and predictable.
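The definition of a linear table can be approximated in a few lines of Python. This sketch assumes the table has already been extracted into a list of rows, each a list of cell strings, and that merged or split cells show up as rows of uneven length or as blank cells. Both are assumptions about the extraction step, which is not covered here.

```python
def is_linear(rows: list[list[str]]) -> bool:
    """Return True when every row has the same number of cells and no cell is empty."""
    if not rows:
        return False
    width = len(rows[0])
    return all(
        len(row) == width and all(cell.strip() for cell in row)
        for row in rows
    )


# Hypothetical sample: the second table has a blank cell, so it is not linear.
print(is_linear([["Name", "Score"], ["Alex", "12"]]))
print(is_linear([["Name", "Score"], ["Alex", ""]]))
```

A table that fails this check is a candidate for the redesign or splitting described in this section.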
Complex tables may need to be redesigned or split into multiple, smaller tables. Some tables may be converted to text format using lists and headings to delineate sections.
Tables must contain a row and/or column of header cells to provide context to their respective cells' data. To verify if a table is using header cells, locate the table in the tags panel and expand its tags. Table header cells will appear as <TH> elements, and data cells as <TD>.
While it can only perform high-level tests, the Accessibility Checker is a great tool for catching issues with alt text, missing tags, and various faults with the document's structure.
The following checks aim to ensure a comprehensive assessment and are best left to experienced users.
Using a screen reader will highlight potential issues that may have gone unnoticed in the tags or been missed by the automated validation. Issues such as non-standard characters, a broken reading order, or inaccurate alternative text can easily be missed during other steps.
PDF Accessibility Checker
The PDF Accessibility Checker (PAC) by the PDF/UA Foundation is a free and powerful tool that validates documents against the WCAG and PDF/UA (Universal Accessibility) standards.
Other common issues
Empty paragraphs used for spacing
In word processors, users often rely on multiple spaces or carriage returns to create space. This practice generates empty paragraphs, which some screen readers announce as "blank". These empty tags should be removed from the tag tree.
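As a sketch of what to look for, the Python example below flags empty paragraph tags in a flattened tag list. It assumes the tag tree has been exported as (tag, text) pairs in reading order. That format and the function name are hypothetical, chosen only for illustration.

```python
def empty_paragraphs(tags: list[tuple[str, str]]) -> list[int]:
    """Return the positions of <P> tags whose content is empty or only whitespace."""
    return [
        index
        for index, (tag, text) in enumerate(tags)
        if tag == "P" and not text.strip()
    ]


# Hypothetical flattened tag tree.
tag_list = [
    ("H1", "Annual report"),
    ("P", ""),
    ("P", "Revenue grew this year."),
    ("P", "  \n"),
]
print(empty_paragraphs(tag_list))
```

Only <P> tags are considered here. Other empty elements, such as a figure without text, need their own review.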
Charts and graphs
When a chart or graph is present, a text version of the data must also be provided. One method is to give a synopsis of the data in the chart's alternative text, followed by a table or full description of the data.