The Adobe Portable Document Format (PDF) provides a convenient way to transport, view, and print electronic documents. PDF files are typically created in other applications like Microsoft Word and PowerPoint. They can also be produced by scanning the pages of a book or journal article. At Colorado State University, PDFs are often posted on RamCT or the Library’s E-Reserve as a way of sharing articles and lecture organizers with students.
This module examines some of the features that make PDF documents more usable by a diverse audience. It also provides techniques for improving the accessibility of existing and new PDF documents.
Note: This module refers to two Adobe software applications: Reader and Acrobat. Adobe Reader is a free application that is limited to viewing, searching, and printing PDF files. Adobe Acrobat, by contrast, is a full-featured application that allows for the creation and editing of PDF documents.
This document assumes you have Adobe Acrobat version 7 or later and Microsoft Office 2003 installed on your computer. [Editor’s note: Acrobat 8 and Microsoft Office 2007 were just arriving at the time of publication. An updated version of this document is scheduled for spring of 2008.]
The Features of an Accessible PDF
Images of Text vs. Real Text
One of the biggest barriers to PDF accessibility occurs with “image-only” documents. An image-only PDF contains no actual text, only an image of text. This type of PDF is typically created by scanning a printed page. Lacking any real text, an image-only PDF does not support valuable features such as searching, highlighting, text copy, and text reflow; nor will it allow the use of Adobe’s magnification and reading tools.
There are two solutions to the “image-only problem.” One is to recreate the PDF directly from the electronic source document. When that is not possible, the second solution involves converting the image back to real text using optical character recognition software (see OCR below).
Bookmarks
Each PDF document contains a navigation pane called “Bookmarks.” Each bookmark can be linked to a specific location in the document. It can also be linked to web pages or multimedia content. As a group, bookmarks function as a table of contents. They provide a conceptual display of the document structure and help users understand where they are relative to the rest of the document.
Bookmarks can easily be added and edited in Adobe Acrobat. To create a bookmark, choose Edit > Add Bookmark (Ctrl+B). You can also create a bookmark from the Navigation pane: choose Options > Add Bookmark (Ctrl+B). (See Figure 2-a.) If you select some text from the document prior to creating the bookmark, that text will automatically become the bookmark’s target (i.e., its linked destination) and label.
To edit an existing bookmark, first click on it, then choose Options > Properties. Alternatively, you can right-click on the bookmark and choose Properties (Ctrl+I). (See Figure 2-b.) Bookmarks can also be generated automatically from the headings of a Microsoft Word document (see Converting Electronic Documents to PDF below).
Tags
What are PDF tags? The technical answer is this: Tags define the function and order of content in a document. Many PDF tags resemble their counterparts in HTML. For example, there is a <p> tag for paragraphs, <table> and <td> for tables and table cells, <h1> for level 1 headings, etc. Although they are normally invisible, tags provide valuable interpretive cues to assistive technology like screen reader software, and are therefore an essential part of PDF accessibility.
Fortunately, tags are generated automatically during conversion from Microsoft Office applications using the “Adobe PDF” menu. The Adobe PDF menu is a part of Acrobat PDFMaker, an application added to Office applications when Acrobat is installed.
Tags can also be added manually in Acrobat by choosing Advanced > Accessibility > Add Tags to Document. Once applied, tags can be edited to improve or customize document organization.
A Closer Look at Tags
The tags of a PDF document are organized according to a logical structure tree. Assistive technology software depends on this structural information to determine the appropriate reading order of text and to convey the meaning of images and other content in an alternative format, such as text-to-speech. Because untagged documents lack a logical structure tree, Acrobat must infer a structure based on the Reading Order preference setting. This often results in page items being read in the wrong order or not at all. For more information about tags, reading order, and the logical structure tree, see the Adobe Acrobat Help files.
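The idea of a logical structure tree can be illustrated with a small sketch. The `Tag` class and `reading_order` function below are hypothetical names invented for this illustration; they are not part of Acrobat or of any PDF library. The sketch only shows how walking a tree of tags from top to bottom yields one unambiguous reading order, which is what assistive technology relies on.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Tag:
    """One node of a simplified, hypothetical logical structure tree."""
    kind: str                      # tag type, e.g. "H1", "P", "Figure"
    text: str = ""                 # text content, or alternative text for a figure
    children: List["Tag"] = field(default_factory=list)


def reading_order(tag: Tag) -> Iterator[str]:
    """Yield content depth-first: each tag first, then its children in order."""
    if tag.text:
        yield f"{tag.kind}: {tag.text}"
    for child in tag.children:
        yield from reading_order(child)


# A tiny example document (illustrative content only).
document = Tag("Document", children=[
    Tag("H1", "Zoom and Reflow"),
    Tag("Sect", children=[
        Tag("P", "Reflow rewraps lines of text."),
        Tag("Figure", "Screenshot of the View menu"),
    ]),
    Tag("P", "Choose View > Reflow."),
])

for line in reading_order(document):
    print(line)
```

An untagged document has no such tree, so the order has to be guessed from the position of items on the page.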
Zoom and Reflow
One advantage of the PDF format is its ability to magnify a page up to 6400%. However, when magnification is increased, text can be pushed off-screen so that it is no longer visible without scrolling left or right. To make the document easier to read with magnification, Adobe offers a function called Reflow.
Reflow rewraps lines of text to fit within the display window, making scrolling unnecessary. This feature also works with the small screens of mobile internet devices. Note that Reflow works only when a document contains real text and is tagged.
To use the Reflow feature, choose View > Reflow (Ctrl+4).
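As a rough illustration of what rewrapping means, the sketch below refits a block of text to a narrower window using Python's standard `textwrap` module. It measures width in characters, which is a simplification: Acrobat reflows the actual page layout. The function name `reflow` and the sample text are assumptions made for this example.

```python
import textwrap
from typing import List


def reflow(text: str, window_width: int) -> List[str]:
    """Rewrap text so that no line is wider than the window (in characters)."""
    normalized = " ".join(text.split())   # collapse the original line breaks
    return textwrap.wrap(normalized, width=window_width)


sample = (
    "Reflow rewraps lines of text to fit within the display window,\n"
    "making scrolling unnecessary."
)

for line in reflow(sample, 30):
    print(line)
```

Because the function needs real characters to rewrap, it also shows why an image-only page cannot be reflowed.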
Creating an Accessible PDF
Scanning to PDF
It is best to create a PDF from an original electronic document (see Converting Electronic Documents to PDF below). However, if scanning is the only method available, take care with the scanning process. Use a clean copy of the article and place it squarely on the scanner. Avoid pages that have been photocopied multiple times. Obtaining a clear image of the text is essential to the next step in the creation of an accessible PDF document: optical character recognition.
Optical Character Recognition (OCR)
To create a more versatile version of your scanned document, use OCR to convert the image of text to real, editable text. Many scanners offer a “copy as text” feature, which is useful for small blocks of text. Multipage documents can also be scanned and converted to text using OCR, with all of the text stored in a single file.
Adobe Acrobat should bring up a prompt to run OCR when scanning a document, or the option can be chosen from the menu: Document > Recognize Text Using OCR > Start. Note that character recognition is not foolproof, and proofreading is always recommended after using OCR.
The accuracy of OCR is affected by several factors:
- The quality of the source image (i.e., whether the text is crisp and legible)
- The typeface (font) of the original text
- Whether the page was squarely positioned on the scanner
- The resolution of the scanned image, often measured in dots per inch (DPI)
- The quality of the OCR software
OCR tools vary in quality and accuracy. The OCR engine built into Adobe Acrobat is adequate for most jobs. Professional-level tools like OmniPage Pro, ABBYY FineReader, or Adobe Capture may be better for large or difficult jobs. Scan resolution also affects the success of OCR. Setting the scanner to 300 dpi usually yields good results, although occasionally a setting of 600 dpi may be required, especially when working with small type. Resolutions higher than 600 dpi produce larger files with no increase in OCR accuracy.
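The effect of resolution on file size follows from simple arithmetic: doubling the dpi doubles both the width and the height of the scanned image, so the number of pixels grows fourfold. The sketch below works this out for a US Letter page (8.5 x 11 inches). The page size is an assumption chosen for the example, and the figures describe pixel counts only, not the size of a compressed PDF.

```python
from typing import Tuple

# Assumption: a US Letter page; substitute your own page dimensions in inches.
PAGE_WIDTH_IN = 8.5
PAGE_HEIGHT_IN = 11.0


def scan_pixels(dpi: int) -> Tuple[int, int]:
    """Return the (width, height) of the scanned image in pixels."""
    return round(PAGE_WIDTH_IN * dpi), round(PAGE_HEIGHT_IN * dpi)


def pixel_count(dpi: int) -> int:
    """Return the total number of pixels in the scanned page."""
    width, height = scan_pixels(dpi)
    return width * height


for dpi in (300, 600, 1200):
    width, height = scan_pixels(dpi)
    print(f"{dpi} dpi: {width} x {height} pixels ({pixel_count(dpi):,} total)")
```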
Be aware that many free and inexpensive PDF creation tools may not produce high quality, usable results. Check the output specifications of the software you wish to use to determine if it can create PDFs with the accessibility features outlined in this document.
Converting Electronic Documents to PDF
The best way to create a PDF is to convert an existing electronic document to the PDF format. Creating a PDF from an electronic original is simple and fast, and produces a more accessible, tagged PDF document. Once Adobe Acrobat software is installed, a plug-in or helper application called PDFMaker is added to supported Microsoft applications, including Word, PowerPoint, Internet Explorer, and others. PDFMaker adds a new menu, Adobe PDF, to the menu bar or ribbon of each of these applications.
Microsoft Office 2003
Many of Word’s structural elements, including headings, tables, alternative text descriptions of images, and tables of contents will transfer to PDF using the Adobe PDF menu. In fact, applying styles to create these structural elements in Word is one of the most important steps in the creation of an accessible PDF.
During the conversion to PDF, Word headings become “bookmarks,” which function as a table of contents and navigation menu in the PDF document (see Bookmarks above). To control which headings and styles are converted to bookmarks, choose Adobe PDF > Change Conversion Settings. Select the "Convert Word Styles to Bookmarks" checkbox, then deselect any heading levels you do not wish to appear as bookmarks in the PDF.
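The relationship between heading levels and bookmarks can be sketched in a few lines. This is not PDFMaker's actual algorithm, only an illustration of the selection step described above: headings whose level is checked become bookmarks, and the rest are skipped. The function name `headings_to_bookmarks` and the sample headings are assumptions made for this example.

```python
from typing import Iterable, List, Set, Tuple


def headings_to_bookmarks(
    headings: Iterable[Tuple[int, str]], levels: Set[int]
) -> List[str]:
    """Keep headings whose level is selected, indented to show nesting."""
    return [
        "  " * (level - 1) + title
        for level, title in headings
        if level in levels
    ]


# (heading level, heading text) pairs, as they might appear in a Word document.
headings = [
    (1, "Creating an Accessible PDF"),
    (2, "Scanning to PDF"),
    (2, "Converting Electronic Documents to PDF"),
    (3, "Microsoft Office 2003"),
    (3, "Microsoft Office 2007"),
]

# Only Heading 1 and Heading 2 are selected, so level 3 headings are left out.
for bookmark in headings_to_bookmarks(headings, {1, 2}):
    print(bookmark)
```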
Note that “Adobe PDF” may also be a choice in the Print dialog box once Adobe Acrobat has been installed. Unfortunately, PDFs created using the Print option are “untagged,” making them largely inaccessible (see Tags above).
To create a PDF file from any Microsoft Office application, choose Adobe PDF > Convert to Adobe PDF. If you’re starting with a Word document, make sure the document is structured using Styles and headings.
Figure 3: Creating an Accessible PDF
To create an accessible PDF from Word, use the Adobe PDF menu. Choose Adobe PDF > Convert to Adobe PDF.
For more information about converting a Word document to PDF, see Preparing for Conversion to Accessible HTML and PDF. Also see the module Microsoft Word, Universally Designed.
Microsoft Office 2007
New in Microsoft Office 2007 applications, including Access, Excel, PowerPoint, and Word, is the ability to create tagged PDF content without the use of Adobe Acrobat software. In order to enable this capability, a download (“add-in”) from Microsoft needs to be installed on the computer, and the document must be styled appropriately. The download can be found at 2007 Microsoft Office Add-in: Microsoft Save as PDF.
Forms
PDF forms provide a convenient way to present and gather form data. Simple forms are static presentations of information and blank fields, often in a table format. Such forms can be printed, filled out by hand, and mailed to their final destination. More complex forms are interactive, allowing the user to fill in the data electronically and even submit the information online with the click of a button.
Like all PDF documents, forms require care in their creation to ensure they can be used by everyone, including those who must navigate using only the keyboard. Fortunately, interactive, accessible PDF forms have become easier to produce. Using the new Adobe Designer application that comes with Acrobat Professional version 7 and higher, it is now relatively simple to create a form that e-mails data to a specific address or stores the information in a database.
To avoid user confusion, it is helpful to label forms in a way that indicates how they should be used. For example, you might label a form “interactive,” “fillable,” or “printable.”
To create a new form in Adobe Acrobat, choose Advanced > Forms > Create New Form. Adobe Designer will open.
For an example of an interactive form, see Purchase-Order.pdf. A full tutorial is also available through the Adobe Designer Help menu. Choose Help > Adobe Designer Help, then select the Quick Start Tutorials.
Increasing the Accessibility of an Existing PDF
Reducing File Size
Once a PDF file has been created, you can do several things to increase its accessibility. One option is to reduce its file size and therefore its download time. Another is to enable a feature called Fast Web View. Both options are available from the PDF Optimizer dialog box. Choose Advanced > PDF Optimizer.
PDF Optimizer provides options for downsampling and compressing images, removing invalid links, and discarding items like embedded fonts, user comments, etc. Be careful not to discard elements that increase the accessibility and usability of the document, like tags and bookmarks. Also, consider carefully whether the PDF will be printed, because downsampled images, especially images containing text or small details, may become blurry and unreadable.
Adding Descriptive Properties
Each Microsoft Office document contains Properties: information about the author, keywords that describe the content, and more. Document properties can be assigned via File > Properties (MS Word 97-2003). Similarly, Adobe PDF documents created in Acrobat 5.0 or later contain the same types of properties, which in Acrobat are referred to as “metadata.”
During conversion to PDF, Acrobat preserves document properties from Microsoft Office applications. This metadata in a PDF file can then be indexed by internet search engines, making the PDF more “findable” on the Web and allowing its title to be displayed in a list of search results.
To add or edit PDF metadata, choose File > Document Properties (Ctrl+D), then choose the “Description” tab. When entering keywords, choose terms that others might use when searching for your document. Language and Copyright Information can also be added to a PDF using the "Advanced" tab under File > Document Properties.
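Before publishing, it can help to confirm that none of the descriptive fields has been left blank. The sketch below checks a plain Python dictionary of metadata values. It does not read a PDF file; the field list (Title, Author, Subject, Keywords) is assumed to match the entries on the Description tab, and the function name `missing_fields` is invented for this example.

```python
from typing import Dict, List

# Assumption: the descriptive fields to check, mirroring the Description tab.
DESCRIPTION_FIELDS = ("Title", "Author", "Subject", "Keywords")


def missing_fields(metadata: Dict[str, str]) -> List[str]:
    """Return the names of descriptive fields that are absent or blank."""
    return [
        name
        for name in DESCRIPTION_FIELDS
        if not str(metadata.get(name, "")).strip()
    ]


# Illustrative values only; Author is blank and Subject is not set at all.
metadata = {
    "Title": "Adobe PDF, Universally Designed",
    "Author": " ",
    "Keywords": "PDF, accessibility, tags, bookmarks",
}

print(missing_fields(metadata))
```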
Figure 5: Document Properties Become “Metadata”
Many of Word’s “Document Properties” transfer to Adobe PDF. You can view them by choosing File > Document Properties.
For a more complete discussion of document properties and metadata, in Word and Adobe PDF, see “Document Title and Other Metadata” in the tutorial Preparing for Conversion to Accessible HTML and PDF.
Security and Usability Issues
PDF files can be locked so that users can view the document, but not change, extract, copy, or print its contents. In the most recent versions of Acrobat, the default security settings do not preclude use by assistive technologies, unless printing is part of the way the reading programs access the material. Unfortunately, several common reading programs do work in this manner.
In order to preserve accessibility, security settings must not prevent legitimate, legal adaptations of copyrighted works for people with disabilities. They must allow for the production of captions, audio descriptions, subtitles, and dubbing. Consider carefully whether you really need to apply security to a PDF document, especially if it will be posted in a password-protected environment like RamCT.
Checking for Accessibility
Before publishing your PDF, perform an accessibility check to confirm that the reading order is correct, that reflow will work, that the page contains real text, and that any form will read correctly. One method is to save the document as a “Text (Accessible) (*.txt)” file, then open the file in a text editor or word processor. Spot check the document in places where there may be concerns about the reading order of the content. Content that is read by a screen reader will be identified in brackets [ ].
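The spot check on the exported text file can be partly automated. The sketch below takes the exported text as a string and confirms that a few phrases you expect to find appear in the expected order. The function name `phrases_in_order` and the sample text are assumptions made for this example; in practice the string would come from reading the saved .txt file.

```python
from typing import Sequence


def phrases_in_order(exported_text: str, phrases: Sequence[str]) -> bool:
    """Return True if every phrase occurs, each one after the previous one."""
    position = 0
    for phrase in phrases:
        found = exported_text.find(phrase, position)
        if found == -1:
            return False          # phrase missing, or it appears too early
        position = found + len(phrase)
    return True


# Stand-in for the contents of an exported "Text (Accessible)" file.
exported_text = (
    "Zoom and Reflow\n"
    "One advantage of the PDF format is its ability to magnify a page.\n"
    "To use the Reflow feature, choose View > Reflow.\n"
)

print(phrases_in_order(exported_text, ["Zoom and Reflow", "magnify", "choose View"]))
print(phrases_in_order(exported_text, ["choose View", "Zoom and Reflow"]))
```

A failed check is only a prompt to look more closely at that part of the document; it does not replace reading the exported text.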
Another method involves Acrobat’s own Accessibility Checker. In Acrobat, choose Advanced > Accessibility > Full Check. If accessibility problems are found, fix them by following the instructions provided in the Accessibility Report. In many cases, the simplest solution may be to return to Word or the original authoring application, fix the problem, then recreate the PDF. For example, missing alternative text for images can usually be resolved in this way.
Universal design emphasizes flexibility and multiple representations of information, and the Adobe Portable Document Format is a useful tool for achieving that goal. However, for PDF documents to be universally usable and accessible, they must be created with accessibility in mind using the tools and techniques outlined in this module.
The elements that increase a PDF document’s usability are the same things that make it universally designed:
- Clean and consistent presentation with information clearly organized
- Search engine friendly (use metadata)
- User input, output, and interaction available (avoid unnecessary security settings)
- Easy to navigate (include bookmarks and make sure the document is tagged)
- Loads quickly (reduce file size)
Resources
- Adobe Reader 7.0 Quick Start Guide
  A short user's guide (in Word format) from the Assistive Technology Resource Center (ATRC) at Colorado State University that highlights the Read Out Loud feature.
- Is PDF accessible?
  An overview of PDF barriers to accessibility, both historically and for current versions, from the University of Washington AccessIT project.
- PDF Accessibility
  An older archived Placeware presentation by the University of Washington AccessIT that plays in Windows Media Player and is fully voiced.
- Accessibility Resource Center
  Accessibility examples, links and resources, case studies, White Papers, Tools, Developer Center, and Accessibility by Product for Adobe and Macromedia.
- The Adobe Acrobat 7.0 family and accessibility
  Overviews, Frequently Asked Questions (FAQs), and How-to guides, as well as the differences in levels of functionality between the Acrobat 7.0 products.
- Review of CommonLook™ Section 508 Plug-In for Adobe Acrobat
  WebAIM's review of this commercially available PDF creation tool offers good tips on who it works best for, and information about its strengths and weaknesses.
- CommonLook™ Section 508 Plug-In for Adobe Acrobat
- Creating Accessible PDFs
  The High Tech Center Training Unit of the California Community Colleges has created a very complete guide to creating accessible PDFs (in accessible PDF format). They also offer an overview and a presentation on accessible forms.
- 2007 Microsoft Office Add-in: Microsoft Save as PDF
  Microsoft's PDF writer will create tagged PDFs from styled content created in Microsoft applications.