Document Readability and Searchability

What is document readability?

Document readability refers to how easily a reader can understand and engage with written content. Searchability refers to how easily specific information can be found within a document, whether by searching for keywords or by navigating headings, subheadings, and hyperlinks.

Why is it important? 

Document readability and searchability are crucial for all users because they improve the overall user experience. They make it easier to locate information quickly and integrate notes, and they help ensure that everyone, including people who use assistive technologies, can access and interact with the content effectively.

Web Content Accessibility Guidelines (WCAG) success criteria

Ensuring that document text is recognizable as text helps meet Success Criterion 1.1.1: Non-text Content.

How To Ensure Document Readability

The quality of scanned documents can affect readability due to visual residue and distortions. 

An example of a bad scan, see the description below for more info

Figure 1: Example of a poorly scanned file. Numerous speckled artifacts throughout the document, inconsistent inking, slight text skewing, and blurred characters would significantly impair OCR software's ability to accurately recognize the text.

 

An example of a good scan, see the description below for more info

Figure 2: Example of a high-quality scan. It has sharp, clear text rendering, consistent contrast, proper alignment, clean margins, well-defined character edges, and no visible artifacts or distortions, all of which enable OCR software to accurately recognize and convert the text content.

Testing a Document for Searchability

  1. Use a PDF reader such as Adobe Acrobat Pro and try highlighting a portion of the text. If you can select individual words or sentences, it is likely the text is recognizable.
  2. To verify text recognition, use the Search function to look for a keyword or phrase that appears in the document, and check that the search locates it.

If you are unable to highlight words or the search returns no results, the document is not text recognizable and requires optical character recognition (OCR) to convert images into text.
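For readers who check many files at once, the two manual tests above can be approximated in a short script. The following is a minimal sketch, not a definitive tool: it assumes the text of each page has already been extracted by a PDF library of your choice and passed in as a list of strings. The function names, the `min_chars` threshold, and the sample data are hypothetical.

```python
def find_unsearchable_pages(page_texts, min_chars=20):
    """Return 1-based page numbers whose extracted text is nearly empty.

    A page with almost no extractable characters is probably a scanned
    image that still needs OCR. The min_chars threshold is an assumption.
    """
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if len("".join(text.split())) < min_chars
    ]


def keyword_pages(page_texts, keyword):
    """Return 1-based page numbers containing the keyword (case-insensitive)."""
    needle = keyword.casefold()
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if needle in text.casefold()
    ]


# Hypothetical sample: page 2 is a scanned image, so extraction returned nothing.
sample_pages = [
    "Course syllabus: accessibility and universal design overview.",
    "",
    "Week 3 reading on assistive technologies and screen readers.",
]

print(find_unsearchable_pages(sample_pages))
print(keyword_pages(sample_pages, "Assistive"))
```

With this sample, page 2 is flagged as needing OCR and the keyword is found on page 3. A page that passes this check can still contain recognition errors, so a manual spot check remains worthwhile.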

Use Optical Character Recognition (OCR) 

Ensure text is recognized as text, not an image, to support accessibility for assistive devices and to facilitate search and note-taking functions. 

If you are working with a PDF file, use the Optical Character Recognition (OCR) tool to produce an editable text file. This is only one step in the process of document accessibility, and you may need to tag the document after using the OCR tool. 

Video Tutorial

Watch the how-to video from Illinois State University for using Adobe Acrobat to add OCR.

If the OCR process produces poor text recognition, please refer to the "Other Considerations" section for additional guidance on improving document quality and ensuring accessibility. This section provides tips and tools to enhance text recognition and address common issues with scanned documents. 

Other Considerations 

Improving OCR (Optical Character Recognition) results can significantly enhance the accuracy and usability of digitized documents. Here are some tips to help you achieve better OCR results: 

  1. Use high-resolution scans to capture clear and detailed text.
  2. Avoid scans that are highlighted, underlined, or marked with handwritten notes, which may interfere with OCR processing.
  3. Optimize image quality:
     a. Enhance contrast: Make the text stand out more clearly against the background. Use tools like Adobe Photoshop to make the text darker.
     b. Reduce noise: Clean up any unwanted marks, smudges, lines, dark borders, or other areas that might interfere with text recognition.
     c. Correct orientation: Make sure the text is properly aligned and not skewed or tilted. Horizontal text lines improve OCR accuracy.
  4. Increase text size to help the program recognize it accurately.
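To illustrate what the contrast and noise tips above do to a scan, here is a minimal sketch in plain Python. It assumes a grayscale image stored as a list of rows of pixel values from 0 (black) to 255 (white). The function names and sample data are hypothetical, and in practice an image editor or an imaging library would perform these steps.

```python
from statistics import median_low


def stretch_contrast(image):
    """Linearly rescale pixels so the darkest becomes 0 and the lightest 255."""
    flat = [pixel for row in image for pixel in row]
    low, high = min(flat), max(flat)
    if high == low:  # uniform image: nothing to stretch
        return [row[:] for row in image]
    return [
        [round((pixel - low) * 255 / (high - low)) for pixel in row]
        for row in image
    ]


def median_filter(image):
    """Replace each pixel with the median of its 3x3 neighborhood.

    Isolated specks are outvoted by their neighbors and disappear.
    Neighborhoods are clipped at the image edges.
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbors = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(median_low(neighbors))
        result.append(row)
    return result


# Hypothetical low-contrast patch: gray "ink" (100) on a light-gray page (180).
faded = [[180, 100, 180], [180, 100, 180], [180, 100, 180]]
print(stretch_contrast(faded))

# Hypothetical clean patch with a single dark speck in the middle.
speckled = [[180, 180, 180], [180, 100, 180], [180, 180, 180]]
print(median_filter(speckled))
```

The contrast stretch pushes the faded ink to pure black and the page to pure white, while the median filter removes the single speck. A median filter can also erase very thin strokes, so it is best applied lightly and checked against the original.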

When possible, we recommend contacting the UNM Library to schedule a consultation. Library staff can help you locate digital files or EPUB (electronic publication) versions to replace scanned readings you use in your course. Files available through the UNM Library are often a more accessible starting point, and you can request that readings available through the UNM Library be added to Course Reserves for your class.

If the document remains inaccessible even after you apply these tips or seek assistance from the library, consider whether it is integral to the learning objectives of your course. If it is, accessibility improvements must be made to ensure compliance with Title II of the ADA and with universal design principles.