Automate Your PDF Testing
Introduction
Automated testing of PDFs has traditionally been a challenging task in test automation. Bitmap imaging, the legacy state-of-the-art approach, has proven unreliable. Most automation teams have automated their end-to-end user tests but left PDF tests out. Instead, they have relegated PDF validation to manual testers. And, alas, manual testing is prone to error.
In this article, we will review the requirements of PDF testing. We will also cover approaches to automating PDF tests using Applitools.
Why PDF Testing?
More and more organizations are moving their business processes to online applications. The digital operating model requires that documents be produced electronically and sent to customers. Assume a customer visits an insurance company or a bank to open an account. Increasingly, this work occurs exclusively with electronic records. After a successful setup, the institution sends a digital copy of the record to the customer. PDF offers the most sophisticated document layout and the necessary security to serve as an electronic record. Account statements, invoices, receipts, documentation, and disclaimers get distributed as PDFs.
Organizations cannot own all possible UI failures. When users generate their own PDFs via the local operating system, they may encounter errors. But when organizations generate receipts, account statements, and other documents for customer download, the organizations clearly own problems with those. And, in regulated industries, PDF problems can incur regulator wrath in addition to customer frustration. Thus, testing the generated outputs is mandatory from both quality and legal perspectives.
Organizations use Applitools to uncover problems in their transactional or customer-related PDF documents.
What to Automate?
In regulated sectors like insurance, medical, and banking, end-user documents need to be accurate. Regulators who uncover inaccuracy in published documents can impose fines and other penalties. Organizations need to ensure that the PDFs are fully tested before being published to recipients.
Consider an application producing customer letters using a PDF template. Various sections of the PDF are dynamically updated with the customer data. A test for PDF needs to verify that both the content and layout of the output document are correct.
When testing for layout, the document should be fully formed with:
- the specific sections present
- in the right location and
- in the right order.
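The presence and order checks above can be sketched in a few lines of Python. This is only a minimal illustration and not how Applitools works: it assumes the page text has already been extracted into a string by some PDF library, and the function name `check_section_order` and the section titles are hypothetical. Verifying that a section sits in the right location needs coordinates or a visual comparison, which this text-only sketch does not attempt.

```python
from typing import List


def check_section_order(page_text: str, expected_sections: List[str]) -> List[str]:
    """Return a list of problems; an empty list means every section
    is present and appears in the expected order."""
    problems = []
    last_pos = -1  # position of the furthest section title found so far
    for title in expected_sections:
        pos = page_text.find(title)
        if pos == -1:
            problems.append(f"missing section: {title}")
            continue
        if pos < last_pos:
            problems.append(f"section out of order: {title}")
        last_pos = max(last_pos, pos)
    return problems


# Hypothetical letter text and section titles, for illustration only.
letter = "Policy Summary\nYour plan details\nCharges\nMonthly fee\nTerms\nSmall print"
print(check_section_order(letter, ["Policy Summary", "Charges", "Terms"]))
```

An empty result means the document passed; any returned message names the section that is missing or misplaced.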
When testing for content, testers need to ensure that the dynamically populated customer data is accurate.
A layout failure can impact the processing of the documents by downstream systems. Testers need to be able to check content on a page or on locations on all or selected pages.
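Checking content on all pages or on selected pages can be sketched as follows. The sketch assumes each page's text has already been extracted into a list of strings, one per page. The helper name `find_missing_values` and the sample data are hypothetical and serve only to illustrate the idea.

```python
from typing import Dict, Iterable, List, Optional


def find_missing_values(
    pages: List[str],
    expected_values: Iterable[str],
    page_numbers: Optional[Iterable[int]] = None,
) -> Dict[int, List[str]]:
    """Map each checked page number (1-based) to the expected values
    that were not found on it. Pages with nothing missing are omitted."""
    if page_numbers is None:
        selected = range(1, len(pages) + 1)  # default: check every page
    else:
        selected = list(page_numbers)
    expected = list(expected_values)
    missing: Dict[int, List[str]] = {}
    for number in selected:
        text = pages[number - 1]
        absent = [value for value in expected if value not in text]
        if absent:
            missing[number] = absent
    return missing


# Hypothetical two-page customer letter.
pages = ["Dear Jane Doe, account 12345", "Page 2 footer, account 12345"]
print(find_missing_values(pages, ["12345"]))                     # every page
print(find_missing_values(pages, ["Jane Doe"], page_numbers=[1]))  # page 1 only
```

Restricting `page_numbers` keeps a value that legitimately appears only on the first page, such as the customer's name, from being reported as missing elsewhere.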
How Have Organizations Traditionally Tested PDFs?
Let us consider the following two-page policy document, published by a large telco provider, which contains critical summary information about its services:
Usually, organizations validate the underlying data using API testing and then use solutions such as PDFBox to check that data on a page. However, most organizations rely on manual testing to validate the full document output. As organizations generate increasing numbers of electronic documents, they see the scope of effort needed for comprehensive testing. As a result, these organizations manually test only a sample of their documents.
Application of Visual AI in testing PDF
Applitools is an AI-powered visual testing platform. Using a range of algorithms, Applitools compares images with the context of the human eye and the speed of a computer. Applitools can test any user interface with 99.99% accuracy. Applitools reports only real differences visible to the human eye. These include any changes to color, contrast, position, size, or content.
In the case of a PDF, Applitools lets users inspect the entire document, or just selected pages. The scope of comparison matches the needs of the test. Users can target specific sections of a PDF for testing and ignore sections that are not relevant to a test. And, armed with the Applitools layout algorithm, users can validate a PDF layout even if the internal text has changed.
PDF Testing solution
Applitools PDF Tester is a codeless utility for automating the PDF testing of any document using Visual AI. You can validate the content in a page or a region across selected pages or all pages of the PDF.
Let us look at how the previous example can be tested using Applitools PDF Tester. Configure the test job as an XML file. Add the content validations as test assertions. You can create the XML manually, or you can build it programmatically using a script. Once completed, execute the PDF test using the Applitools PDF Test application. You can run the test from the command line using any batch process.
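As a rough sketch of building such an XML file with a script, the Python below uses the standard `xml.etree.ElementTree` module. The schema expected by the Applitools PDF Tester is not shown in this article, so the element and attribute names here (`testJob`, `assertion`, `document`, `page`, `expectedText`) are purely hypothetical placeholders, as is the helper name `build_test_job`. Substitute the names from the tool's own documentation.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_test_job(pdf_path: str, assertions: List[Dict[str, object]]) -> str:
    """Build an illustrative XML test job as a string.

    NOTE: the element and attribute names are hypothetical and do not
    reflect the real Applitools PDF Tester schema.
    """
    job = ET.Element("testJob", {"document": pdf_path})
    for item in assertions:
        ET.SubElement(
            job,
            "assertion",
            {"page": str(item["page"]), "expectedText": str(item["text"])},
        )
    return ET.tostring(job, encoding="unicode")


# Hypothetical assertions for a two-page policy document.
xml_text = build_test_job(
    "policy.pdf",
    [
        {"page": 1, "text": "Jane Doe"},
        {"page": 2, "text": "Account 12345"},
    ],
)
print(xml_text)
```

Generating the file from a script makes it straightforward to drive one job per customer record from the same data source that populates the PDF template.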
Following are the results:
The utility reports all the assertions in the PDF document and reports the result as ‘Passed’ or ‘Failed’. In addition, the utility tests the fully formatted output document against a baseline and reports any differences discovered.
After logging into the Applitools dashboard, we can review all the differences spotted by the AI:
Here, AI is highlighting all the content, size, color, positioning, and font differences. Engineers can validate the differences, ensuring document accuracy prior to publication. We can further instruct AI to specifically test or ignore sections of the page by using annotations.
Conclusion
While organizations have largely automated web and mobile application tests, they have still struggled to automate testing of PDFs. Using AI to test the completed document has proven to be an effective way to automate PDF tests. By adding both full document and dynamic data tests, teams can incorporate PDF testing in their end-to-end test automation.
Originally published at https://applitools.com on March 5, 2021.