Inspired by the Isartor test set for validating PDF/A compliance, we are working on a similar set of negative tests for basic PDF compliance (ISO 32000-1).
The PDF/A Competence Center defines the problem quite clearly in their Isartor Test Suite Manual:
"PDF/A-1 adds specific requirements on top of the PDF 1.4 file format. PDF itself is a highly complex file format, and a full test suite for PDF is beyond the scope of the Isartor test suite. The Isartor test suite assumes valid PDF 1.4, and only checks PDF/A-1 violations on top of this."
One of the biggest weaknesses in the PDF ecosystem is the tolerance that Adobe® Acrobat® Reader has for non-compliant files. While this may seem like harmless, user-friendly behavior, it has the unpleasant side effect of setting a very low bar for PDF quality for software developers worldwide.
This tolerance gave us a good place to start on the daunting task of creating a set of negative tests for PDF compliance. We decided to focus on test cases that are blatantly non-compliant but are tolerated by Acrobat Reader. With this approach, our tests are not arbitrary coverage of an enormous specification. Instead, they focus on a smaller subset of issues that we have encountered in the large collection of test PDF files gathered at www.freepdftoword.org (a free online PDF to Word converter) and www.validatepdfa.com (a free online PDF/A validator).
Our work comprises two sets of tests: those that validate PDF 1.4 compliance (useful for validating PDF/A-1) and those that validate more recent features of ISO 32000-1.
PDF is not PDF/A
These test files are not always PDF/A compliant. They are intended to exercise validation against the PDF specification itself, so PDF/A constraints do not always apply. A PDF/A validator can be improved using this test set to confirm that it first checks for PDF compliance before moving on to PDF/A compliance checks.
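Because every file in a negative test set is non-compliant by design, any file a validator accepts is a false accept. The following is a minimal sketch of such a check, assuming the test cases are delivered as a ZIP archive of PDF files. The function names, the stand-in `naive_validator`, and the demo file names are all hypothetical and do not reflect the actual test set or any real validator's API.

```python
import io
import zipfile
from typing import Callable, Iterator, List, Tuple


def iter_pdfs(archive) -> Iterator[Tuple[str, bytes]]:
    """Yield (name, data) for every .pdf entry in a ZIP archive.

    `archive` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(archive) as zf:
        for name in sorted(zf.namelist()):
            if name.lower().endswith(".pdf"):
                yield name, zf.read(name)


def find_false_accepts(archive, accepts: Callable[[bytes], bool]) -> List[str]:
    """Return names of negative test files that the validator wrongly accepted."""
    return [name for name, data in iter_pdfs(archive) if accepts(data)]


def naive_validator(data: bytes) -> bool:
    """Hypothetical stand-in validator: accepts anything with a PDF header."""
    return data.startswith(b"%PDF-")


def build_demo_archive() -> io.BytesIO:
    """Build a tiny in-memory archive with made-up file names for illustration."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("missing-header.pdf", b"not a pdf at all")
        zf.writestr("truncated-body.pdf", b"%PDF-1.4\n")
        zf.writestr("readme.txt", b"ignored: not a test case")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    # The naive validator only looks at the header, so it wrongly accepts
    # the truncated file.
    print(find_false_accepts(build_demo_archive(), naive_validator))
```

In practice, `accepts` would wrap the validator under test, and an empty result would indicate that it rejected every negative test case.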
Where are the tests?
The PDF Compliance test set is only available to PDF/D Consortium members. The test set is simply a ZIP file of PDF test case files, each of which fails ISO 32000-1 validation for a different, specific reason. This screenshot shows a sample of the tests, illustrating how the ISO 32000-1 clause and the test name are used to name each test case file: