ISBN 978-0-12-374515-6 [US]
ISBN 978-3-89864-620-8 [DE]
Project 1: Simplifying Input
In this project, you write a simple tool that simplifies failure-inducing input. The idea is to use delta debugging to find a minimal input that causes an XML parser to fail.
This project requires you to program in Python. Learning Python is not hard, though; and even if you include the time it takes you to learn Python, you will still complete your work faster than in most other languages.
Read this first
The failure
XMLProc is a small XML parser written in Python. It has a small defect. To exhibit the defect, follow these steps:
Your task
Use Delta Debugging to simplify the failure-inducing input. Proceed in eight steps:
Step 1: Write a testing function. Start with a Python program that invokes the parser, as described above, and assesses the result. To avoid complications, you may wish to conduct this work in the
To invoke the parser, you have two options:
Step 2: Write a splitting function. You can start with the split() function as described in the book (Figure 5.9).
In the long term, you may want to split the input along token delimiters, or even use syntactic simplification (Section 5.8.3).
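As a starting point, a splitting function might look like the sketch below. It is written from scratch for illustration and is not necessarily identical to Figure 5.9; it divides a list into n consecutive subsets of nearly equal size.

```python
def split(items, n):
    """Split `items` into `n` consecutive sublists of nearly equal size.

    All sublists are non-empty as long as n <= len(items).
    """
    subsets = []
    start = 0
    for i in range(n):
        # Distribute the remaining elements evenly over the remaining subsets.
        end = start + (len(items) - start) // (n - i)
        subsets.append(items[start:end])
        start = end
    return subsets


print(split(list("abcdefgh"), 3))
# prints [['a', 'b'], ['c', 'd', 'e'], ['f', 'g', 'h']]
```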
Step 3: Attach Delta Debugging. To complete Delta Debugging, you need an implementation of the listminus() function as described in the book (Figure 5.10). The listsets module gives you exactly this.
Finally, you need the ddmin() function from Figure 5.7. This ddmin module even comes with a self-contained test function.
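To see how the pieces fit together, here is a compact, self-contained sketch. It is not the code from the listsets or ddmin modules and may differ from Figures 5.7 and 5.10: this simplified variant only tests complements, and `toy_test` is a made-up test that fails whenever the input contains both an "a" and a "z".

```python
PASS, FAIL, UNRESOLVED = "PASS", "FAIL", "UNRESOLVED"


def listminus(c1, c2):
    """Return the elements of c1 that are not in c2, preserving order."""
    excluded = set(c2)
    return [x for x in c1 if x not in excluded]


def split(items, n):
    """Split `items` into `n` consecutive sublists of nearly equal size."""
    subsets, start = [], 0
    for i in range(n):
        end = start + (len(items) - start) // (n - i)
        subsets.append(items[start:end])
        start = end
    return subsets


def ddmin(circumstances, test):
    """Return a sublist that still fails but from which no single
    element can be removed without losing the failure."""
    assert test(circumstances) == FAIL
    n = 2
    while len(circumstances) >= 2:
        reduced = False
        for subset in split(circumstances, n):
            complement = listminus(circumstances, subset)
            if test(complement) == FAIL:
                # The failure persists without `subset`: drop it.
                circumstances = complement
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(circumstances):
                break  # finest granularity reached, nothing removable
            n = min(n * 2, len(circumstances))
    return circumstances


def toy_test(config):
    """Hypothetical test: fails if both 'a' and 'z' are present."""
    text = "".join(ch for _, ch in config)
    return FAIL if "a" in text and "z" in text else PASS


result = ddmin(list(enumerate("xaybzc")), toy_test)
print("".join(ch for _, ch in result))  # prints az
```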
Step 4: Choose a representation. How do you represent the configuration that is to be minimized?
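One possible answer, offered only as a suggestion: represent the input as a list of (position, character) pairs. Each element is then unique, so a set-based list difference removes exactly the intended characters even when the same character occurs several times. The helper names `to_config` and `from_config` are hypothetical.

```python
def to_config(text):
    """Turn a string into a list of (position, character) pairs."""
    return list(enumerate(text))


def from_config(config):
    """Rebuild the input string from (position, character) pairs."""
    return "".join(ch for _, ch in config)


config = to_config("<a><a>")
# Dropping the first three pairs leaves the second tag untouched,
# although it consists of the same characters as the first.
print(from_config(config[3:]))  # prints <a>
```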
Step 5: Run it. What is the simplified failure-inducing input in
Step 6: Document it. Be sure that every function has a docstring that describes its purpose.
Have a README file that describes how to invoke the simplification program.
Step 7 (optional): Make it more general. Convert your testing function into Python's unittest framework.
Generalize the delta debugging function such that it can be parameterized with an arbitrary unit test and an arbitrary input—that is, it becomes independent from XMLProc and can easily be adapted to other input minimization tasks. (As examples of such tasks, see the paper Simplifying and Isolating Failure-Inducing Input.) Eventually, you will have a general stand-alone framework; to have it solve the XMLProc problem, instantiate it with the XMLProc unit test and the failing input.
Step 8 (optional): Improve efficiency. In the book, Section 5.8.3 sketches how to implement syntactic simplification. Using a Python XML parser such as Expat, split the input along the XML structure to narrow down the failure cause more rapidly. Validate the improvement by counting the number of tests and comparing it with the character-based version.
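A rough sketch of both ideas follows, using the standard `xml.parsers.expat` module. It assumes the failing input is well-formed XML given as bytes, so that Expat can parse it and its byte offsets line up with the data. `split_at_markup` and `CountingTest` are hypothetical helpers, not part of any provided module.

```python
import xml.parsers.expat


def split_at_markup(data):
    """Split the bytes `data` into chunks at XML event boundaries."""
    offsets = set()
    parser = xml.parsers.expat.ParserCreate()

    def mark(*_args):
        # Inside a handler, CurrentByteIndex is the offset at which
        # the current parse event starts.
        offsets.add(parser.CurrentByteIndex)

    parser.StartElementHandler = mark
    parser.EndElementHandler = mark
    parser.CharacterDataHandler = mark
    parser.Parse(data, True)

    cuts = sorted(offsets | {0, len(data)})
    return [data[a:b] for a, b in zip(cuts, cuts[1:])]


class CountingTest:
    """Wrap a testing function and count how often it is called."""

    def __init__(self, test):
        self.test = test
        self.count = 0

    def __call__(self, config):
        self.count += 1
        return self.test(config)


chunks = split_at_markup(b"<doc><item>text</item></doc>")
print(chunks)

counted = CountingTest(lambda config: "PASS")
counted(chunks)
counted(chunks[:1])
print(counted.count)  # prints 2
```

Running delta debugging over the chunks instead of single characters, each wrapped in the same `CountingTest`, gives two test counts to compare.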