Why Programs Fail - A Guide To Systematic Debugging

Paperback · 423 pages
ISBN 978-0-12-374515-6 [US]
ISBN 978-3-89864-620-8 [DE]

All Slides

PDF · Keynote · PPT
About the slides

About the Author

Preface

1 How Failures Come to Be

Your program fails. How can this be? The programmer creates a defect in the code which, when executed, causes an infection in the program state, which later becomes visible as a failure. To find the defect, one must reason backwards, starting with the failure. This chapter defines the essential concepts of debugging and hints at the techniques discussed later on, hopefully whetting your appetite for the remainder of this book.
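
The chain from defect to infection to failure can be illustrated with a small Python sketch. The function `largest` and its inputs are invented for this illustration and do not come from the book. The defect is executed on every call, but it leads to a visible failure only for some inputs.

```python
def largest(values):
    """Return the largest element of a non-empty list (contains a defect)."""
    result = 0              # defect: should start from values[0]
    for value in values:
        if value > result:
            result = value  # for positive inputs, the infected value is overwritten
    return result


# The defect is executed, but the infection is overwritten: no failure.
print(largest([3, 7, 5]))      # 7

# The infected state (result == 0) survives until the end: a visible failure.
print(largest([-3, -7, -5]))   # 0, although -3 is expected
```

Reasoning backwards from the wrong output `0` leads to the state of `result`, and from there to the initialization that introduced it.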

Slides

PDF · Keynote · PPT
1.1 My Program Does Not Work!
1.2 From Defects to Failures
1.3 Lost in Time and Space
1.4 From Failures to Fixes
1.5 Automated Debugging Techniques
1.6 Bugs, Faults, or Defects?
1.7 Concepts
1.8 Tools
1.9 Further Reading
1.10 Exercises

2 Tracking Problems

This chapter deals with the issue of how to manage problems as reported by users—how to track and manage problem reports, how to organize the debugging process, and how to keep track of multiple versions. All this is the basic framework in which debugging takes place.

Slides

PDF · Keynote · PPT
2.1 Oh! All These Problems
2.2 Reporting Problems
2.3 Managing Problems
2.4 Classifying Problems
2.5 Processing Problems
2.6 Managing Problem Tracking
2.7 Requirements as Problems
2.8 Managing Duplicates
2.9 Relating Problems and Fixes
2.10 Relating Problems and Tests
2.11 Concepts
2.12 Tools
2.13 Further Reading
2.14 Exercises

3 Making Programs Fail

Before a program can be debugged, we must set it up such that it can be tested—that is, executed with the intent to make it fail. In this chapter, we review basic testing techniques, with a special focus on automation and isolation.

Slides

PDF · Keynote · PPT
3.1 Testing for Debugging
3.2 Controlling the Program
3.3 Testing at the Presentation Layer
3.4 Testing at the Functionality Layer
3.5 Testing at the Unit Layer
3.6 Isolating Units
3.7 Designing for Debugging
3.8 Preventing Unknown Problems
3.9 Concepts
3.10 Tools
3.11 Further Reading
3.12 Exercises

4 Reproducing Problems

The first step in debugging is to reproduce the problem in question—that is, to create a test case that causes the program to fail in the specified way. The first reason is to bring it under control, such that it can be observed. The second reason is to verify the success of the fix. This chapter discusses typical strategies for reproducing the operating environment, the history, and the problem symptoms.

Slides

PDF · Keynote · PPT
4.1 The First Task in Debugging
4.2 Reproducing the Problem Environment
4.3 Reproducing Program Execution
4.4 Reproducing System Interaction
4.5 Focusing on Units
4.6 Concepts
4.7 Tools
4.8 Further Reading
4.9 Exercises

5 Simplifying Problems

Once we have reproduced a problem, we must simplify it—that is, we must find out which circumstances are not relevant for the problem and can thus be omitted. This process results in a test case that contains only the relevant circumstances. In the best case, a simplified test case report immediately pinpoints the defect. We introduce Delta Debugging, an automated debugging method that simplifies test cases automatically.
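
As a rough illustration of automatic simplification, the following Python sketch repeatedly removes chunks of a failing input and keeps every reduction that still fails. It is a simplified variant written for this page, not the algorithm as given in the book: it only tries removing chunks, and it assumes a hypothetical predicate `fails` that returns `True` whenever the failure occurs.

```python
def ddmin(failing_input, fails):
    """Shrink failing_input while fails() keeps returning True."""
    assert fails(failing_input)
    n = 2  # granularity: number of chunks the input is split into
    while len(failing_input) >= 2:
        chunk = max(len(failing_input) // n, 1)
        reduced = False
        for start in range(0, len(failing_input), chunk):
            # Remove one chunk and test what remains.
            candidate = failing_input[:start] + failing_input[start + chunk:]
            if fails(candidate):
                failing_input = candidate
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(failing_input):
                break  # no single element can be removed any more
            n = min(n * 2, len(failing_input))  # try smaller chunks
    return failing_input


# Hypothetical failure: the program crashes on any input containing both parentheses.
def fails(text):
    return "(" in text and ")" in text


print(ddmin("x = (1 + 2) * 3", fails))   # ()
```

In this example, the result contains only the two characters that are relevant for the failure; everything else has been shown to be irrelevant by a passing or failing test.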

Slides

PDF · Keynote · PPT
5.1 Simplifying the Problem
5.2 The Gecko BugAThon
5.3 Manual Simplification
5.4 Automatic Simplification
5.5 A Simplification Algorithm
5.6 Simplifying User Interaction
5.7 Random Input Simplified
5.8 Simplifying Faster
5.9 Concepts
5.10 Tools
5.11 Further Reading
5.12 Exercises

6 Scientific Debugging

Once we have reproduced and simplified the problem, we must understand how the failure came to be. The process of obtaining a theory that explains some aspect of the universe is known as the Scientific Method; it is also the appropriate process for obtaining problem diagnostics. We introduce the basic techniques for creating and verifying hypotheses, for making experiments, for conducting the process in a systematic fashion—and for making the debugging process explicit.

Slides

PDF · Keynote · PPT
6.1 How to Become a Debugging Guru
6.2 The Scientific Method
6.3 Applying the Scientific Method
6.4 Explicit Debugging
6.5 Keeping a Logbook
6.6 Debugging Quick-and-Dirty
6.7 Algorithmic Debugging
6.8 Deriving a Hypothesis
6.9 Reasoning About Programs
6.10 Concepts
6.11 Further Reading
6.12 Exercises

7 Deducing Errors

In this chapter, we begin exploring the techniques for creating hypotheses introduced in Chapter 6. We start with deduction techniques—reasoning from the abstract program code to the concrete program run. In particular, we present program slicing, an automated means to determine possible origins of a variable value. Using program slicing, one can effectively narrow down the number of possible infection sites.
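
The idea of a backward slice can be sketched for the simplest possible case: straight-line code with data dependences only. The program representation below (line number, variable defined, variables used) is an assumption made for this sketch; real slicers also have to follow control dependences, which are ignored here.

```python
# Each statement: (line number, variable defined, variables used).
PROGRAM = [
    (1, "a", set()),    # a = read()
    (2, "b", set()),    # b = read()
    (3, "c", {"a"}),    # c = a * 2
    (4, "d", {"b"}),    # d = b + 1
    (5, "e", {"c"}),    # e = c - 3
]


def backward_slice(program, variable):
    """Return the lines that may influence the final value of variable."""
    relevant = {variable}
    lines = []
    for line, defined, used in reversed(program):
        if defined in relevant:
            lines.append(line)
            relevant.discard(defined)  # this definition is the origin of the value
            relevant |= used           # and now its operands become relevant
    return sorted(lines)


print(backward_slice(PROGRAM, "e"))   # [1, 3, 5]
```

If `e` holds an infected value, only lines 1, 3, and 5 need to be examined; lines 2 and 4 cannot have contributed to it.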

Slides

PDF · Keynote · PPT
7.1 Isolating Value Origins
7.2 Understanding Control Flow
7.3 Tracking Dependences
7.4 Slicing Programs
7.5 Deducing Code Smells
7.6 Limits of Static Analysis
7.7 Concepts
7.8 Tools
7.9 Further Reading
7.10 Exercises

8 Observing Facts

While deduction techniques do not take concrete runs into account, observation determines facts about what has happened in a concrete run. In this chapter, we look under the hood of the actual program execution and introduce widespread techniques to examine program executions and program states. These techniques include classical logging, interactive debuggers, and post-mortem debugging, as well as eye-opening visualization and summarization techniques.

Slides

PDF · Keynote · PPT
8.1 Observing State
8.2 Logging Execution
8.3 Using Debuggers
8.4 Querying Events
8.5 Visualizing State
8.6 Concepts
8.7 Tools
8.8 Further Reading
8.9 Exercises

9 Tracking Origins

Once we have observed an infection during debugging, we need to find out its origins. We discuss omniscient debugging, a technique that records an entire execution history such that the user can explore arbitrary moments in time without ever restarting the program. Furthermore, we explore dynamic slicing, a technique that tracks the origins of specific values.

Slides

PDF · Keynote · PPT
9.1 Reasoning Backwards
9.2 Exploring Execution History
9.3 Dynamic Slicing
9.4 Leveraging Origins
9.5 Tracking Down Infections
9.6 Concepts
9.7 Tools
9.8 Further Reading
9.9 Exercises

10 Asserting Expectations

Observation alone is not enough for debugging—one must compare the observed facts with the expected program behavior. In this chapter, we discuss how to automate such comparisons, using well-known assertion techniques; we also show how to ensure the sanity of important system components such as memory.
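
A minimal sketch of such automated comparisons, using Python's built-in `assert` statement. The `Account` class is a hypothetical example chosen for this page; it checks a precondition, a postcondition, and a class invariant. Note that Python skips assertions when run with the `-O` option.

```python
class Account:
    """Hypothetical bank account whose balance must never become negative."""

    def __init__(self, balance=0):
        self.balance = balance
        self._check_invariant()

    def _check_invariant(self):
        # Class invariant: must hold after every public method.
        assert self.balance >= 0, f"negative balance: {self.balance}"

    def withdraw(self, amount):
        assert amount > 0, "precondition: amount must be positive"
        old_balance = self.balance
        self.balance -= amount
        assert self.balance == old_balance - amount  # postcondition
        self._check_invariant()


account = Account(100)
account.withdraw(30)          # fine, balance is now 70
try:
    account.withdraw(500)     # infection: balance becomes -430
except AssertionError as error:
    print("assertion failed:", error)
```

The failing assertion reports the infection at the moment it arises, rather than much later when the negative balance becomes visible as a failure.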

Slides

PDF · Keynote · PPT
10.1 Automating Observation
10.2 Basic Assertions
10.3 Asserting Invariants
10.4 Asserting Correctness
10.5 Assertions as Specifications
10.6 From Assertions to Verification
10.7 Reference Runs
10.8 System Assertions
10.9 Checking Production Code
10.10 Concepts
10.11 Tools
10.12 Further Reading
10.13 Exercises

11 Detecting Anomalies

While a single program run can already tell you quite a lot, having multiple runs to compare offers several opportunities to locate commonalities and anomalies, and anomalies frequently help to locate defects. In this chapter, we discuss how to detect anomalies in code coverage and anomalies in data accesses. We also show how to infer invariants from multiple test runs automatically, in order to flag later invariant violations. All these anomalies are good candidates for infection sites.
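
Inferring invariants from multiple runs can be sketched as follows. The sketch assumes that each run has already been reduced to a dictionary of observed variable values (the names `size` and `index` are made up), and it only infers value ranges; actual invariant detectors consider many more kinds of properties.

```python
def infer_ranges(passing_runs):
    """Infer a (min, max) range per variable from the values seen in passing runs."""
    ranges = {}
    for run in passing_runs:
        for name, value in run.items():
            low, high = ranges.get(name, (value, value))
            ranges[name] = (min(low, value), max(high, value))
    return ranges


def violations(ranges, run):
    """Return the variables whose value in run lies outside the inferred range."""
    return sorted(
        name
        for name, value in run.items()
        if name in ranges and not ranges[name][0] <= value <= ranges[name][1]
    )


passing_runs = [{"size": 3, "index": 0}, {"size": 5, "index": 4}]
failing_run = {"size": 4, "index": 7}

ranges = infer_ranges(passing_runs)
print(ranges)                           # {'size': (3, 5), 'index': (0, 4)}
print(violations(ranges, failing_run))  # ['index']
```

A flagged variable is an anomaly, not a proven cause: it marks a place worth looking at first.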

Slides

PDF · Keynote · PPT
11.1 Capturing Normal Behavior
11.2 Comparing Coverage
11.3 Statistical Debugging
11.4 Collecting Data in the Field
11.5 Dynamic Invariants
11.6 Invariants on the Fly
11.7 From Anomalies to Defects
11.8 Concepts
11.9 Tools
11.10 Further Reading
11.11 Exercises

12 Causes and Effects

Deduction, observation, and induction are all good at finding potential defects. However, none of these techniques alone is sufficient to determine a failure cause. How does one identify a cause? How does one isolate not only "a" cause, but "the" actual cause of a failure? This chapter lays the groundwork on how to find failure causes systematically and automatically.

Slides

PDF · Keynote · PPT
12.1 Causes and Alternate Worlds
12.2 Verifying Causes
12.3 Causality in Practice
12.4 Finding Actual Causes
12.5 Narrowing Down Causes
12.6 A Narrowing Example
12.7 The Common Context
12.8 Causes in Debugging
12.9 Concepts
12.10 Further Reading
12.11 Exercises

13 Isolating Failure Causes

This is the chapter that automates most of debugging. We show how Delta Debugging isolates failure causes automatically—in the program input, in the program's thread schedule, and in the program code. In the best case, the reported causes immediately pinpoint the defect.
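
Isolation differs from simplification in that it narrows down the difference between a passing and a failing configuration. The sketch below does this for a set of code changes, under two simplifying assumptions that are not part of the original text: every test either passes or fails (no unresolved outcomes), and the hypothetical predicate `fails` can be applied to any subset of the changes. Under these assumptions, the search reduces to repeatedly halving the difference.

```python
def isolate(passing, failing, fails):
    """Narrow the difference between a passing and a failing set to one element."""
    passing, failing = set(passing), set(failing)
    assert passing <= failing
    assert not fails(passing) and fails(failing)
    while len(failing - passing) > 1:
        delta = sorted(failing - passing)
        half = set(delta[: len(delta) // 2])
        candidate = passing | half
        if fails(candidate):
            failing = candidate   # the failure needs nothing beyond the candidate
        else:
            passing = candidate   # the added half alone does not cause the failure
    return passing, failing


# Hypothetical scenario: eight changes; the failure occurs once 3 and 6 are both applied.
def fails(changes):
    return {3, 6} <= changes


passing, failing = isolate(set(), range(1, 9), fails)
print(failing - passing)   # {6}
```

The result is a pair of configurations that differ in a single change: with changes 1 to 5 applied the test passes, and adding change 6 makes it fail.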

Slides

PDF · Keynote · PPT
13.1 Isolating Causes Automatically
13.2 Isolating versus Simplifying
13.3 An Isolation Algorithm
13.4 Implementing Isolation
13.5 Isolating Failure-inducing Input
13.6 Isolating Failure-inducing Schedules
13.7 Isolating Failure-inducing Changes
13.8 Problems and Limitations
13.9 Concepts
13.10 Tools
13.11 Further Reading
13.12 Exercises

14 Isolating Cause-Effect Chains

This chapter presents a way to narrow down failure causes even further. By extracting and comparing program states, Delta Debugging automatically isolates the variables and values that cause the failure, resulting in a cause-effect chain of the failure: "variable x was 42; therefore, p became null; and thus the program failed".

Slides

PDF · Keynote · PPT
14.1 Useless Causes
14.2 Capturing Program States
14.3 Comparing Program States
14.4 Isolating Relevant Program States
14.5 Isolating Cause-Effect Chains
14.6 Isolating Failure-inducing Code
14.7 Issues and Risks
14.8 Concepts
14.9 Tools
14.10 Further Reading
14.11 Exercises

15 Fixing the Defect

Once we have understood the failure's cause-effect chain, we know how the failure came to be. But still, we must find the place where the infection begins—that is, the actual location of the defect. In this chapter, we discuss how to narrow down the defect systematically—and, having found the defect, how to fix it.

Slides

PDF · Keynote · PPT
15.1 Locating the Defect
15.2 Focusing on the Most Likely Errors
15.3 Validating the Defect
15.4 Correcting the Defect
15.5 Workarounds
15.6 Learning from Mistakes
15.7 Concepts
15.8 Further Reading
15.9 Exercises

16 Learning from Mistakes

At the end of each debugging session, one wonders how the defect could have come to be in the first place. We discuss techniques to collect, aggregate, and locate defect information; techniques to predict where the next defects will be; and what to do to prevent future errors.

Slides

PDF · Keynote · PPT
16.1 Where the Defects are
16.2 Mining the Past
16.3 Where Defects come from
16.4 Errors during Specification
16.5 Errors during Programming
16.6 Errors during Quality Assurance
16.7 Predicting Problems
16.8 Fixing the Process
16.9 Concepts
16.10 Further Reading
16.11 Exercises

Appendix: Formal Definitions

A.1 Delta Debugging
A.2 Memory Graphs
A.3 Cause-Effect Chains

Glossary

Bibliography

Index

Get the book at Amazon.com · Amazon.de
Comments? Write to Andreas Zeller <zeller@whyprogramsfail.com>.
