Debugging part 1: A guide to fixing bugs in software applications with the TRAFFIC principle

Let's solve the "Mystery of the Disappearing Car Wheel" together – with help from Andreas Zeller!

Debugging a piece of software can drive the most well-tempered developer nuts – yet, sadly, it’s also inevitable, as no one writes bug-free code. Trying to find out what is wrong with faulty code and where the source of the problem is before you fix it (i.e. debugging) can be a bit of a challenge. As the popular piece of programming wisdom goes:

“Debugging is like a crime movie
where you are the detective, the victim, and the murderer”.

To be fair, correcting a buggy program back to clean-shaven, functioning code can be a curiosity-piquing investigative exercise. It can also drive you insane. But at the very least, it will definitely take time out of your day that you could have spent coding new features. Luckily, there are tools and workflows that you can use to make debugging less painful and more efficient, just like any other step in your development process.

In this blog post, we introduce you to the topic of debugging by presenting the “TRAFFIC” principle, an abstract workflow to structure your first (or thousandth) debugging endeavors. In subsequent parts of our debugging series, we’ll cover useful tools and approaches for debugging, fixing bugs with auto-generated unit tests, and finding simple reproducers.

Debugging with the TRAFFIC principle

A general debugging workflow will have you follow these steps of the TRAFFIC principle as outlined by German computer scientist Andreas Zeller, author of Why Programs Fail:

Track the problem
Reproduce
Automate
Find origins
Focus
Isolate
Correct

We will now go over each of these steps in more detail, following an example to clarify what they could look like in a real-world scenario.

Let’s assume you have written a little high score tracker for your grandma’s Scrabble club that allows a user to add the contestants’ scores after each game and updates a “club-high-score” website accordingly. Now, your grandma might call you from time to time to tell you which parts of your program the Scrabble club is particularly (dis-)satisfied with.

Track the issue

First off, you’ll need a reliable way to track and manage problem reports, so that each report can be handled on its own. After all, “the high score text is too small to read” is a very different problem from “the math is off in the final leaderboard”. Using an issue/bug tracker will generally be the smart choice. Some source code platforms like GitHub and GitLab already have a built-in issue tracker ready to go.

Reproduce the problem

Once that’s in place, you’ll want a reliable and simple way to “experience” (trigger and observe) the reported problem yourself. That could even mean reproducing the operating environment a problem was reported in. If grandpa Pete still uses Windows 95 to run your high-score tracker, that might be (part of) the reason for the problem he’s experiencing.

But you’ll also want the simplest reproducer possible. For instance, just because you know that the bug has been reported on Windows 95 doesn’t mean that you’ll need a computer from the 90s to reproduce the failure. Start out as broad as possible and narrow the “circumstances” down when trying to reproduce the malfunction.
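To make this more tangible, here is a minimal sketch of a reproducer in Python. Everything in it is an assumption made for illustration: the `HighScoreTracker` class is a hypothetical, stripped-down stand-in for the real tracker, and the reported problem is assumed to be a crash when a score is entered for a player the tracker does not know yet. The script also collects a few details about the environment, since those may turn out to matter.

```python
import platform


class HighScoreTracker:
    """Hypothetical, stripped-down stand-in for the Scrabble tracker."""

    def __init__(self):
        self.scores = {}  # player name -> accumulated points

    def add_score(self, name, points):
        # Assumed defect: presumes the player has been registered before.
        self.scores[name] += points


def describe_environment():
    """Collect the details worth attaching to a problem report."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "python": platform.python_version(),
    }


def reproduce():
    """Trigger the reported crash with the smallest input we know of."""
    tracker = HighScoreTracker()
    try:
        tracker.add_score("Pete", 42)  # "Pete" was never registered
    except KeyError as error:
        return f"reproduced: unknown player {error}"
    return "not reproduced"


if __name__ == "__main__":
    print(describe_environment())
    print(reproduce())
```

If the same script reproduces the crash on your own machine, the operating system is probably not part of the problem, and you can drop it from the list of “circumstances”.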

Automate

You’ll also want an automated way to provoke the problem. That’s when unit tests come in handy – or, in more complex cases (e.g. when the operating system does play a role), you may have to rely on more sophisticated tests. When fixing the bug, you’ll want to avoid having to spend minutes checking if the problem is still there every time you’ve delivered a potential fix. Even worse, you don’t want to hand your grandma a USB stick with the new, fixed version of the high score tracker, hoping that you really did solve one of her problems. Ideally, you’d just press a button and know instantly if the problem persists.

Find the origins

Next up, you’ll want to get to the root of the problem. By definition, a bug is a failure of the program that a user can observe. For example, if your high score tracker crashes every time someone tries to enter a new Scrabble score, that’s a clearly observable failure. The failure itself usually stems from some invalid state of the program. For example, the player who got the score might not yet exist in the tracker, but the program assumes they do. Finally, this invalid state is only reached because of some “mistake” or defect in the code. In our example, we should probably check if the player exists before attempting to assign a new score to them.

There could be tons of possible reasons for a bug, and you have yet to figure out how to get from the observable failure to the actual defect. For instance, if your tracker just crashes, that might have something to do with the score computation, accessing the web service endpoint, saving the scores to a file… You’ll need to get to the origins of the problem, identify what causes it, and understand its implications.
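A crash usually leaves a trail that helps with this step. The following sketch uses Python’s standard `traceback` module to list the functions that were active when the program failed. The tracker code itself (`HighScoreTracker`, `submit_game`) is hypothetical and assumes the defect described above: a score is added for a player who does not exist yet.

```python
import traceback


class HighScoreTracker:
    """Hypothetical, stripped-down stand-in for the Scrabble tracker."""

    def __init__(self):
        self.scores = {}  # player name -> accumulated points

    def add_score(self, name, points):
        # Assumed defect: presumes the player has been registered before.
        self.scores[name] += points


def submit_game(tracker, results):
    """Enter the results of one game, given as {player name: points}."""
    for name, points in results.items():
        tracker.add_score(name, points)


def call_path_of_failure(action):
    """Run `action` and return the function names on the path to the
    exception, outermost first. Returns an empty list if nothing fails."""
    try:
        action()
    except Exception as error:
        frames = traceback.extract_tb(error.__traceback__)
        return [frame.name for frame in frames]
    return []


path = call_path_of_failure(lambda: submit_game(HighScoreTracker(), {"Pete": 42}))
```

The last entry in `path` is the function in which the exception was raised, here `add_score`. That is where the failure surfaced, which is a good place to start looking, although the defect itself may well sit further up the call path.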

Focus

You’ll then dig into the mechanics of the issue. When analyzing a problem, it’s important to focus on what you expect the program to do vs. what it actually does, as those two can sometimes be wildly different. You’ll want a precise understanding of how the defect in the code provokes the observed failure. Maybe you expect that the database library you used for your high score tracker automatically creates all entities that don’t exist yet – but that may not really be the case, leading to the failure.
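The gap between expectation and reality can often be made visible with a tiny experiment. The sketch below is not the tracker’s actual database code. It uses two Python containers as stand-ins: a `defaultdict`, which creates missing entries on first access (what we expected), and a plain `dict`, which does not (what we assume we actually got). The helper names are made up for this example.

```python
from collections import defaultdict


def add_score(scores, name, points):
    """Add points to a player's total in the given score store."""
    scores[name] += points


def outcome_for_new_player(scores):
    """Report what actually happens when the player is not stored yet."""
    try:
        add_score(scores, "Pete", 42)
    except KeyError:
        return "failure: player is not created automatically"
    return f"ok: player created with {scores['Pete']} points"


# What we expected: a store that creates missing entries on first access.
expected = outcome_for_new_player(defaultdict(int))

# What we assume we actually have: a store that does no such thing.
actual = outcome_for_new_player({})
```

Comparing `expected` and `actual` pins the misunderstanding down to a single assumption about the store, which is exactly the kind of precise understanding this step is after.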

Isolate

Once you’ve managed to pin down the defect and what causes it to be materialized as an observable failure, you know exactly where you have to change things. At this point, you may be able to write more granular tests for the faulty component to be able to catch the problem at its roots. If you know that a reported problem for your Scrabble tracker doesn’t concern the web component, there is no need to spin up that web server every time you run your tests again.
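Here is a sketch of what that isolation might look like in Python. The split into a `ScoreBoard` (score keeping) and a `Tracker` (wiring the board to a web publisher) is a hypothetical design chosen for this example, as is the defect inside `add_score`. The point is that the check at the bottom exercises the score keeping alone and never touches the web component.

```python
class ScoreBoard:
    """Hypothetical score-keeping component, independent of the web layer."""

    def __init__(self):
        self.scores = {}  # player name -> accumulated points

    def add_score(self, name, points):
        # Assumed defect: presumes the player has been registered before.
        self.scores[name] += points


class Tracker:
    """Hypothetical application object that connects the board to the web."""

    def __init__(self, board, publish):
        self.board = board
        self.publish = publish  # callable that uploads the current scores

    def record(self, name, points):
        self.board.add_score(name, points)
        self.publish(dict(self.board.scores))


def failure_is_in_score_board():
    """Check the suspected component on its own, without any publisher."""
    board = ScoreBoard()
    try:
        board.add_score("Pete", 42)
    except KeyError:
        return True
    return False
```

Because `failure_is_in_score_board()` already returns `True` without a `Tracker` or a publisher involved, the web component can be ruled out, and the tests for this bug can stay small and fast.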

Correct

Last, you’ll attempt to either fix the problem or add a workaround to prevent the program from going into the invalid state (and therefore causing an observable failure) even if some problem occurs internally. It’s important in this phase to learn from your own mistakes and take measures to prevent similar problems in the future. For instance, if you forgot to handle an error, install a linter that warns you not to do that again.
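For the missing-player example, a correction could be as small as the following Python sketch. The `HighScoreTracker` class is again a hypothetical stand-in, and whether unknown players should be created silently or rejected with a clear message is a design decision that the Scrabble club would have to make. This version assumes that creating them is fine.

```python
class HighScoreTracker:
    """Hypothetical, stripped-down stand-in for the Scrabble tracker."""

    def __init__(self):
        self.scores = {}  # player name -> accumulated points

    def add_score(self, name, points):
        # Fix: start unknown players at 0 instead of assuming they exist.
        self.scores[name] = self.scores.get(name, 0) + points


tracker = HighScoreTracker()
tracker.add_score("Pete", 42)  # first score creates the entry
tracker.add_score("Pete", 8)   # later scores are added on top
```

After these two calls, `tracker.scores` holds a total of 50 points for Pete, and the program no longer reaches the invalid state that caused the crash.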

The practical aspects of debugging

That’s the general workflow you’ll follow with any debugging process. How all that is implemented in practice, however, is a different question. In part 2 of this debugging series, we’ll cover some tools and approaches to enhance debugging in everyday scenarios. In part 3, we’ll introduce the use of unit tests to support the debugging process through a practical example. Finally, in part 4 we will explore how you can find simple reproducers for bugs automatically.

We here at Symflower know the pain and insanity that can sprout from a long debugging session. But we’re working hard to provide tools that make your everyday software development workflows, including debugging, easier and less painful. Try Symflower next time you’re diving into your, or someone else’s, code: get.symflower.com.

2023-02-22