Debugging part 2: Tools and approaches for debugging

An illustration about tools and approaches for debugging software.

In part 2 of our post series on debugging, we’re covering the practical aspect, providing some useful workflows and tools that you can use to put the TRAFFIC principle into action when debugging your software applications.

Read part 1 of this debugging series for an introduction to the TRAFFIC principle which outlines a general workflow for fixing software bugs.
In debugging part 3, we cover how unit tests are helpful in debugging.

General tips for efficient debugging

Let’s start with some general pieces of advice to help you streamline debugging so that the experience is not as painful:

Change only one thing at a time. Never multitask during debugging. Otherwise, you’ll find that it’s quite easy to get confused as you try to investigate what’s going wrong.
Work upside down. Try to figure out where the bug is not: rule out the “healthy” parts of your code first. This way, you can better focus your debugging efforts without wasting time re-checking code that works.
Don’t trust anything - not even yourself. Expect bugs to pop up anywhere in your code, and validate all the assumptions you have about your program before you step into debugging. For example, if you assume a problem cannot be in the setup code but has to be further down the execution track, check that assumption. Blindly trusting parts of your code can be a huge mistake, as you’ll waste valuable time looking for the bug in places where it isn’t.
Involve a third party. Explain the problem to someone else to force yourself to systematically think about it. Yes, rubber duck debugging may actually work.
Take a break. Looking for a bug for extended periods of time can get mentally exhausting. Being too focused and tired at the same time isn’t likely to help solve the problem. Take some time out to let things settle, and you may find that the bug is easier to locate and fix with a fresh mind.

Static (“information-based”) debugging

Sometimes you cannot observe your running program and can only debug using information obtained during or after its execution. Most of the time this happens when users report bugs. This section covers techniques that help you debug when you cannot observe a live version of your program.

Log files / tracing

It’s often a good idea to “capture” information about a program during its execution, be it via manual “print” statements, specialized tracing tools, or logging frameworks. This information is similar to what we would get from a dedicated debugger (which parts of the program were executed, which values were dealt with). But in some cases, we’re just not able to run the program with a debugger, e.g. on a user’s machine. Setting up your code with “good logging” for such scenarios would be a blog post on its own, but we’ll mention two tips here.

First, “structured logging” helps you organize the logging information better, which is great for bigger, more complex applications. You build up your log entries as a collection of ordered data rather than mere text, making it easier to search and aggregate them. For example, if you standardize the formatting of your logs and always print which component is responsible for the message, it’s easy to build a regular expression that filters the logs for a certain component only.

Second, leveraging different “logging levels” can help you quickly change the “granularity” of the data you get. During normal execution, it might be enough to record only high-level information such as “web server started”, but during debugging it can be very helpful to log every request that the mentioned web server receives.

Here’s an example of what such a log file would look like:

03/22 08:51:01 INFO   :..settcpimage: Associate with TCP/IP image name = TCPCS
03/22 08:51:02 INFO   :..reg_process: registering process with the system
03/22 08:51:02 TRACE  :..reg_process: attempt OS/390 registration

Source: https://www.ibm.com/docs/en/zos/2.4.0?topic=problems-example-log-file
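To make the two tips above a bit more concrete, here is a minimal sketch using Python’s standard logging module. The JsonFormatter class, the build_logger helper, and the “webserver” component name are our own illustrative assumptions, not part of any particular framework. Each entry is written as one JSON object per line, and the logging level decides whether the per-request messages are recorded at all.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: renders each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "component": record.name,       # logger name doubles as component name
            "message": record.getMessage(),
        })


def build_logger(component: str, stream, level: int = logging.INFO) -> logging.Logger:
    """Create a logger for one component that writes structured lines to `stream`."""
    logger = logging.getLogger(component)
    logger.handlers.clear()                 # avoid duplicate handlers on repeated calls
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger


def demo() -> list[dict]:
    stream = io.StringIO()                  # stands in for a real log file
    log = build_logger("webserver", stream, logging.INFO)
    log.info("web server started")
    log.debug("GET /index.html")            # dropped: DEBUG is below INFO
    log.setLevel(logging.DEBUG)             # switch to finer granularity for debugging
    log.debug("GET /about.html")            # now recorded
    return [json.loads(line) for line in stream.getvalue().splitlines()]
```

Because every line is a self-describing record, filtering for one component becomes a simple comparison on the “component” field instead of a hand-written regular expression.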

Post-mortem debugging

Log files, as mentioned above, already serve as a first kind of “post-mortem” debugging approach, used after the program has crashed. Depending on the operating system and the programming language used, a crash can produce additional artifacts called “core” or “crash” dumps. These are essentially snapshots of the program right before it died and can therefore contain valuable information to help find the cause. There are more advanced resources available on how to work with core dumps in Java and Go.

Finally, an extremely common type of “post-mortem” debugging information (one that you’re most likely already familiar with) is the “stack trace” you see in your IDE or terminal when an exception occurs. It contains the stack of all the functions that your code was in when the exception was encountered, which immediately points you to the location of the problem. Some programming languages such as Go even tell you which arguments were used to call said functions. Here’s an example of what a stack trace looks like:

java.lang.Exception: Stack trace
    at java.base/java.lang.Thread.dumpStack(Thread.java:1380)
    at com.example.myJavaProject.Example.f4(Example.java:25)
    at com.example.myJavaProject.Example.f3(Example.java:20)
    at com.example.myJavaProject.Example.f2(Example.java:15)
    at com.example.myJavaProject.Example.f1(Example.java:10)
    at com.example.myJavaProject.Example.main(Example.java:6)

Source: https://rollbar.com/blog/java-stack-trace/
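If you want to keep a stack trace for later analysis (for example, by writing it to a log file), most languages let you capture it as text. Here is a small Python sketch using the standard traceback module; the functions f1 to f3 and capture_trace are made up purely for illustration.

```python
import traceback


def f3() -> None:
    raise ValueError("something went wrong")


def f2() -> None:
    f3()


def f1() -> None:
    f2()


def capture_trace() -> str:
    """Run f1 and return the stack trace of the resulting exception as text."""
    try:
        f1()
    except ValueError:
        # format_exc() returns the same text the interpreter would print.
        return traceback.format_exc()
    return ""
```

Note that Python lists the frames with the most recent call last, so f1 appears before f3, whereas the Java trace above starts with the innermost call.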

Dynamic (“interactive”) debugging

The most powerful debugging tools let you inspect your code interactively while your program is running, giving you the ability to see exactly what it is doing under the hood at runtime. The good news is that most of these tools are readily available no matter what editor or environment you’re already using.

IDE debugger

Almost every major IDE for each popular programming language features an integrated debugger. It might require some configuration, like telling the debugger exactly which program you want to inspect and how to compile it, but there are plenty of tutorials out there for nearly every imaginable setup. For example, here are the IntelliJ IDEA and Visual Studio Code guides to get you started. Once you’re set up, the debugger enables you to pause program execution at so-called “breakpoints”, walk through your code line by line, inspect the values of all your global and local variables, and much more. The only limitation is that using a debugger for multi-threaded code can be trickier, or in some rare cases even impossible to set up.

A screenshot of the debugger integrated in VS Code

Source: https://code.visualstudio.com/docs/editor/debugging

Record and replay / time travel debugging

Some tools such as undo.io go even further than a normal debugger. They let you record a program run and reconstruct or “replay” it afterwards. You can even “go back in time”, essentially rewinding the program execution to earlier moments. This can be very useful for recording test failures directly in CI when they occur.

Remote debugging

Remote debugging refers to the scenario where a debugger is used to inspect a program that is being executed on a separate machine. While this might sound like a very special and sparsely applicable scenario, with modern setups using virtual machines and containers, this can come in very handy. In fact, when using something like Visual Studio Code Remote Development to develop on a separate host machine or within Windows Subsystem for Linux, the integrated debugger of Visual Studio Code just magically functions as a remote debugger out of the box.

Source: https://code.visualstudio.com/docs/remote/remote-overview

Advanced debugging tips to save even more time

There are some smart methods that might save you a good chunk of time (or even head-scratching) during debugging. You can, of course, decide for yourself as to what extent you want to utilize these.

“Wolf fence” debugging

Computer scientist Edward J. Gauss proposed this technique in 1982. If you know binary search (the thing you intuitively do when using a phone book), you also know the “Wolf Fence” algorithm, which just transfers this principle to debugging. For example, when you’re looking for the cause of a problem in your code, check whether all values are correct at the midpoint between the program entry and the place where you can observe the error. If something’s off, the defect must be before that point, otherwise after it. Repeat this check on the remaining half until you find the problem. The same idea can also be applied to version control, e.g. if you want to figure out which commit introduced some error.

Even better, the git bisect utility does most of the hard work for you, walking you through the commit history in a “binary search” fashion. You just need to check if the problem is observable at a given commit or not, for example by quickly running a unit test that triggers the potential problem. This information is then fed back into the git bisect algorithm, which will ultimately compute the exact commit which introduced the problem.
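The binary search behind this idea fits in a few lines. The following Python sketch is only an illustration of the principle, not how git bisect is actually implemented (real histories are not always a simple line of commits). The function name first_bad_commit and the is_bad callback are hypothetical, and we assume the first commit is known to be good, the last one is known to be bad, and the problem never disappears again once introduced.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit that shows the problem.

    Assumptions: `commits` is ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and every commit after the first bad one is bad too.
    """
    good, bad = 0, len(commits) - 1
    while bad - good > 1:
        middle = (good + bad) // 2          # put the "fence" in the middle
        if is_bad(commits[middle]):
            bad = middle                    # the wolf is in the older half
        else:
            good = middle                   # the wolf is in the newer half
    return commits[bad]
```

In practice, is_bad would check out the commit and run the unit test that triggers the problem. Each check halves the remaining range, so roughly a thousand commits need only about ten checks.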

Delta debugging

Assume you have a bug report consisting of a gigantic, ten-thousand-line input, e.g. a JSON file, that leads to a crash when processed by your program. Figuring out precisely which part of it triggers the problem is a nightmare! A delta debugging tool takes this input and tries to minimize it, e.g. by repeatedly removing parts of it and checking whether the crash still occurs. This might take a lot of compute power but is definitely less tedious than doing the same thing by hand. In a more general sense, delta debugging tries to (systematically) narrow down the circumstances that lead to a problem. Some open-source tools in this category are delta for input files, perses for reducing source code directly, and tavor, a generic fuzzer and delta debugger.
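To give a feel for how such a minimizer works, here is a simplified Python sketch. It is not the algorithm used by any of the tools named above; the minimize function and the still_fails callback are hypothetical names, and we assume that still_fails is deterministic and returns True whenever the given input still triggers the crash.

```python
from typing import Callable, Sequence


def minimize(data: Sequence, still_fails: Callable[[list], bool]) -> list:
    """Shrink `data` while `still_fails` keeps returning True.

    Simplified sketch: remove one chunk at a time, keep the smaller input
    if it still fails, and move to smaller chunks when nothing can be removed.
    """
    current = list(data)
    chunks = 2
    while len(current) >= 2:
        size = max(1, len(current) // chunks)
        reduced = False
        for start in range(0, len(current), size):
            candidate = current[:start] + current[start + size:]
            if candidate and still_fails(candidate):
                current = candidate             # chunk was irrelevant: drop it
                chunks = max(chunks - 1, 2)
                reduced = True
                break
        if not reduced:
            if size == 1:
                break                           # no single element can be removed
            chunks = min(chunks * 2, len(current))  # retry with finer chunks
    return current
```

For a file, data could be its list of lines and still_fails could run your program on the candidate lines and report whether it crashes. The result is minimal in the sense that removing any single remaining element makes the failure disappear.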

Disclaimer: Since “Wolf Fence” debugging and delta debugging represent very similar concepts, these terms are often used interchangeably. We don’t want to dictate here which one is right or wrong, or which one came first – that’s up to the reader’s judgment.

Assertions

This one is rather simple: if you’re deep into a debugging session already, regularly add “assertion” statements to ensure whatever you assume about the program and the handled values is correct. Whenever things change as you are fixing the bug, you will be notified instantly which crucial parts you might have broken and where. And once you’re done, you can likely clean up quickly with the help of your version control.
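As a small illustration, here is what such assertions could look like in Python. The apply_discount function is a made-up example, not taken from any real code base; the assertions simply write down what we assume about its inputs and its result.

```python
def apply_discount(price: float, percent: float) -> float:
    """Return `price` reduced by `percent` percent (hypothetical example)."""
    # Assumptions about the inputs, made explicit while debugging.
    assert price >= 0, f"price must not be negative, got {price}"
    assert 0 <= percent <= 100, f"percent must be within 0..100, got {percent}"

    discounted = price * (1 - percent / 100)

    # Assumption about the result: a discount never increases the price.
    assert 0 <= discounted <= price, f"unexpected result {discounted}"
    return discounted
```

If a later change breaks one of these assumptions, the program stops with an AssertionError at exactly that line. Keep in mind that Python skips assert statements when run with the -O flag, so they are a debugging aid rather than a replacement for input validation.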

2023-04-11