Find and fix bugs with generated unit tests

Sometimes when you fix a bug you create new ones.

Nasty software bugs are part of every developer’s daily life. Nobody writes bug-free code. The more complex an application gets, the easier it becomes to introduce problems. While finding and fixing bugs can be a rewarding experience, there are problems that cost us a lot of time and also our sanity. In this blog post we will explore how Symflower can help us find, reproduce and fix bugs to ease our daily software developer struggles.

Scenario

For this little debugging session we will look at a very simple hash function. Hash functions are used in many computer science domains, from security and cryptography to efficient algorithms and compression. A hash function maps a set of input values to output values according to specific requirements that depend on the context in which it is used. For example, in cryptography the output must completely obscure the input values and the hash function must be hard to reverse.


You can follow along with the example by installing the free version of Symflower for your editor or console. The examples are written in Java but are easily adaptable to other languages.


The following snippet shows a very simple hashing function babyHash that receives an integer and returns a hashed integer. We might want to compare two telephone books and find the common contacts without compromising the actual numbers. To achieve this, we create a hash of each number and compare the hashes to find identical entries.

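package crypto;

public class Hashing {
	// Toy hash: repeatedly divides the input and feeds the quotient back into the divisor.
	public static int babyHash(int in) {
		int div = 3;
		for (int i = 0; i < 5; i++) {
			in = in / div; // integer division, truncates toward zero
			div = div + in; // add the quotient to the divisor
		}
		return div;
	}
}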

We repeatedly divide the input by a divisor and update the divisor by adding the resulting quotient. After five iterations, we return the divisor as our hash value. Of course this is far from a perfect hash function, but a handful of random inputs produce distinct values:

babyHash(5486); // 1831
babyHash(3738); // 1249
babyHash(9541); // 3183
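To make the phone book idea concrete, here is a minimal sketch of how two lists of numbers could be compared through their hashes. The class name PhoneBookDemo, the helper methods and the sample numbers are assumptions made for illustration; babyHash is copied from the snippet above so that the example is self-contained.

```java
import java.util.Set;
import java.util.TreeSet;

public class PhoneBookDemo {
	// Copy of babyHash from the article so that this sketch compiles on its own.
	public static int babyHash(int in) {
		int div = 3;
		for (int i = 0; i < 5; i++) {
			in = in / div;
			div = div + in;
		}
		return div;
	}

	// Hashes every number of a phone book (hypothetical helper).
	public static Set<Integer> hashAll(int[] numbers) {
		Set<Integer> hashes = new TreeSet<>();
		for (int number : numbers) {
			hashes.add(babyHash(number));
		}
		return hashes;
	}

	// Returns the hashes that occur in both phone books (hypothetical helper).
	public static Set<Integer> commonHashes(int[] first, int[] second) {
		Set<Integer> common = hashAll(first);
		common.retainAll(hashAll(second));
		return common;
	}

	public static void main(String[] args) {
		int[] mine = {5486, 3738, 9541};
		int[] yours = {3738, 9541, 1234};
		System.out.println(commonHashes(mine, yours)); // [1249, 3183]
	}
}
```

A matching hash only suggests a shared number, because different inputs can map to the same value: 5484, 5485 and 5486 all hash to 1831.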

Now, we submit this function to a security forum for beginners to reverse it or put it on GitHub to show off just how bad it is.

The Phone Call

About a week later, a bank calls us and tells us that they’ve used our hash function in a very security-critical component but it crashes their software for input -17. It’s not immediately clear what’s wrong with this input, which is very common for many real-world bug reports. As they repeatedly beg us to fix this bug, we might as well take a closer look at the problem.

Before we even start to debug and fix this problem, we need to make sure that we don’t break anything. So first, we use Symflower to automatically generate unit tests for the code snippet to freeze and inspect its current behavior. To do so, we simply execute the symflower command in the folder containing our code and obtain a small unit test suite. Alternatively, our Visual Studio Code extension will generate these tests automatically.

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class HashingSymflowerTest {
	@Test(expected = ArithmeticException.class)
	public void babyHash1() {
		int pin = -9;
		int actual = crypto.Hashing.babyHash(pin);
	}

	@Test
	public void babyHash2() {
		int pin = 0;
		int expected = 3;
		int actual = crypto.Hashing.babyHash(pin);

		assertEquals(expected, actual);
	}
}

The second test shows a hash value of 3 for the simplest possible input, 0. The first test already looks suspiciously similar to our reported reproducer of the bug, -17, as it also crashes the program with an arithmetic exception. After further investigation we can confirm that -9 reproduces the same problem: the first iteration divides -9 by three, obtaining -3, and the divisor update 3 + (-3) then sets the divisor to zero. The next iteration crashes the program with a division by zero error. The error chain of -17 is just one iteration longer, as the following trace of in and div after each iteration shows.

#iteration  in  div
0           -5   -2
1            2    0
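The trace above can be reproduced with a small helper. This is only a sketch: the class HashTrace and its trace method are hypothetical names, and the loop stops right before the division that would throw in the original function instead of throwing itself.

```java
import java.util.ArrayList;
import java.util.List;

public class HashTrace {
	// Records {iteration, in, div} after each loop iteration of babyHash and
	// stops right before a division by zero would happen.
	public static List<int[]> trace(int in) {
		List<int[]> rows = new ArrayList<>();
		int div = 3;
		for (int i = 0; i < 5; i++) {
			if (div == 0) {
				break; // the original babyHash throws an ArithmeticException here
			}
			in = in / div;
			div = div + in;
			rows.add(new int[] {i, in, div});
		}
		return rows;
	}

	public static void main(String[] args) {
		System.out.println("#iteration in div");
		for (int[] row : trace(-17)) {
			System.out.println(row[0] + " " + row[1] + " " + row[2]);
		}
	}
}
```

For -17 this prints the two rows of the table, and for -9 the trace ends after a single row (0 -3 0), which is why -9 is the simpler reproducer.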

To fix this bug, we could just swap the divisor and the input in the division operation: in = div / in;. This way it no longer matters if the divisor becomes zero. However, the swap does not really help: it instantly introduces a new division by zero exception whenever the input is zero, and because 3 / -17 truncates to 0 in integer arithmetic, even -17 still crashes one iteration later. Luckily, the unit tests generated by Symflower would immediately notify us of this mistake since they already include a test case where in=0. A proper fix is to explicitly catch the case where the divisor turns zero and reset it.

if (div == 0) {
	div = 3; // reset the divisor before dividing
}
in = in / div;
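For reference, here is a sketch of how the complete function might look with the guard in place, next to the rejected variant with swapped operands. The class and method names are made up for this example, and placing the guard at the top of the loop body is an assumption based on the snippet above.

```java
public class HashingFixed {
	// babyHash with the zero-divisor guard placed right before the division.
	public static int babyHashFixed(int in) {
		int div = 3;
		for (int i = 0; i < 5; i++) {
			if (div == 0) {
				div = 3; // reset the divisor instead of dividing by zero
			}
			in = in / div;
			div = div + in;
		}
		return div;
	}

	// The rejected alternative with divisor and input swapped.
	public static int babyHashSwapped(int in) {
		int div = 3;
		for (int i = 0; i < 5; i++) {
			in = div / in; // throws an ArithmeticException as soon as in is 0
			div = div + in;
		}
		return div;
	}

	public static void main(String[] args) {
		System.out.println(babyHashFixed(-17)); // 3
		System.out.println(babyHashFixed(-9)); // 2
		System.out.println(babyHashFixed(0)); // 3
		try {
			babyHashSwapped(0);
		} catch (ArithmeticException e) {
			System.out.println("swapped variant fails for input 0");
		}
	}
}
```

Inputs that never drive the divisor to zero are unaffected by the guard, so babyHashFixed(5486) still returns 1831.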

Let’s generate the unit tests one more time with Symflower:

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class HashingSymflowerTest {
	@Test
	public void babyHash1() {
		int pin = -15;
		int expected = 3;
		int actual = crypto.Hashing.babyHash(pin);

		assertEquals(expected, actual);
	}

	@Test
	public void babyHash2() {
		int pin = 0;
		int expected = 3;
		int actual = crypto.Hashing.babyHash(pin);

		assertEquals(expected, actual);
	}
}

Here we again have the simplest case, plus a unit test that now closely resembles the original bug report: for -15, just as for -17, the divisor becomes zero in the second iteration and is then reset. The input -15 is chosen over -9 so that the loop is executed at least once, which increases the coverage of the code snippet.

In case you’re interested in how Symflower covers your code, feel free to check out our blog post about code coverage types.

Roundup

Symflower can be extremely useful when it comes to finding simple reproducers for bugs. Additionally, Symflower automatically generates a unit test suite for the existing implementation, so we can be sure that fixing a bug, or even just adding a new feature, does not cause any unwanted change in behavior.

If you’ve enjoyed this little workshop please subscribe to our newsletter to be notified of any future posts on coding, testing and new features of Symflower. Feel free to share this article with your colleagues and friends via Twitter, LinkedIn or Facebook, if you want to help them make their debugging journeys less painful and more rewarding.

Technical | 2022-04-09