How to reproduce and find fixes for reported software errors?

How to reproduce user-reported errors and verify fixes using Git bisect

This post walks you through our approach to addressing reported software problems with Symflower. Check out this blog post about categorizing software problems with stack traces and code diffs to learn about our comprehensive problem analysis process. In this post, we’ll focus on two essential steps:

Reproducing the problem: We will talk about the process of reproducing a reported problem in the difficult domain of program analysis.
Locating the fix with Git bisect: If the problem no longer exists in the current version, we want to make sure it was fixed. We will demonstrate how to use git bisect to pinpoint the commit where the reported problem was resolved.

Let’s dive into these steps, showing you how we tackle problem reports with our own Java test generation product Symflower.

Reproducing software problems

Analyzing source code is a tough nut to crack: the analyzed projects vary widely, and every programming language adds its own vast range of possible constructs. This complexity makes reproducing reported problems a challenge.

Here’s how we tackle this challenging task:

Executing Symflower on a pool of projects: To efficiently reproduce reported problems, we execute Symflower (the version specified in the problem report) on a curated pool of open-source projects.
Extracting encountered problems: During the execution process, we record any errors or panics that occur.
Matching problems with reported issues: Once the execution is complete, we compare the encountered problems with the issues reported by users.
Filtering exact inputs: Next, we dive deeper to pinpoint the specific parts of the input responsible for triggering the problem. This helps us better understand what went wrong and extract a reliable reproducer.
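
The matching step above can be sketched in Go. This is a minimal illustration and not Symflower’s actual tooling: the helpers panicLine and matchesReport are hypothetical names, and the sketch assumes that a problem is identified by the first output line that starts with "panic: ". The recorded outputs are taken from the example later in this post.

```go
package main

import (
	"fmt"
	"strings"
)

// panicLine returns the first line of output that starts with "panic: ".
// The second result is false if the output contains no such line.
func panicLine(output string) (string, bool) {
	for _, line := range strings.Split(output, "\n") {
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, "panic: ") {
			return line, true
		}
	}

	return "", false
}

// matchesReport reports whether output shows the same panic message as the reported one.
func matchesReport(output string, reported string) bool {
	line, ok := panicLine(output)

	return ok && line == reported
}

func main() {
	reported := "panic: runtime error: index out of range [2] with length 2"

	// Outputs recorded while running the program on a pool of inputs.
	outputs := map[string]string{
		"students.csv":           "ID: 1, Name: Smith, Age: 21\n",
		"studentsInvalidAge.csv": "panic: runtime error: invalid memory address or nil pointer dereference\n",
		"studentsMissingAge.csv": "panic: runtime error: index out of range [2] with length 2\n\ngoroutine 1 [running]:\n",
	}

	for input, output := range outputs {
		if matchesReport(output, reported) {
			fmt.Println("reproducer found:", input)
		}
	}
}
```

Comparing only the panic message keeps the sketch short. A real comparison would likely also look at the stack trace, since unrelated problems can share a message.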

For this example, we’re using a program that parses students from a CSV file. We have reports that in some versions of our program, users encountered a panic:

panic: runtime error: index out of range [2] with length 2

goroutine 1 [running]:
main.parseCSV({0x7fff3df673b7?, 0x7f5a5878fb08?})
       src/students/students.go:63 +0x4b7
main.main()
       src/students/students.go:82 +0x3f
exit status 2

This panic was reported for version v1.1 of the program. So we check out this version in the Git repository and try different input data (from a pool of test data) to see if we can reproduce the panic.

git checkout v1.1
go run src/students/students.go students.csv

Here are the contents of students.csv:

ID,Name,Age
1,Smith,21
2,Jones,22
3,Brown,21
4,Davies,24
5,Williams,21

This simply prints out the data from the CSV file and does not show the reported panic.

ID: 1, Name: Smith, Age: 21
ID: 2, Name: Jones, Age: 22
ID: 3, Name: Brown, Age: 21
ID: 4, Name: Davies, Age: 24
ID: 5, Name: Williams, Age: 21

We have to keep looking. Let’s try to reproduce the problem by trying out different CSV files:

go run src/students/students.go studentsInvalidAge.csv

Here’s the data in studentsInvalidAge.csv:

ID,Name,Age
1,Smith,21Years
2,Jones,22
3,Brown,21
4,Davies,24
5,Williams,21

By running the program with this data, we can produce the following output. This shows a panic, but it’s different from the one that was reported. This one happens because the age of students has to be a number:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x482a3f]

goroutine 1 [running]:
main.parseCSV({0x7fffd2d783bf?, 0x7fe3d3f83b08?})
       src/students/students.go:63 +0x3bf
main.main()
       src/students/students.go:82 +0x3f
exit status 2

This points to another problem with the program that should be addressed. We record it so that we can check later whether it still occurs in the current version. It’s nice to have discovered an input that produces a panic, but it is not the reproducer we are after, so we have to keep searching for the exact problem reported.

By running the program with another file as input, we can reproduce the same panic we’re looking for. Hooray!

go run src/students/students.go studentsMissingAge.csv

This is what we’ll find in studentsMissingAge.csv:

ID,Name
1,Smith
2,Jones
3,Brown
4,Davies
5,Williams

The CSV file used here is missing the age column, so parsing it produces the index out of range error. Apart from that, it is identical to the working student data, which makes it a good reproducer.
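
To see why a missing column leads to exactly this message, here is a minimal sketch. It assumes that the program reads the age as the third field of each record, which matches the message but is not copied from the v1.1 source. The helper ageColumn is a hypothetical name, and it recovers from the panic only so that the message can be printed.

```go
package main

import "fmt"

// ageColumn returns the third column of a CSV record.
// It converts a runtime panic into an error so that the message can be inspected.
func ageColumn(record []string) (age string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()

	return record[2], nil
}

func main() {
	// A record without the age column has only two fields.
	_, err := ageColumn([]string{"1", "Smith"})
	fmt.Println(err) // runtime error: index out of range [2] with length 2

	// A complete record has three fields and works as expected.
	age, _ := ageColumn([]string{"1", "Smith", "21"})
	fmt.Println(age) // 21
}
```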

In this simple example, we managed to find the correct data to reproduce the panic fairly easily, helped by the error message of the reported panic. However, with more complex programs and inputs, such as large software projects, finding a reproducer can be considerably more difficult.

Next, we want to find out if we already have a fix for this problem in the current version of our program.

Finding fixes in the current version

To address reported panics that no longer exist in the current version of our program, we use a systematic approach that relies on Git and testing scripts. Here’s how we do it:

1. Verify that the panic no longer exists in the current version

First, we switch to the latest version of our program and run it with the input data that previously triggered the panic to ensure that the issue is now handled gracefully, producing an error message instead of a panic:

git checkout master
go run src/students/students.go studentsMissingAge.csv

ERROR: Internal error: Line does not have the expected number of columns.:
Stacktrace: goroutine 1 [running]:
main.NewInternalError({0x4be258?, 0xc000030050})
       src/students/students.go:20 +0x49
main.parseCSV({0x7fffdd23e3b8?, 0x100000000491e80?})
       src/students/students.go:61 +0x4cb
main.main()
       src/students/students.go:91 +0x3f

exit status 1

The program now produces an error message without panicking. This indicates that the issue has been fixed. But we need to further verify this and store a reference to the fix with the reported problem in case it is reported again in the future.

2. Use git bisect to find the fix

To identify the commit that fixed the panic, we use the git bisect command. This command performs a binary search through the commit history and is commonly used to pinpoint the commit that introduced a bug. However, we can also use it to find the commit that fixed a problem.

Below, we see the commit history of our program. We know that for v1.1 we can produce the panic, but for the current version, we cannot. git bisect helps us pinpoint the exact commit that changed that behavior:

* 536a8a9 (HEAD -> master, tag: v1.5) Avoid panic in main function
* 217035d (tag: v1.4) Check column numbers to avoid panic
* b1479f2 (tag: v1.3) Validate that the ID conforms to our standards
* 80be32f (tag: v1.2, avoid-age-NAN) Improve error handling for age parsing
* f200542 Avoid nil pointer dereference for student age
* fceb8a3 (tag: v1.1, parse-student-age) Parse student age
* 160f558 (tag: v1.0) Initial commit for students CSV parser.
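
To illustrate the idea behind the binary search, here is a minimal sketch over the commits listed above. It is not how git bisect is implemented: firstFixed is a hypothetical helper, the history is assumed to be linear, and the callback that decides whether a commit still shows the panic is hard-coded instead of actually checking out and running each commit.

```go
package main

import "fmt"

// firstFixed returns the first commit at which reproduces returns false.
// commits must be ordered oldest first. The problem must reproduce at the
// first commit, must not reproduce at the last one, and must stay fixed once it is gone.
func firstFixed(commits []string, reproduces func(commit string) bool) string {
	// Invariant: the problem reproduces at lo and does not reproduce at hi.
	lo, hi := 0, len(commits)-1
	for hi-lo > 1 {
		mid := lo + (hi-lo)/2
		if reproduces(commits[mid]) {
			lo = mid
		} else {
			hi = mid
		}
	}

	return commits[hi]
}

func main() {
	// Commits from v1.1 to master, oldest first.
	commits := []string{"fceb8a3", "f200542", "80be32f", "b1479f2", "217035d", "536a8a9"}

	// Stand-in for running the test script on a checked-out commit:
	// the panic is assumed to be gone from index 4 (217035d) onward.
	index := map[string]int{}
	for i, commit := range commits {
		index[commit] = i
	}
	tested := 0
	reproduces := func(commit string) bool {
		tested++

		return index[commit] < 4
	}

	fmt.Println(firstFixed(commits, reproduces), "found after testing", tested, "commits")
}
```

With six candidate commits, the sketch needs three test runs instead of checking every commit in order. The saving grows with the length of the history.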

To use git bisect, we need a test script to determine if the panic still exists or not. So we create the script reproduce.go which executes the program with the problematic input data and checks for the panic message:

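package main

import (
	"bytes"
	"os"
	"os/exec"
	"strings"
)

// reportedPanic is the message of the panic we want to reproduce.
const reportedPanic = "panic: runtime error: index out of range [2] with length 2"

func main() {
	// Run the program with the problematic input and capture stdout and stderr together.
	cmd := exec.Command("go", "run", "src/students/students.go", "studentsMissingAge.csv")
	buf := new(bytes.Buffer)
	cmd.Stdout = buf
	cmd.Stderr = buf
	// The program is expected to fail, so the error returned by Run is deliberately ignored.
	_ = cmd.Run()

	// Exit with a non-zero code if the reported panic did not occur.
	if !strings.Contains(buf.String(), reportedPanic) {
		os.Exit(1)
	}
}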

The script should exit with a non-zero code if we cannot reproduce the panic, and with zero if we can. git bisect uses the exit code to tell the two behaviors apart.
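
git bisect run gives some exit codes a special meaning: 0 counts as good, 125 means the commit cannot be tested and should be skipped, and any other code from 1 to 127 counts as bad. The following sketch shows how a test script could make use of that. The function bisectExitCode and its buildFailed flag are hypothetical and not part of reproduce.go. They assume the caller has already tried to build the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// reportedPanic is the message of the panic we want to reproduce.
const reportedPanic = "panic: runtime error: index out of range [2] with length 2"

// bisectExitCode maps the result of one test run to an exit code for "git bisect run".
// buildFailed is a hypothetical flag that the caller sets when the commit does not compile.
func bisectExitCode(buildFailed bool, output string) int {
	switch {
	case buildFailed:
		return 125 // The commit cannot be tested, so git bisect should skip it.
	case strings.Contains(output, reportedPanic):
		return 0 // The reported panic was reproduced.
	default:
		return 1 // The reported panic did not occur.
	}
}

func main() {
	fmt.Println(bisectExitCode(false, reportedPanic+"\n\ngoroutine 1 [running]:")) // 0
	fmt.Println(bisectExitCode(false, "ERROR: Internal error: Line does not have the expected number of columns.:")) // 1
	fmt.Println(bisectExitCode(true, "")) // 125
}
```

Skipping untestable commits keeps a broken intermediate commit from being mistaken for the fix, at the cost of a less precise result if the skipped commit is next to the real one.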

3. Perform Git bisect

We need to be able to use our script on every commit in the Git history. If it were checked in as part of the repository, it would change, or be missing entirely, as git bisect checks out different commits. So we first build it to have a consistent test script:

go build reproduce.go

To run git bisect we now perform the following steps:

git checkout master # Check out the version without the problem.
git bisect start

./reproduce         # Should not reproduce the panic.
echo $?             # Last exit code should be != 0.
git bisect bad      # Mark the current commit as bad (here: the panic is gone).

git checkout v1.1   # Check out the version with the problem.
./reproduce         # Should reproduce the panic.
echo $?             # Last exit code should be == 0.
git bisect good     # Mark the current commit as good (here: the panic still occurs).

git bisect run ./reproduce

The output of running this contains the following line:

217035d00316d3893e4a9221ef82f3150f078295 is the first bad commit

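That’s it! Because we marked the state without the panic as bad, the “first bad commit” reported by git bisect is the first commit in which the panic no longer occurs. That should be the commit that fixes the problem. (If the swapped meaning of good and bad is confusing, git bisect start accepts the --term-old and --term-new options to choose other labels.)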

We can now stop bisecting by executing git bisect reset.

4. Verify the fix

We should always check to make sure that the commit found with git bisect really does fix the problem. Looking at the commit, we find that it introduced code that fixes the panic by checking if the line in the CSV file has the right number of columns. It performs the correct error handling in case there are too few columns.

To double-check, we can manually verify that we cannot reproduce the panic with the commit:

git checkout 217035d00316d3893e4a9221ef82f3150f078295
go run src/students/students.go studentsMissingAge.csv
git checkout b1479f2   # Check out the commit before the fix.
go run src/students/students.go studentsMissingAge.csv

The found commit no longer produces the reported panic, but the commit before it still does. This verifies that the commit fixes the panic.

⚠️ Challenges with this problem-finding process

If you want to use this process to find fixes, you’ll need a reproducer that reliably triggers the problem in every version that still contains it. If, for instance, the reproducer fails to trigger the problem at some commit even though the problem still exists there, git bisect will falsely report that commit as the fix.

OK, back to our example. Let’s say we call reproduce.go with different data to produce the panic: studentsInvalidId.csv. Like the input data above, this file is missing the age column, but it also uses a different format for the student IDs:

ID,Name
#1,Smith
#2,Jones
#3,Brown
#4,Davies
#5,Williams

We can still reproduce the panic in v1.1 but not in the current version. The error messages in these two versions are exactly the same as with the data used above. However, if we now run git bisect with this data, we get a different commit as a result:

b1479f2648e816159b5e5c807dd695dcb69454cb is the first bad commit

We can manually execute this version with the input data and confirm that the reported panic is not triggered. Instead, the program stops with a different panic:

git checkout b1479f2648e816159b5e5c807dd695dcb69454cb
go run src/students/students.go studentsInvalidId.csv

panic: Internal error: Not a valid ID: #1.:
Stacktrace: goroutine 1 [running]:
main.NewInternalError({0x4ebaf8, 0xc00009e060})
       /home/stefan/students/students.go:21 +0x45
main.parseCSV({0x7ffc22082f22?, 0xc0000a4ee8?})
       /home/stefan/students/students.go:62 +0x5b8
main.main()
       /home/stefan/students/students.go:92 +0x3f


goroutine 1 [running]:
main.main()
       /home/stefan/students/students.go:94 +0x14b
exit status 2

This commit does not fix the panic. However, it introduces a check for student ID, which is violated with this input data. Because this check happens earlier in the execution than retrieving the age, it stops the panic from being triggered.
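
The masking effect can be sketched as follows. The regular expression is the ID check from the example program. Everything else is a simplified assumption: firstFailure is a hypothetical helper that only names the first problem a record runs into when the ID is validated before the age column is read.

```go
package main

import (
	"fmt"
	"regexp"
)

// idPattern is the ID check used by the example program.
var idPattern = regexp.MustCompile(`^[a-zA-Z0-9]+$`)

// firstFailure names the first problem a CSV record runs into when the ID is
// validated before the age column is read. It returns "" if there is none.
func firstFailure(record []string) string {
	if len(record) == 0 || !idPattern.MatchString(record[0]) {
		return "invalid ID"
	}
	if len(record) < 3 {
		// This is the point where reading the age would go out of range.
		return "missing age column"
	}

	return ""
}

func main() {
	// Record from studentsInvalidId.csv: the ID check fires before the age is read.
	fmt.Println(firstFailure([]string{"#1", "Smith"})) // invalid ID

	// Record from studentsMissingAge.csv: only the missing column is a problem.
	fmt.Println(firstFailure([]string{"1", "Smith"})) // missing age column
}
```

A reproducer that fails only the check under investigation, like studentsMissingAge.csv, avoids this kind of masking.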

This simple example highlights the importance of finding input data that only triggers the problem being analyzed. Using a simple CSV file in our example makes this seem trivial, but in more complex applications, analyzing convoluted data can become quite a challenge.

Room for improvement

We use this process at Symflower to address the problems reported by our users, and we are reasonably happy with it. However, there is still potential for further improvement.

For instance, it would be possible to automate the identification of problem reproducers and to catalog all problems found through executing Symflower on open-source projects. Finding ways to automatically (and reliably) extract code that reproduces problems is an open challenge, and solving it could greatly improve the efficiency of our problem-fixing process. In particular, it would help ensure that, for every commit since the reported version, the reproducer triggers only the reported problem and no other error that could mask it and make it look fixed.

By applying our process for analyzing reported problems and finding reproducers, we found that our approach for comparing stack traces still leads to duplicates in our problem storage. Differences in stack traces between versions can come from changes in how Symflower is calling the analysis or in the error handling, and not any changes in the actual analysis code.

Improving this would help in multiple ways. First, it reduces the number of problems that our developers need to address. Second, it could be more reliably used to find reproducers that can be used across different versions. Finally, we could further automate identifying fixes by being able to compare stack traces over versions more reliably.
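
One possible direction is to normalize stack traces before comparing them. The following is only a sketch of that idea and not how Symflower compares traces: the hypothetical helper frames keeps the function names of a Go stack trace and drops argument values, file paths, line numbers, and offsets, which tend to differ between versions.

```go
package main

import (
	"fmt"
	"strings"
)

// frames extracts the function names from a Go stack trace, dropping
// argument values, file paths, line numbers, and program counter offsets.
func frames(trace string) []string {
	var names []string
	for _, line := range strings.Split(trace, "\n") {
		// Function lines are not indented and end with an argument list in parentheses.
		if line == "" || line[0] == ' ' || line[0] == '\t' || !strings.HasSuffix(line, ")") {
			continue
		}
		// The last opening parenthesis starts the argument list.
		if i := strings.LastIndex(line, "("); i > 0 {
			names = append(names, line[:i])
		}
	}

	return names
}

func main() {
	// Two traces from this post: same call path, different argument values and offsets.
	first := "goroutine 1 [running]:\nmain.parseCSV({0x7fff3df673b7?, 0x7f5a5878fb08?})\n       src/students/students.go:63 +0x4b7\nmain.main()\n       src/students/students.go:82 +0x3f\n"
	second := "goroutine 1 [running]:\nmain.parseCSV({0x7fffd2d783bf?, 0x7fe3d3f83b08?})\n       src/students/students.go:63 +0x3bf\nmain.main()\n       src/students/students.go:82 +0x3f\n"

	fmt.Println(strings.Join(frames(first), " <- "))
	fmt.Println(strings.Join(frames(first), " <- ") == strings.Join(frames(second), " <- "))
}
```

Note that these two traces belong to different panics in the example above, so function names alone are too coarse. A usable key would combine them with the panic message at the very least.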

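For reference, this is the git blame output for students.go in the current version, showing which commit last changed each line:

^160f558 (  1) package main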
^160f558 (  2)  
^160f558 (  3) import (
^160f558 (  4)     "encoding/csv"
^160f558 (  5)     "fmt"
^160f558 (  6)     "os"
b1479f26 (  7)     "regexp"
^160f558 (  8)     "runtime"
fceb8a33 (  9)     "strconv"
^160f558 ( 10) )
^160f558 ( 11)  
^160f558 ( 12) // InternalError represents an internal error.
^160f558 ( 13) type InternalError struct {
^160f558 ( 14)     Err        error
^160f558 ( 15)     StackTrace string
^160f558 ( 16) }
^160f558 ( 17)  
^160f558 ( 18) // NewInternalError instantiates a new internal error.
^160f558 ( 19) func NewInternalError(err error) *InternalError {
^160f558 ( 20)     buf := make([]byte, 1<<16)
^160f558 ( 21)     stackSize := runtime.Stack(buf, false)
^160f558 ( 22)  
^160f558 ( 23)     return &InternalError{
^160f558 ( 24)         Err:        err,
^160f558 ( 25)         StackTrace: string(buf[:stackSize]),
^160f558 ( 26)     }
^160f558 ( 27) }
^160f558 ( 28)  
^160f558 ( 29) // Error returns a string representation of the error.
^160f558 ( 30) func (e *InternalError) Error() string {
^160f558 ( 31)     return fmt.Sprintf("Internal error: %s:\nStacktrace: %v", e.Err, e.StackTrace)
^160f558 ( 32) }
^160f558 ( 33)  
^160f558 ( 34) // Student struct to represent each student's information.
^160f558 ( 35) type Student struct {
^160f558 ( 36)     ID   string
^160f558 ( 37)     Name string
fceb8a33 ( 38)     Age  int
^160f558 ( 39) }
^160f558 ( 40)  
^160f558 ( 41) // parseCSV parses the CSV file and returns a slice of Student structs.
^160f558 ( 42) func parseCSV(filePath string) (students []*Student, err error) {
^160f558 ( 43)     file, err := os.Open(filePath)
^160f558 ( 44)     if err != nil {
^160f558 ( 45)         return nil, NewInternalError(err)
^160f558 ( 46)     }
^160f558 ( 47)     defer file.Close()
^160f558 ( 48)  
^160f558 ( 49)     reader := csv.NewReader(file)
^160f558 ( 50)     reader.Comma = ','
^160f558 ( 51)     lines, err := reader.ReadAll()
^160f558 ( 52)     if err != nil {
^160f558 ( 53)         return nil, NewInternalError(err)
^160f558 ( 54)     }
^160f558 ( 55)  
^160f558 ( 56)     for i, line := range lines {
^160f558 ( 57)         if i == 0 {
^160f558 ( 58)             continue
^160f558 ( 59)         }
^160f558 ( 60)  
217035d0 ( 61)         if len(line) < 3 {
217035d0 ( 62)             return nil, NewInternalError(fmt.Errorf("Line does not have the expected number of columns."))
217035d0 ( 63)         }
217035d0 ( 64)  
b1479f26 ( 65)         if !regexp.MustCompile(`^[a-zA-Z0-9]+$`).MatchString(line[0]) {
b1479f26 ( 66)             return nil, NewInternalError(fmt.Errorf("Not a valid ID: %s.", line[0]))
b1479f26 ( 67)         }
b1479f26 ( 68)  
80be32f8 ( 69)         age, err := parseAge(line[2])
80be32f8 ( 70)         if err != nil {
80be32f8 ( 71)             return nil, err
80be32f8 ( 72)         }
80be32f8 ( 73)  
^160f558 ( 74)         students = append(students, &Student{
^160f558 ( 75)             ID:   line[0],
^160f558 ( 76)             Name: line[1],
80be32f8 ( 77)             Age:  age,
^160f558 ( 78)         })
^160f558 ( 79)     }
^160f558 ( 80)  
^160f558 ( 81)     return students, nil
^160f558 ( 82) }
^160f558 ( 83)  
fceb8a33 ( 84) // parseAge returns the age from the string in the csv file.
80be32f8 ( 85) func parseAge(ageString string) (age int, err error) {
fceb8a33 ( 86)     a, err := strconv.Atoi(ageString)
fceb8a33 ( 87)     if err != nil {
80be32f8 ( 88)         return 0, NewInternalError(err)
fceb8a33 ( 89)     }
fceb8a33 ( 90)  
80be32f8 ( 91)     return a, nil
fceb8a33 ( 92) }
fceb8a33 ( 93)  
^160f558 ( 94) func main() {
^160f558 ( 95)     filePath := os.Args[1]
^160f558 ( 96)     students, err := parseCSV(filePath)
^160f558 ( 97)     if err != nil {
536a8a9a ( 98)         fmt.Fprintf(os.Stderr, "ERROR: %s\n", err.Error())
536a8a9a ( 99)         os.Exit(1)
^160f558 (100)     }
^160f558 (101)  
^160f558 (102)     for _, student := range students {
fceb8a33 (103)         fmt.Printf("ID: %s, Name: %s, Age: %d\n", student.ID, student.Name, student.Age)
^160f558 (104)     }
^160f558 (105) }

So that’s how we find problem reproducers and their fixes at Symflower. Have any thoughts about this process or ideas on how to improve it? Let us know so we can update our practice and this post!

| 2024-06-10