While the concept of symbolic execution (SE) was introduced in the 1970s, it has mostly been relevant for scientists in the decades since. In essence, symbolic execution provides a way to analyze a piece of software to identify the inputs that lead to the execution of each and every path of the program.
Ongoing research into the field of symbolic execution has enabled the practical application of this awesome technology. By letting us explore all possible paths of software behavior in an efficient way, SE is greatly helpful when it comes to testing software. That’s the basic idea that Symflower is built on!
💡 Check out our core technology series
Symflower’s application of symbolic execution
Symflower automagically computes test values to generate test cases that cover all relevant paths of your source code. That way, we can achieve optimal unit test coverage to ensure your software functions correctly. All that is done automatically, without you having to write a single unit test. In this first part of our blog series on Symflower’s Core Technology, we peek under the hood to explore the inner workings of Symflower.
That’s great news: we’re hiring!
Symbolic Execution in a Nutshell
Symflower symbolically executes your code and checks in this manner all possible paths for interesting test cases. Each of these cases results in a single generated test. But how exactly does symbolic execution help Symflower do that? Let’s take a look at an example.
This simple function includes a division. Hence, a possible problem of the program is a division by zero.
func f(a int, b int) int {
x, y := 1, 0
if a != 0 {
y = x + 3
if b == 0 {
x = 2 * (a + b)
}
}
return (a + b) / (x - y)
}
We’ll symbolically execute this function to examine whether there are input values that lead to a division by zero. During normal execution, the function would receive concrete integer values as input. For symbolic execution, we assume symbolic values a0 and b0 for the two input variables a
and b
.
When we start with the symbolic execution of function f
we know nothing about a
and b
except that they each hold an integer.
The following picture depicts the state of the symbolic execution after we executed the first statement x, y := 1, 0
.
Note that a symbolic execution carries the set of conditions that hold on a specific path for each path it looks at. For instance, when variables are assigned, the corresponding path constraints are updated. These path constraints are depicted with a dashed turquoise line in the graphics above.
Next, we execute an if
statement. The path is split for statements like these as the condition of an if
statement can either be true
or false
. Path splits are marked with green in the pictures. Let’s first check the false
path for the division by zero, which is shown in the next graphics:
The next statement of the false
path of if a != 0
is already the division, which we want to check for the division by zero. This problem is triggered when x-y
equals 0. Hence, we add this as an assertion to the set of path constraints. At the heart of a symbolic execution there lies a mathematical solver that is used to determine if the path constraints can be satisfied and with which values. The request to the constraint solver is marked with dark gray in the graphics. The answer from the solver, depicted with pink, tells us that there is no solution for the input values under these constraints, meaning that there is no assignment of the input variables that actually leads to a division by zero on this path.
Let’s next investigate the true
path of if a != 0
. The following graphics depicts the symbolic execution after looking at the statement y = x + 3
. Note how the path constraints are updated accordingly.
Next, we reach another if
statement, which again requires us to split the path once more. Let’s first explore the false
path:
Once again the solver informs us that there is no solution for the constraints we have gathered for this path. In the affirmative case of the second if
condition however we finally find values for a
and b
, which are shown below:
When calling f
with these values, we actually observe a division by zero. Hence, we successfully applied symbolic execution to reveal a bug. The example might be small but we can use the same procedure for large-sized programs.
In summary, symbolic execution is a method that computes input values that trigger certain parts to be executed, e.g. for source code. Symflower utilizes symbolic execution to find execution paths through functions, gather constraints for each path, and let the constraint solver determine the input and output values to generate a broad spectrum of high-coverage tests for the source code in question. Analyzing software with program support is much more suited for the task than manual analysis. Symbolic execution is basically what a developer would perform in their head in order to find test values but more accurate than humans could ever be and significantly faster.
Try Symflower yourself!
Interested in what other test cases Symflower produces on our small example above? Download Symflower for free at get.symflower.com and generate all unit tests automagically with one command yourself.
We’d love to hear your feedback by using our community issue tracker https://github.com/symflower/symflower/issues or by dropping us a line at hello@symflower.com !
Constraint Solving
Now that you know how the method of symbolic execution works, the next question in our quest to generate test cases pops up: How did we obtain the results for the individual paths? One way is to use a satisfiability modulo theories (SMT) solver as an underlying mathematical solver to figure out a solution for the provided constraints. To get the desired values, we need to model our constraints in the SMT-LIB language and feed them to a solver such as the Z3 theorem prover by Microsoft.
Check out our blog series to learn about SMT, solvers and how to model mathematical problems for a theorem prover with our introductory blog post of this series and part two that models a well-known riddle as example to get hands-on experience.
Let’s see how the path constraints we collected in our running example are modelled as an SMT problem. For this purpose, let’s pick the path that hits the division by zero.
We are assuming that we are working on a 64-bit system, which results in Go integers having a size of 64 bit. Therefore, we choose fixed-size bitvectors of size 64 to represent the integers in our example. The type is written as (_ BitVec 64)
and contains for instance the predicates bvadd
, bvsub
and bvmul
for addition, subtraction and multiplication. We use the predicate get-value
in the end to extract the desired values from the computed model. For simplicity, the request could be omitted of course.
; Declare constants for the input variables "a" and "b".
(declare-const a_0 (_ BitVec 64))
(declare-const b_0 (_ BitVec 64))
; Assert the initial assignments of "x" and "y".
(declare-const x_0 (_ BitVec 64))
(assert (= x_0 #x0000000000000001))
(declare-const y_0 (_ BitVec 64))
(assert (= y_0 #x0000000000000000))
; Ensure that the first if condition is true and perform the reassignment of "y".
(assert (not (= a_0 #x0000000000000000)))
(declare-const y_1 (_ BitVec 64))
(assert (= y_1 (bvadd x_0 #x0000000000000003)))
; Ensure that the second if condition is true and perform the reassignment of "x".
(assert (= b_0 #x0000000000000000))
(declare-const x_1 (_ BitVec 64))
(assert (= x_1 (bvmul #x0000000000000002 (bvadd a_0 b_0))))
; Assert that the divisor is zero to obtain a division by zero.
(declare-const div_0 (_ BitVec 64))
(assert (= div_0 (bvsub x_1 y_1)))
(assert (= div_0 #x0000000000000000))
(check-sat)
(get-value (a_0 b_0))
(get-model)
Please note that x_0
and y_0
are constants and not variables, meaning only a single value can be assigned to them. Hence, new constants x_1
and y_1
are necessary when the values of the Go variables x
and y
change. Check out our next blog post of this series about static single assignment form to learn more about this kind of versioning and its benefits.
See for yourself and try the above code in a Z3 online version. You will see that the output yields the values 2 for a
and 0 for b
.
Conclusion
This blog post of the Symflower’s Core Technology blog series showed the method of symbolic execution on a real-world example. We looked in-depth at how the method is applied by hand and then picked one of the execution paths of the example to be encoded in SMT-LIB, a language to model constraints. At Symflower, we are using symbolic execution as the base of our technology. However, this is just one part of what is needed to generate high-coverage unit tests for real-world source code. Read our latest blog post of this series, where we learn why symbolic execution is the premier class of program analysis and how it stands out from competing technologies.
Ready to check out Symflower in action? Try the tool for yourself by installing the plugin for VSCode, IntelliJ, or GoLand, or access Symflower in your terminal and CI!
Sign up for our newsletter to get notified of our new blog posts about programming, Symflower and automated unit test generation. Follow us on Twitter, LinkedIn or Facebook for more insightful content.
At Symflower we develop a software that utilizes intelligent reasoning to generate unit tests fully autonomously. This means that developers are automagically supplied with high-coverage unit tests. Hence, they can concentrate on fixing found bugs and validating their algorithms instead of figuring out which tests are needed to reach an acceptable coverage. In that manner, Symflower raises a developer’s productivity and increases software quality to an unprecedented level. Try it yourself: Symflower is available for free at get.symflower.com.