Symflower's inner workings: the technology behind the curtain

Compute values for software testing using symbolic execution.

Symflower automagically computes test values to generate test cases that cover all relevant paths of your source code. Were you always wondering how Symflower comes up with these values? Then you have come to the right place! Read on to get to know the used concepts of Symflower’s inner workings in the first part of our blog series on Symflower’s Core Technology.

Symbolic Execution in a Nutshell

Symflower symbolically executes your code and checks in this manner all possible paths for interesting cases. Each of these cases results in a single generated test. But what is symbolic execution? Let’s take a look at an example.

This simple function includes a division. Hence, a possible problem of the program is a division by zero.

func f(a int, b int) int {
  x, y := 1, 0
  if a != 0 {
    y = x + 3
    if b == 0 {
      x = 2 * (a + b)
    }
  }
  return (a + b) / (x - y)
}

We’ll symbolically execute this function to examine whether there are input values that lead to a division by zero. During normal execution, the function would receive concrete integer values as input. For symbolic execution, we assume symbolic values a0 and b0 for the two input variables a and b.

When we start with the symbolic execution of function f we know nothing about a and b except that they each hold an integer.

The following picture depicts the state of the symbolic execution after we executed the first statement x, y := 1, 0.

State of symbolic execution after executing the first assignment

Note that a symbolic execution carries the set of conditions that hold on a specific path for each path it looks at. For instance when variables are assigned, the corresponding path constraints are updated. These path constraints are depicted with a dashed turquoise line in the graphics above.

Next, we execute an if statement. The path is split for statements like these as the condition of an if statement can either be true or false. Path splits are marked with green in the pictures. Let’s first check the false path for the division by zero, which is shown in the next graphics:

State of symbolic execution after executing the `false` path

The next statement of the false path of if a != 0 is already the division, which we want to check for the division by zero. This problem is triggered when x-y equals 0. Hence, we add this as an assertion to the set of path constraints. At the heart of a symbolic execution there lies a mathematical solver that is used to determine if the path constraints can be satisfied and with which values. The request to the constraint solver is marked with dark gray in the graphics. The answer from the solver, depicted with pink, tells us that there is no solution for the input values under these constraints, meaning that there is no assignment of the input variables that actually leads to a division by zero on this path.

Let’s next investigate the true path of if a != 0. The following graphics depicts the symbolic execution after looking at the statement y = x + 3. Note how the path constraints are updated accordingly.

State of symbolic execution after executing the assignment in the `true` path

Next, we reach another if statement, which again requires us to split the path once more. Let’s first explore the false path:

State of symbolic execution after executing the second `false` path

Once again the solver informs us that there is no solution for the constraints we have gathered for this path. In the affirmative case of the second if condition however we finally find values for a and b, which are shown below:

State of symbolic execution after the execution is finished

When calling f with these values, we actually observe a division by zero. Hence, we successfully applied symbolic execution to reveal a bug. The example might be small but we can use the same procedure for large-sized programs.

In summary, symbolic execution is a method that computes input values that trigger certain parts to be executed, e.g. for source code. Symflower utilizes symbolic execution to find execution paths through functions that correspond to possible program outcomes, gather constraints for each path and let the solver determine the input and output values to generate a broad spectrum of high-coverage tests for the source code in question. Analyzing software with program support is much more suited for the task than manual analysis. Symbolic execution is basically what a developer would perform in their head in order to find test values but more accurate than humans could ever be and significantly faster.

Try Symflower yourself!

Interested in what other test cases Symflower produces on our small example above? Download Symflower for free at get.symflower.com and generate all unit tests automagically with one command yourself.

We’d love to hear your feedback by using our community issue tracker https://github.com/symflower/symflower/issues or by dropping us a line at hello@symflower.com !

Constraint Solving

Now that you know how the method of symbolic execution works, the next question in our quest to generate test cases pops up: How did we obtain the results for the individual paths? One way is to use a satisfiability modulo theories (SMT) solver as an underlying mathematical solver to figure out a solution for the provided constraints. To get the desired values, we need to model our constraints in the SMT-LIB language and feed them to a solver such as the Z3 theorem prover by Microsoft.

Check out our blog series to learn about SMT, solvers and how to model mathematical problems for a theorem prover with our introductory blog post of this series and part two that models a well-known riddle as example to get hands-on experience.

Let’s see how the path constraints we collected in our running example are modelled as an SMT problem. For this purpose, let’s pick the path that hits the division by zero.

We are assuming that we are working on a 64-bit system, which results in Go integers having a size of 64 bit. Therefore, we choose fixed-size bitvectors of size 64 to represent the integers in our example. The type is written as (_ BitVec 64) and contains for instance the predicates bvadd, bvsub and bvmul for addition, subtraction and multiplication. We use the predicate get-value in the end to extract the desired values from the computed model. For simplicity, the request could be omitted of course.

; Declare constants for the input variables "a" and "b".
(declare-const a_0 (_ BitVec 64))
(declare-const b_0 (_ BitVec 64))

; Assert the initial assignments of "x" and "y".
(declare-const x_0 (_ BitVec 64))
(assert (= x_0 #x0000000000000001))
(declare-const y_0 (_ BitVec 64))
(assert (= y_0 #x0000000000000000))

; Ensure that the first if condition is true and perform the reassignment of "y".
(assert (not (= a_0 #x0000000000000000)))
(declare-const y_1 (_ BitVec 64))
(assert (= y_1 (bvadd x_0 #x0000000000000003)))

; Ensure that the second if condition is true and perform the reassignment of "x".
(assert (= b_0 #x0000000000000000))
(declare-const x_1 (_ BitVec 64))
(assert (= x_1 (bvmul #x0000000000000002 (bvadd a_0 b_0))))

; Assert that the divisor is zero to obtain a division by zero.
(declare-const div_0 (_ BitVec 64))
(assert (= div_0 (bvsub x_1 y_1)))
(assert (= div_0 #x0000000000000000))

(check-sat)
(get-value (a_0 b_0))
(get-model)

Please note that x_0 and y_0 are constants and not variables, meaning only a single value can be assigned to them. Hence, new constants x_1 and y_1 are necessary when the values of the Go variables x and y change. Check out our new blog post of this series about static single assignment form to learn more about this kind of versioning and its benefits.

See for yourself and try the above code in a Z3 online version. You will see that the output yields the values 2 for a and 0 for b.

Conclusion

This blog post of the Symflower’s Core Technology blog series showed the method of symbolic execution on a real-world example. We looked in depth how the method is applied by hand and then picked one of the execution paths of the example to be encoded in SMT-LIB, a language to model constraints. At Symflower we are using symbolic execution as the base of our technology. However, this is just one part of what is needed to generate high-coverage unit tests for real-world source code. In the next blog post of this series we learn why symbolic execution is the premier class of program analysis and how it stands out from competing technologies. Sign up for our newsletter to not miss out on the next post of this series!

Since you have read so far, you are definitely interested in the challenges that arise from autonomous software testing. Join us at Symflower by sending your CV to you@symflower.com .

At Symflower we develop a software that utilizes intelligent reasoning to generate unit tests fully autonomously. This means that developers are automagically supplied with high-coverage unit tests. Hence, they can concentrate on fixing found bugs and validating their algorithms instead of figuring out which tests are needed to reach an acceptable coverage. In that manner, Symflower raises a developer’s productivity and increases software quality to an unprecedented level. Try it yourself: Symflower is available for free at get.symflower.com.

Technology | 2021-10-07