In this post, we evaluate test template generation with Symflower in real-life, large-scale Spring Boot projects. See how we tested Symflower on 10 applications and check out the results.
Developing tests for applications involves repetitive tasks such as importing testing frameworks, adding annotations, and initializing objects. Writing all this boilerplate code can be cumbersome and time-consuming. To lift this burden, Symflower offers a solution: automatically generating boilerplate code for tests. Using Symflower, you can focus on creating specific test scenarios and skip the repetitive hassle of writing setup code.
Spring Boot examples for test generation
Writing boilerplate code is a productivity drain for any developer. We wanted to make sure we tackled this problem for as many of us as possible. That’s why we decided to focus on Spring Boot, the most popular application framework out there.
The intricacies of testing Spring Boot applications pose unique challenges. Specific frameworks, configuration nuances, and the integration of external dependencies can complicate the task of writing boilerplate code for Spring Boot applications.
In this blog post, we’re providing fellow developers with insights that will help you decide whether generating test templates for Spring Boot is worth your while. Rather than showcasing Symflower’s test template generation on simple examples, this time we delve into its application in some real-world scenarios of Spring Boot applications.
This evaluation examines Symflower’s performance and effectiveness in large-scale projects. That should help you understand the factors that influence Symflower’s suitability for Spring Boot applications. The empirical data and practical examples herein will help you assess the potential benefits of integrating Symflower into your testing workflow (which, by the way, is very easily done whether you’re using an IDE or CLI).
Methodology: evaluating generated test templates for Spring Boot
Selecting target applications
The target applications were chosen from the [curated list of Spring Boot projects on GitHub](https://github.com/topics/spring-boot?l=java). The list was filtered to include projects written in Java and sorted by the number of stars they have. Within this pool, the following criteria guided the selection process:
- No demos or tutorials: We excluded demo and tutorial examples from our selection, as they may use code that is shorter and simpler than code that is used in daily work.
- Usage of Spring Boot: Selected projects must utilize Spring Boot as a library rather than serving as utilities or tools for Spring Boot development.
- Full application scope: We selected projects offering a complete application, including a user interface, as opposed to being solely libraries or frameworks.
- English documentation: Projects were only included if they offered documentation in English for ease of understanding and analysis.
- Active repository: Only actively maintained repositories were chosen, ensuring relevance and ongoing development activity.

We selected the top 10 most-starred projects fitting all our criteria for our experiments below.
Evaluation approach
With this experiment, we aim to evaluate Symflower’s Test Template Generation for Spring Boot applications, executed with `symflower unit-test-skeletons`. We integrated code to collect data during generation and executed it on our selected projects. The following key metrics were gathered:
- Number of Java source files: The number of Java source files in a Spring Boot application gives a first indication of its complexity and scale.
- Number of symbols: The total number of symbols (i.e. methods) analyzed during the generation process provides context for the complexity and scale of the Spring Boot applications under evaluation.
- Number of unit test files: This metric represents the total number of unit test files generated by Symflower, to provide the proportion of Java files for which we could generate test templates.
- Number of generated test templates: The number of generated test templates offers an indication of the comprehensiveness and coverage of the generated tests. This allows us to assess the breadth of test coverage supported by Symflower.
- Number of problems: The count of problems encountered during the generation process shows the potential challenges or limitations Symflower faces when generating test templates for Spring Boot applications. Such problems include imports or types that are unknown to Symflower, as well as internal errors that came up during the generation process. Problems encountered during the process do not necessarily prevent Symflower from generating unit tests.
- Source files with test files: This metric relates the number of generated test files to the number of source files in the Spring Boot projects. It highlights the share of source files with code that Symflower can provide tests for.
- Time taken for generation: This metric provides insights into the efficiency of the generation process. It helps gauge the practical feasibility of using Symflower to generate test templates for Spring Boot applications.
- Time taken per generated test: This metric divides the overall time taken by the number of generated tests, giving us an insight into how quickly Symflower can automatically create test templates for any application scale.
- Testable symbols with test templates: Calculating the ratio of the number of generated test templates to the number of testable symbols (methods) analyzed gives a quick overview of the percentage of symbol coverage within the generated test templates.
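The three derived metrics are plain ratios of the raw counts. As a minimal sketch (not Symflower's own instrumentation; the class and method names are made up for illustration), the following Java snippet recomputes them from the raw Apollo figures reported in the results table of this post, assuming the percentages are calculated from the table's count columns:

```java
import java.util.Locale;

/** Hypothetical helper that recomputes the derived metrics from raw counts. */
final class GenerationMetrics {
    /** Share of {@code part} in {@code total}, as a percentage. */
    static double percentage(long part, long total) {
        if (total == 0) {
            throw new IllegalArgumentException("total must not be zero");
        }
        return 100.0 * part / total;
    }

    /** Average generation time per test template, in seconds. */
    static double secondsPerTemplate(long totalSeconds, long templates) {
        if (templates == 0) {
            throw new IllegalArgumentException("templates must not be zero");
        }
        return (double) totalSeconds / templates;
    }

    public static void main(String[] args) {
        // Raw figures of the Apollo row in the results table.
        long sourceFiles = 487;
        long symbols = 2850;
        long testFiles = 433;
        long templates = 2778;
        long generationSeconds = 34 * 60; // 34m

        // Source files with test files: prints 88.91%
        System.out.printf(Locale.ROOT, "%.2f%%%n", percentage(testFiles, sourceFiles));
        // Time taken per generated test: prints 0.734
        System.out.printf(Locale.ROOT, "%.3f%n", secondsPerTemplate(generationSeconds, templates));
        // Testable symbols with test templates: prints 97%
        System.out.printf(Locale.ROOT, "%.0f%%%n", percentage(templates, symbols));
    }
}
```

The output matches the values listed for Apollo, which suggests the reported ratios are rounded to two, three, and zero decimal places respectively.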
By systematically collecting and analyzing these data points, we aim to provide comprehensive insights into the performance and effectiveness of Symflower’s test template generation for Spring Boot applications. Replicate the results of this evaluation by running
symflower unit-test-skeletons --code-disable-fetch-dependencies --memory-limit=0
to generate unit test templates for Spring Boot applications with Symflower. Feel free to try it with any up-to-date installation of the Symflower CLI and the cloned Git repositories.
ℹ️ Logging testing data
The execution on our side was instrumented and automated through the “test-corpus” to be able to collect more information on the generation. The test-corpus persists the results of the instrumented execution, allowing us to access and analyze the data.
Results: generating test templates for Spring Boot with Symflower
We first ran Symflower on each repository without any further configuration or arguments. This was successful for 4 of the 10 selected repositories.
The remaining 6 repositories require their submodules and subprojects to be explicitly passed to Symflower as filters. This can be quite tedious in extreme cases like Conductor, which consists of 35 Gradle subprojects.
However, these runs can still be replicated with the Symflower CLI. For example, the command to run at the root of the “Pacbot” repository would be the following:
symflower unit-test-skeletons --code-disable-fetch-dependencies --memory-limit=0 commons api jobs webapp
After the correct configuration, test template generation could be run for each of the selected repositories. Let’s see all the results:
Repository | Nr of Java Source Files | Nr of Symbols detected | Nr of Unit Test Files created | Nr of generated Unit Test Templates | Nr of Problems | Source files with test files | Time taken for Generation | Time taken per Test Case (in seconds) | Detected testable symbols with test templates |
---|---|---|---|---|---|---|---|---|---|
Apollo | 487 | 2850 | 433 | 2778 | 5971 | 88.91% | 34m | 0.734 | 97% |
Conductor | 532 | 5412 | 475 | 5146 | 11032 | 89.29% | 11m15s | 0.131 | 95% |
Poli | 63 | 428 | 62 | 428 | 448 | 98.41% | 4s | 0.009 | 100% |
Broadleaf | 2967 | 22201 | 2047 | 16710 | 21485 | 68.99% | 37s | 0.002 | 75% |
Metasfresh | 20875 | 204018 | 15912 | 136969 | 173010 | 76.23% | 9m35s | 0.004 | 67% |
Pacbot | 903 | 5778 | 781 | 5289 | 14128 | 86.49% | 69m50s | 0.792 | 92% |
Hydralab | 364 | 1864 | 299 | 1721 | 4489 | 82.14% | 3m50s | 0.134 | 92% |
Abixen | 436 | 1669 | 346 | 1610 | 4200 | 79.36% | 30m | 1.118 | 96% |
Kafka Sprout | 11 | 40 | 11 | 40 | 68 | 100% | 4s | 0.100 | 100% |
Alovoa | 481 | 144 | 481 | 101 | 1016 | 70.14% | 5s | 0.050 | 100% |
The best results in terms of coverage of the detected symbols were generated for the applications Poli, Kafka Sprout and Alovoa. However, only Kafka Sprout got a unit test file for every single Java source file. For Poli and Alovoa, this means that not every Java source file contains testable symbols. For example, there are a lot of entity classes defined in the Alovoa project source code. Those classes only contain private fields and therefore have no testable code.
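To illustrate why such a source file ends up without a test file, here is a hypothetical entity class (not taken from Alovoa; all names are invented for this sketch). It holds state in private fields but declares no methods, so there is nothing a unit test could call. The small reflection check below confirms that:

```java
/** Sketch: an entity with only private fields declares no methods to test. */
final class EntityExample {
    /** Hypothetical entity class: state only, no behavior of its own. */
    static final class UserProfile {
        private long id;
        private String displayName;
        private String email;
    }

    /** Number of methods declared directly in the class (inherited ones excluded). */
    static int declaredMethodCount(Class<?> type) {
        return type.getDeclaredMethods().length;
    }

    public static void main(String[] args) {
        // The class only has fields and the implicit default constructor.
        System.out.println(declaredMethodCount(UserProfile.class)); // prints 0
    }
}
```

A file like this still counts as a Java source file, which lowers the share of source files with test files without affecting the coverage of testable symbols.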
Most of the other applications we tested, however, yielded similarly high ratios between symbols and generated tests. Good test coverage is therefore not limited to smaller-scale applications: Symflower can generate test templates for large Spring Boot projects as well.
Does test template generation work?
How well does Symflower do in generating test templates?
Symflower generated test templates for the majority of the detected symbols in every project (67% or higher). In 8 of the 10 projects, test templates were generated for 92% of the symbols (methods) or more. The longest it took Symflower to achieve that was just over a second per test template (1.118 seconds for Abixen). In all other projects, it took less than 1 second to generate a test template.
Analyzing large-scale input of over 200,000 methods, Symflower was able to cover two-thirds of symbols (methods) with test templates.
With all of the tested applications, Symflower yielded decent results. All in all, this goes to show that Symflower can be practically applied in a large-scale Spring Boot project to generate test templates. Even with complex code, it provides a fast and simple way to generate thousands of test templates to reach higher coverage.
Who should use Symflower to generate test templates?
In general, you’ll use a single CLI command to trigger Symflower to generate test templates for Java applications using Spring Boot. If your application consists of multiple modules, you may have to pass those modules as additional arguments to that command to generate tests for the whole application in one step.
You can further optimize the results with knowledge of the application. For example, if you specify the test framework explicitly, e.g. with `--java-test-framework=JUnit5`, Symflower will skip the automatic detection of the test framework and use JUnit 5 directly, making the process even faster.
What else can be taken away from these results?
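As the data shows, execution time is not really tied to the number of symbols, the number of files within an application, or even how much output can be produced, but more to how complex the code is for Symflower to process. Broadleaf, for example, has 2967 Java source files and finished in 37 seconds, while Pacbot has 903 and took almost 70 minutes.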
The time it took to generate test templates using Symflower varied from a few seconds to just over an hour. The results should give you an idea of how fast or slow it can be to generate test templates with Symflower. Stack that up against the time you normally spend typing up boilerplate code, and see if you can beat the machine (or find a better use of your time).
On our side, we appreciate these insights into what kind of user code is still harder for Symflower to process. This will help us nail down what problems are limiting test template generation for our Spring Boot users and will help us improve Symflower to generate better tests faster.