Using code dependency analysis to decide what to test
===================

By [Patrick Kusebauch](https://github.com/patrickkusebauch)

> [!IMPORTANT]
> Find out how to save 90+% of your test runtime and resources by eliminating 90+% of your tests while keeping your test
> coverage and confidence. Save over 40% of your CI pipeline runtime overall.

## Introduction

Tests are expensive to run, and the larger the code base, the more expensive it becomes to run them all. At some point your test runtime might even become so long that it is impossible to run them all on every commit, because your rate of incoming commits is higher than your ability to test them. But how else can you have confidence that your introduced changes have not broken some existing code?

Even if your situation is not that dire yet, the time it takes to run tests makes it hard to get fast feedback on your changes. It might even force you to compromise on other development techniques. You might lump several changes into larger commits, because there is no time to test each small individual change (like type fixing, refactoring, documentation etc.). You might like to do trunk-based development, but have feature branches instead, so that you can open PRs and test a whole slew of changes all at once. Your DORA metrics are compromised by your slow rate of development. Instead of being reactive to customer needs, you have to plan your projects and releases months in advance because that's how often you are able to fully test all the changes.

Slow testing can have huge consequences for what the whole development process looks like. While speeding up test execution per se is a very individual problem in every project, there is another technique that can be applied everywhere. You have to become more picky about what tests to run. So how do you decide what to test?

## Theory

### What is code dependency analysis?
Code dependency analysis is the process of (usually statically) analysing the code to determine what code is used by other code. The most common example of this is analysing the specified dependencies of a project to determine potential vulnerabilities. This is what tools like [OWASP Dependency Check](https://owasp.org/www-project-dependency-check/) do. Another use case is to generate a Software Bill of Materials (SBOM) for a project.

There is one other use case that not many people talk about. That is using code dependency analysis to create a Directed Acyclic Graph (DAG) of the various components/modules/domains of a project. This DAG can then be used to determine how changes to one component will affect other components.

Imagine you have a project with the following structure of components:

![Project Structure](Images/day15-01.png)

The `Supportive` component depends on the `Analyser` and `OutputFormatter` components. The `Analyser` in turn depends on 3 other components - `Ast`, `Layer` and `References`. Lastly, `References` depends on the `Ast` component.

If you make a change to the `OutputFormatter` component you will want to run the **contract tests** for `OutputFormatter` and **integration tests** for `Supportive` but no tests for `Ast`. If you make changes to `References` you will want to run the **contract tests** for `References`, **integration tests** for `Analyser` and `Supportive` but no tests for `Layer` or `OutputFormatter`. In fact, there is no one module that you can change that would require you to run all the tests.

> [!NOTE]
> By **contract tests** I mean tests that test the defined API of the component. In other words what the component
> promises (by contract) to the outside users to always be true about the usage of the component. Such a test mocks out
> all outside interaction with any other component.
>
> By contrast, **integration tests** in this context mean tests that test that the interaction with a dependent
> component is properly programmed. For that reason the underlying (dependent) component is not mocked out.

### How do you create the dependency DAG?

There are very few tools that can do this as of today, even though the concept is very simple. So simple you can do it yourself if there is no tool available for your language of choice. You need to lex and parse the code to create an Abstract Syntax Tree (AST) and then walk the AST of every file to find the dependencies. This is the same work your IDE does any time you use "Find references..." and what your language server sends over [LSP (Language Server Protocol)](https://en.wikipedia.org/wiki/Language_Server_Protocol). You group the dependencies by predefined components/modules/domains, and then combine all the dependencies into a single graph.

### How do you use the DAG to decide what to test?

Once you have the DAG there is a 4-step process to run your testing:

1. Get the list of changed files (for example by running `git diff`)
2. Feed the list to the dependency analysis tool to get the list of changed components (and optionally the list of depending components as well for integration testing)
3. Feed the list to your testing tool of choice to run the test-suites corresponding to each changed component
4. Revel in how much time you have saved on testing.

## Practice

This is not just some theoretical idea, but rather something you can try out yourself today. If you are lucky, there is already an open-source tool in your language of choice that lets you do it today. If you are not, the following demonstration will give you enough guidance to write it yourself. If you do, please let me know, I would love to see it.

The tool that I have used today for demonstration is [deptrac](https://qossmic.github.io/deptrac/), and it is written in PHP and for PHP.
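
For readers without a ready-made tool, steps 1 and 2 of the process can be sketched in a few lines of Python. This is a minimal illustration and not how deptrac works internally: the graph is written out by hand to mirror the example project from the Theory section (a real tool would derive it by walking the AST), and the `src/<Component>/...` layout, the file name `Resolver.php`, and the helper names `component_of` and `affected_components` are all assumptions made for the example.

```python
from collections import defaultdict, deque

# Hand-written graph mirroring the example project: component -> components it uses.
# A real tool would build this by walking the AST of every file.
DEPENDS_ON = {
    "Supportive": {"Analyser", "OutputFormatter"},
    "Analyser": {"Ast", "Layer", "References"},
    "References": {"Ast"},
    "OutputFormatter": set(),
    "Ast": set(),
    "Layer": set(),
}


def component_of(path):
    """Map a path like 'src/Ast/Parser.php' to its component, or None if it has none.

    Assumes the hypothetical layout src/<Component>/<file>.
    """
    parts = path.split("/")
    if len(parts) >= 3 and parts[0] == "src" and parts[1] in DEPENDS_ON:
        return parts[1]
    return None


def affected_components(changed_files, depends_on=DEPENDS_ON):
    """Return (changed, dependents).

    changed    -- components containing a changed file (run their contract tests)
    dependents -- components that directly or transitively depend on a changed
                  component (run their integration tests)
    """
    # Invert the edges so we can walk from a component to its users.
    used_by = defaultdict(set)
    for component, dependencies in depends_on.items():
        for dependency in dependencies:
            used_by[dependency].add(component)

    changed = {c for c in map(component_of, changed_files) if c is not None}

    # Breadth-first walk up the inverted graph.
    dependents = set()
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for user in used_by[current]:
            if user not in changed and user not in dependents:
                dependents.add(user)
                queue.append(user)
    return changed, dependents


if __name__ == "__main__":
    changed, dependents = affected_components(["src/References/Resolver.php"])
    print("contract tests:", sorted(changed))
    print("integration tests:", sorted(dependents))
```

For a change in `References` the sketch selects `References` for contract tests and `Analyser` and `Supportive` for integration tests, which matches the walkthrough of the example graph earlier in the article.
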
With deptrac, all you have to do to create a DAG is to specify the modules/domains in its configuration file:

```yaml
# deptrac.yaml
deptrac:
  paths:
    - src
  layers:
    - name: Analyser
      collectors:
        - type: directory
          value: src/Analyser/.*
    - name: Ast
      collectors:
        - type: directory
          value: src/Ast/.*
    - name: Layer
      collectors:
        - type: directory
          value: src/Layer/.*
    - name: References
      collectors:
        - type: directory
          value: src/References/.*
    - name: Contract
      collectors:
        - type: directory
          value: src/Contract/.*
```

### The 4-step process

Once you have the DAG you can combine it with the list of changed files to determine what modules/domains to test. A simple git command will give you the list of changed files:

```bash
git diff --name-only
```

You can then use this list to find the modules/domains that have changed and then use the DAG to find the modules that depend on those modules.

```bash
# to get the list of changed components
git diff --name-only | xargs php deptrac.php changed-files

# to get the list of changed modules with the depending components
git diff --name-only | xargs php deptrac.php changed-files --with-dependencies
```

If you pick the popular PHPUnit framework for your testing and follow [their recommendation for organizing code](https://docs.phpunit.de/en/10.5/organizing-tests.html), it will be very easy for you to create a test-suite per component.
To run a test for a component you just have to pass the parameter `--testsuite {componentName}` to the PHPUnit executable:

```bash
git diff --name-only |\
  xargs php deptrac.php changed-files |\
  sed 's/;/ --testsuite /g; s/^/--testsuite /g' |\
  xargs ./vendor/bin/phpunit
```

Or, if you have integration tests for the dependent modules and decide to name your integration test-suites `{componentName}Integration`:

```bash
git diff --name-only |\
  xargs php deptrac.php changed-files --with-dependencies |\
  sed '1s/;/ --testsuite /g; 2s/;/Integration --testsuite /g; /./ { s/^/--testsuite /; 2s/$/Integration/; }' |\
  sed ':a;N;$!ba;s/\n/ /g' |\
  xargs ./vendor/bin/phpunit
```

### Real life comparison results

I have run the following script on a set of changes to compare what the savings were:

```shell
# Compare timing
iterations=10

# Baseline: run the full test suite
total_time_with=0
for ((i = 1; i <= $iterations; i++)); do
    # Run the command
    runtime=$(
        TIMEFORMAT='%R'
        time (./vendor/bin/phpunit >/dev/null 2>&1) 2>&1
    )
    # %R reports seconds; normalise a locale decimal comma before converting to ms
    seconds=$(echo "$runtime" | tr ',' '.')
    total_time_with=$(echo "$total_time_with + $seconds * 1000" | bc)
done

average_time_with=$(echo "$total_time_with / $iterations" | bc)
echo "Average time (not using deptrac): $average_time_with ms"

# Compare test coverage
tests_with=$(./vendor/bin/phpunit | grep -oP 'OK \(\K\d+')
echo "Executed tests (not using deptrac): $tests_with tests"
echo ""

# With deptrac: analyse the changed files, then run only the selected test-suites
total_time_without=0
for ((i = 1; i <= $iterations; i++)); do
    # Run the command
    runtime=$(
        TIMEFORMAT='%R'
        time (
            git diff --name-only |
                xargs php deptrac.php changed-files --with-dependencies |
                sed '1s/;/ --testsuite /g; 2s/;/Integration --testsuite /g; /./ { s/^/--testsuite /; 2s/$/Integration/; }' |
                sed ':a;N;$!ba;s/\n/ /g' |
                xargs ./vendor/bin/phpunit >/dev/null 2>&1
        ) 2>&1
    )
    seconds=$(echo "$runtime" | tr ',' '.')
    total_time_without=$(echo "$total_time_without + $seconds * 1000" | bc)
done

average_time_without=$(echo "$total_time_without / $iterations" | bc)
echo "Average time (using deptrac): $average_time_without ms"

# Run once more with output captured to read PHPUnit's own test count and runtime
tests_execution_without=$(
    git diff --name-only |
        xargs php deptrac.php changed-files --with-dependencies |
        sed '1s/;/ --testsuite /g; 2s/;/Integration --testsuite /g; /./ { s/^/--testsuite /; 2s/$/Integration/; }' |
        sed ':a;N;$!ba;s/\n/ /g' |
        xargs ./vendor/bin/phpunit
)
tests_without=$(echo "$tests_execution_without" | grep -oP 'OK \(\K\d+')
tests_execution_without_time=$(echo "$tests_execution_without" | grep -oP 'Time: 00:\K\d+\.\d+')
echo "Executed tests (using deptrac): $tests_without tests"

execution_time=$(echo "$tests_execution_without_time * 1000" | bc | awk '{gsub(/\.?0+$/, ""); print}')
echo "Time to find tests to execute (using deptrac): $(echo "$average_time_without - $tests_execution_without_time * 1000" | bc | awk '{gsub(/\.?0+$/, ""); print}') ms"
echo "Time to execute tests (using deptrac): $execution_time ms"
echo ""

percentage=$(echo "scale=3; $tests_without / $tests_with * 100" | bc | awk '{gsub(/\.?0+$/, ""); print}')
echo "Percentage of tests not needing execution given the changed files: $(echo "100 - $percentage" | bc)%"
percentage=$(echo "scale=3; $execution_time / $average_time_with * 100" | bc | awk '{gsub(/\.?0+$/, ""); print}')
echo "Time saved on testing: $(echo "$average_time_with - $execution_time" | bc) ms ($(echo "100 - $percentage" | bc)%)"
percentage=$(echo "scale=3; $average_time_without / $average_time_with * 100" | bc | awk '{gsub(/\.?0+$/, ""); print}')
echo "Time saved overall: $(echo "$average_time_with - $average_time_without" | bc) ms ($(echo "100 - $percentage" | bc)%)"
```

with the following results:

```
Average time (not using deptrac): 984 ms
Executed tests (not using deptrac): 721 tests

Average time (using deptrac): 559 ms
Executed tests (using deptrac): 21 tests
Time to find tests to execute (using deptrac): 491 ms
Time to execute tests (using deptrac): 68 ms

Percentage of tests not needing execution given the changed files: 97.1%
Time saved on testing: 916 ms (93.1%)
Time saved overall: 425 ms (43.2%)
```

Some interesting observations:

- Only **3% of the tests** that normally run on the PR needed to be run to cover the change with tests. That is a **saving of 700 tests** in this case.
- **Test execution time has decreased by 93%**. You are mostly left with the constant cost of set-up and tear-down of the testing framework.
- **Pipeline overall time has decreased by 43%**. Since the analysis time grows orders of magnitude slower than test runtime (it is not completely constant, as more files still mean more to statically analyse), the number is only bound to get better the larger the codebase is.

And these savings apply to arguably the worst possible SUT (System Under Test):

- It is a **small application**, so it is hard to get the savings of skipping tests for a vast number of components, as would be the case for large codebases.
- It is a **CLI script**, so it has no database, no external APIs to call, and minimal slow I/O tests. Those are the tests you want to skip the most, and they are barely present here.

## Conclusion

Code dependency analysis is a very useful tool for deciding what to test. It is not a silver bullet, but it can help you reduce the number of tests you run and the time it takes to run them. It can also help you decide what tests to run in your CI pipeline. It is not a replacement for a good test suite, but it can help you make your test suite more efficient.

## References

- [deptrac](https://qossmic.github.io/deptrac/)
- [deptracpy](https://patrickkusebauch.github.io/deptracpy/)

See you on [Day 16](day16.md).