iTestBDD

Test Selection in Monorepos: Affected Files Only

Monorepos have become a popular way to consolidate multiple projects into a single repository. With companies like Google and Facebook championing this approach, it is clear that monorepos offer real advantages for managing code at scale. One challenge, however, is keeping testing fast, particularly in CI/CD pipelines: running every test for every change is often impractical and inefficient. This article addresses test selection in monorepos, specifically running only the tests affected by a change.

By the end of this article, you will be equipped with techniques to implement efficient test selection in your monorepo-based projects. You will understand how to configure your CI/CD pipelines to run tests associated with only the affected files, thus saving time and computational resources.

The need for this approach has become more pressing as monorepos grow in size and complexity. Recent advancements in tooling, such as improvements in GitHub Actions and the introduction of AI tools for code analysis, have provided new opportunities to tackle this challenge effectively.

What This Actually Is

Test selection in monorepos, focusing on affected files only, is a strategy that speeds up testing by executing only the tests relevant to a change. It uses dependency analysis and change detection to identify which parts of the codebase are impacted by a commit or pull request.

In a modern test architecture, this strategy fits as a crucial component of the CI/CD pipeline. By limiting test execution to affected areas, it significantly reduces the time and resources required to validate changes, speeding up the feedback loop for developers.

Tools like Bazel, Lerna, and changesets, in combination with CI/CD orchestrators such as Jenkins or GitHub Actions, can be configured to detect impacted tests. They rely, to varying degrees, on the dependency graph of the codebase to determine a small set of tests that still validates the change, balancing efficiency and reliability.
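To make the dependency-graph idea concrete, here is a minimal sketch of how a set of affected packages could be derived. It is not how any of the tools above are implemented; the graph shape, the package names, and the affectedPackages function are all hypothetical and exist only for illustration.

```typescript
// Hypothetical graph: each package maps to the workspace packages it depends on.
type DependencyGraph = Record<string, string[]>;

// Invert "depends on" edges so we can walk from a changed package to its dependents.
function invertGraph(graph: DependencyGraph): Map<string, string[]> {
  const dependents = new Map<string, string[]>();
  for (const pkg of Object.keys(graph)) {
    if (!dependents.has(pkg)) dependents.set(pkg, []);
    for (const dep of graph[pkg]) {
      const list = dependents.get(dep) ?? [];
      list.push(pkg);
      dependents.set(dep, list);
    }
  }
  return dependents;
}

// Breadth-first walk: a package is affected if it changed,
// or if it (transitively) depends on a package that changed.
function affectedPackages(graph: DependencyGraph, changed: string[]): string[] {
  const dependents = invertGraph(graph);
  const affected = new Set<string>(changed);
  const queue = changed.slice();
  while (queue.length > 0) {
    const current = queue.shift() as string;
    for (const dependent of dependents.get(current) ?? []) {
      if (!affected.has(dependent)) {
        affected.add(dependent);
        queue.push(dependent);
      }
    }
  }
  return Array.from(affected).sort();
}

// Illustrative package names only.
const exampleGraph: DependencyGraph = {
  ui: ["utils"],
  api: ["utils", "db"],
  db: [],
  utils: [],
};

console.log(affectedPackages(exampleGraph, ["db"]));    // [ 'api', 'db' ]
console.log(affectedPackages(exampleGraph, ["utils"])); // [ 'api', 'ui', 'utils' ]
```

A change to a leaf package such as db touches only its dependents, while a change to a widely shared package such as utils fans out to most of the repository, which is why savings vary so much from commit to commit.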

How To Implement It

Implementing test selection in monorepos requires a combination of tool configuration and strategic test design. Let's walk through the process using a TypeScript project managed with Lerna and tested with Jest.

First, ensure that your project is structured to support dependency analysis. Lerna manages the packages in the monorepo and can run scripts across them. Install Lerna as a dev dependency and add a test script to the root package.json:

{
  "devDependencies": {
    "lerna": "^4.0.0"
  },
  "scripts": {
    "test": "lerna run test --since origin/main"
  }
}

This setup has Lerna run the test script only in packages that have changed since the given git ref (here origin/main; substitute your own base branch), along with the packages that depend on them. The --since flag is critical here, as it uses the git history to determine which packages are affected. If no ref is passed, it defaults to the most recent tag rather than the last commit.
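As a rough illustration of what "which packages have changed" means, the sketch below maps changed file paths to the package directory that owns them. This is an assumption-laden simplification, not Lerna's actual algorithm; the directory names and the selectPackages helper are hypothetical.

```typescript
// Hypothetical package directories, relative to the repository root.
const packageDirs = ["packages/ui", "packages/utils", "packages/utils-extra"];

// Return the package directory that contains `file`, preferring the longest match.
function owningPackage(file: string, dirs: string[]): string | undefined {
  let best: string | undefined;
  for (const dir of dirs) {
    // The trailing slash keeps "packages/utils" from matching "packages/utils-extra/...".
    const prefix = dir + "/";
    if (file.startsWith(prefix) && (best === undefined || dir.length > best.length)) {
      best = dir;
    }
  }
  return best;
}

interface Selection {
  packages: string[]; // packages containing at least one changed file
  unowned: string[];  // changed files outside every package (e.g. root config)
}

function selectPackages(changedFiles: string[], dirs: string[]): Selection {
  const packages = new Set<string>();
  const unowned: string[] = [];
  for (const file of changedFiles) {
    const owner = owningPackage(file, dirs);
    if (owner === undefined) unowned.push(file);
    else packages.add(owner);
  }
  return { packages: Array.from(packages).sort(), unowned };
}

const selection = selectPackages(
  ["packages/utils/src/date.ts", "packages/utils-extra/src/x.ts", "package.json"],
  packageDirs
);
console.log(selection);
```

Files that fall outside every package, such as a root package.json or shared tooling config, are reported separately here because a cautious pipeline may want to treat them as a reason to run the full suite.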

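Next, configure Jest inside each package. Jest can select tests based on changed files by integrating with Git:

{
  "scripts": {
    "test": "jest --changedSince=origin/main"
  }
}

The --changedSince flag makes Jest run only the tests related to files that changed relative to the given ref. The --findRelatedTests flag does similar work but does not consult Git by itself; it expects an explicit list of source file paths (for example, jest --findRelatedTests src/a.ts src/b.ts). Either way, this reduces run time and focuses the testing effort where it is most needed.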
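When the list of changed files comes from your own script rather than from Jest's Git integration, the arguments for --findRelatedTests have to be assembled explicitly, since that flag takes source file paths. The following is a minimal sketch; the jestArgsFor helper and the file-extension filter are assumptions for illustration, while --findRelatedTests and --passWithNoTests are existing Jest CLI options.

```typescript
// Build Jest CLI arguments for a list of changed files.
// Returns null when no source files changed, so the caller can skip Jest entirely.
function jestArgsFor(changedFiles: string[]): string[] | null {
  // Assumption: only these extensions are relevant to the Jest configuration.
  const sources = changedFiles.filter((f) => /\.(ts|tsx|js|jsx)$/.test(f));
  if (sources.length === 0) return null;
  // --passWithNoTests keeps the run green when the files have no related tests.
  return ["--findRelatedTests", ...sources, "--passWithNoTests"];
}

console.log(jestArgsFor(["src/a.ts", "README.md", "src/b.test.ts"]));
// [ '--findRelatedTests', 'src/a.ts', 'src/b.test.ts', '--passWithNoTests' ]
console.log(jestArgsFor(["docs/intro.md"])); // null
```

Returning null instead of an empty argument list matters: invoking Jest with no file filter at all would run the whole suite, which is the opposite of what a documentation-only change should trigger.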

Finally, integrate this setup into your CI/CD pipeline. If you're using GitHub Actions, a workflow might look like this:

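name: CI
on: [push, pull_request]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
      with:
        # Fetch full history; the default shallow clone gives git-based
        # change detection (e.g. lerna --since) nothing to compare against.
        fetch-depth: 0
    - name: Install dependencies
      run: npm install
    - name: Run tests
      run: npm test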

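By implementing these steps, test execution time can drop substantially. As a rough illustration, a medium-sized monorepo might go from around 20 minutes to about 5, though the actual saving depends on the scope of each change and on how widely the changed packages are depended upon.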

Common Pitfalls

One common pitfall is failing to accurately identify dependencies between packages in a monorepo. This can lead to either over-testing or missing critical tests. Ensure that your dependency graph is correctly maintained and updated as the project evolves to avoid this issue.
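One way to catch a drifting dependency graph is to compare what each package imports with what it declares. The sketch below assumes the import specifiers have already been extracted from source by some other means; the PackageInfo shape, the package names, and the undeclaredWorkspaceDeps function are hypothetical.

```typescript
interface PackageInfo {
  name: string;
  declaredDeps: string[];    // names listed in dependencies/devDependencies
  importedModules: string[]; // module specifiers found in source (assumed pre-extracted)
}

// Report workspace packages that are imported but not declared as dependencies.
// Such hidden edges are invisible to graph-based test selection.
function undeclaredWorkspaceDeps(packages: PackageInfo[]): Record<string, string[]> {
  const workspaceNames = new Set(packages.map((p) => p.name));
  const result: Record<string, string[]> = {};
  for (const pkg of packages) {
    const declared = new Set(pkg.declaredDeps);
    const missing = pkg.importedModules.filter(
      (m) => workspaceNames.has(m) && m !== pkg.name && !declared.has(m)
    );
    if (missing.length > 0) {
      result[pkg.name] = Array.from(new Set(missing)).sort();
    }
  }
  return result;
}

// Illustrative data: "@acme/api" imports "@acme/db" without declaring it.
const workspace: PackageInfo[] = [
  { name: "@acme/utils", declaredDeps: [], importedModules: ["lodash"] },
  { name: "@acme/db", declaredDeps: ["@acme/utils"], importedModules: ["@acme/utils"] },
  { name: "@acme/api", declaredDeps: ["@acme/utils"], importedModules: ["@acme/utils", "@acme/db", "@acme/db"] },
];

console.log(undeclaredWorkspaceDeps(workspace)); // { '@acme/api': [ '@acme/db' ] }
```

In this example a change to @acme/db would not trigger the tests of @acme/api, because the manifest does not record the edge. Running a check like this in CI is one possible way to keep the graph honest.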

Another mistake is not configuring your CI/CD pipeline to handle edge cases such as new, deleted, or renamed files. A deleted file can no longer be passed to a test runner as a path, and a new file may not yet appear in any dependency map, so either can break the analysis if not properly managed. Ensure your scripts account for these changes rather than assuming every changed path still exists.
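A script can separate these cases by reading the output of git diff --name-status, which prefixes each path with a status letter (A, M, D, or R followed by a similarity score for renames) and separates fields with tabs. The parser below is a minimal sketch; the ChangeSet shape and the parseNameStatus function are illustrative names, not part of any tool.

```typescript
interface ChangeSet {
  existing: string[]; // paths present after the change (added, modified, rename/copy targets)
  deleted: string[];  // paths that no longer exist (deleted files, rename sources)
}

// Parse the tab-separated output of `git diff --name-status <base>`.
function parseNameStatus(output: string): ChangeSet {
  const existing: string[] = [];
  const deleted: string[] = [];
  for (const line of output.split("\n")) {
    if (line.trim() === "") continue;
    const fields = line.split("\t");
    const kind = fields[0].charAt(0);
    if (kind === "D") {
      deleted.push(fields[1]);
    } else if (kind === "R") {
      // Rename lines look like: R100<TAB>old/path<TAB>new/path
      deleted.push(fields[1]);
      existing.push(fields[2]);
    } else if (kind === "C") {
      // Copy: the source is untouched, only the target is new.
      existing.push(fields[2]);
    } else {
      existing.push(fields[1]);
    }
  }
  return { existing, deleted };
}

const sampleDiff = [
  "M\tpackages/a/src/x.ts",
  "A\tpackages/a/src/new.ts",
  "D\tpackages/b/src/old.ts",
  "R100\tpackages/b/src/p.ts\tpackages/b/src/q.ts",
].join("\n");

console.log(parseNameStatus(sampleDiff));
```

Only the existing paths are safe to hand to a test runner as file arguments. The deleted paths are still useful for working out which packages were touched, since removing a file can break its dependents.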

Lastly, overlooking the integration with version control can lead to inaccurate test selection. Ensure that your tools are properly configured to work with git or whichever VCS is in use. This includes comparing against the correct base branch and making sure the CI checkout contains enough history for that comparison; a shallow clone, for example, may not include the base commit at all.
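Choosing the ref to compare against is one place where this goes wrong. The sketch below picks a base ref from GitHub Actions environment variables: GITHUB_BASE_REF is set for pull request events, and GITHUB_REF_NAME holds the short name of the branch or tag that triggered the run. The baseRef function, the "main" default, and the HEAD~1 fallback are assumptions for illustration; a push containing several commits would need a more careful comparison.

```typescript
type Env = Record<string, string | undefined>;

// Decide which git ref changes should be measured against.
function baseRef(env: Env, defaultBranch: string = "main"): string {
  // Pull request: compare with the branch the PR targets.
  const prBase = env["GITHUB_BASE_REF"];
  if (prBase) return "origin/" + prBase;

  // Push to the default branch: comparing the branch with itself finds nothing,
  // so fall back to the previous commit (a simplification, see above).
  if (env["GITHUB_REF_NAME"] === defaultBranch) return "HEAD~1";

  // Push to any other branch: compare with the default branch.
  return "origin/" + defaultBranch;
}

console.log(baseRef({ GITHUB_BASE_REF: "develop" }));                    // origin/develop
console.log(baseRef({ GITHUB_BASE_REF: "", GITHUB_REF_NAME: "main" })); // HEAD~1
console.log(baseRef({ GITHUB_REF_NAME: "feature/x" }));                  // origin/main
```

In a Node script the function would typically be called with process.env, and the result passed on to a flag that accepts a ref. Taking the environment as a parameter keeps the logic easy to test outside CI.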

What Most Teams Get Wrong

Many teams mistakenly believe that achieving 100% test coverage is the ultimate goal. In reality, the focus should be on meaningful coverage that targets risk areas and critical paths in the codebase. Use test selection to prioritize these areas instead of blindly aiming for full coverage.

The test pyramid is often misunderstood as a rigid rule rather than a guideline. In a monorepo setup, the pyramid should be adapted to reflect the project's architecture and dependencies, allowing for flexibility in test distribution.

Finally, there's a misconception that manual QA can be entirely replaced by automated tests. While automation is crucial, manual testing still plays a vital role in exploratory testing and scenarios that are difficult to automate. Balance both strategies to ensure comprehensive test coverage.

Optimizing test selection for affected files in monorepos is a powerful way to improve CI/CD efficiency. Once this approach is in place, consider tracking how quickly flaky tests are detected as a next step. For further reading, see the Lerna and Jest documentation for additional optimization techniques.
