Test Selection in Monorepos: Affected Files Only

CI/CD & DevOps Testing 6 min read July 24, 2026

Monorepos have a compounding feedback problem. The codebase grows, the test suite grows, and at some threshold — usually around the 800-service mark or when CI wall time crosses 25 minutes — teams stop running the full suite on every PR. They either start skipping tests informally or they start skipping them formally with a label like skip-ci. Neither is a strategy. Affected-file test selection is.

The core problem is graph traversal: given a set of changed files, determine the minimal set of test targets that must run to preserve confidence. This is not a new idea — Bazel and Buck have done it with build graphs since the early 2010s — but the tooling for language-native test runners (Pytest, Cucumber-JVM 7, Playwright, Cypress 13) has matured enough that most teams can implement a credible version without adopting a full build system.

By the end of this article you'll know how to wire a change-detection layer into your CI pipeline, map file changes to test targets using dependency graphs, and avoid the failure modes that most often break this pattern in practice. The approach applies equally to GitHub Actions, Jenkins, and Argo Workflows.


Affected-Test Selection: The Dependency Graph Problem

Affected-test selection is the practice of computing, at CI trigger time, which test targets have a transitive dependency on any file in the current changeset — and running only those targets. The key word is transitive. A change to a shared utility module in libs/auth must surface tests in every service that imports it, not just the tests co-located with that module. Without transitive analysis, you get false confidence: green CI on a PR that silently broke three downstream consumers.
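To make the transitive requirement concrete, here is a minimal Python sketch. It assumes a hypothetical reverse-dependency graph in which each key is a module and each value lists the modules that import it directly; the module names are illustrative and not taken from a real repository.

```python
from collections import deque

# Hypothetical reverse-dependency graph: module -> modules that import it directly.
REVERSE_DEPS = {
    "libs/auth": ["services/payments", "services/orders"],
    "services/payments": ["services/checkout"],
    "services/orders": [],
    "services/checkout": [],
}

def affected_modules(changed, reverse_deps):
    """Return the changed modules plus everything that depends on them, transitively."""
    seen = set(changed)
    queue = deque(changed)
    while queue:
        module = queue.popleft()
        for dependent in reverse_deps.get(module, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen

if __name__ == "__main__":
    print(sorted(affected_modules(["libs/auth"], REVERSE_DEPS)))
```

A change to libs/auth reaches services/checkout even though checkout never imports auth directly, which is exactly the case a co-located-tests-only rule would miss.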

In a modern test architecture this sits between the VCS trigger and the test executor. The CI runner runs git diff --name-only origin/main...HEAD (the three-dot form compares against the merge base, so unrelated commits that landed on main are not counted as changes), feeds that file list into a dependency resolver, and emits a filtered list of test targets. The resolver can be as simple as a hand-maintained YAML map or as sophisticated as a language-aware import graph built with tools like Nx (for JS/TS monorepos), Pants 2.x, or a custom Python script using ast.parse to walk imports. The output is the same either way: a list of Pytest node IDs, Cucumber tags, or Playwright project names to pass to the test runner.
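As a rough sketch of the first stage, the change set can be collected from Python with the standard library alone. The function names here are illustrative, and the sketch assumes the base branch has already been fetched, which shallow CI clones often have not done.

```python
import subprocess

def parse_name_only(diff_output):
    """Turn `git diff --name-only` output into a clean list of paths."""
    return [line.strip() for line in diff_output.splitlines() if line.strip()]

def changed_files(base="origin/main", head="HEAD"):
    """List files changed on `head` since it diverged from `base`."""
    # The three-dot form diffs against the merge base, so commits that landed
    # on the base branch after the PR branched are not counted as changes.
    result = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...{head}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_name_only(result.stdout)
```

Calling changed_files() inside a checkout returns the paths to hand to the resolver; check=True makes a missing base ref fail the step loudly instead of yielding an empty change set.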

Building the Change-to-Test Pipeline in Practice

Start with the change set. In GitHub Actions, tj-actions/changed-files gives you a structured output you can consume directly. In Jenkins or Argo, a short shell step suffices:

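# In a GitHub Actions step
- name: Get changed files
  id: changes
  uses: tj-actions/changed-files@v44
  with:
    # "separator" sets the delimiter of the output lists;
    # "files_separator" only applies to the "files" input filter.
    separator: ","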

From there, feed the file list into a dependency resolver. For a Python monorepo, a script that walks ast.parse import trees and emits Pytest node IDs is around 80 lines. For a TypeScript monorepo using Nx, the graph is already computed — you just call it:

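# Python: resolve affected pytest targets
# affected_tests.py
import json
import pathlib
import sys

# Reverse dependency map: source file -> Pytest node IDs that depend on it.
# Assumed to be written by a pre-build graph walk; the file name is illustrative.
MAP_PATH = pathlib.Path("import_map.json")
IMPORT_MAP = json.loads(MAP_PATH.read_text()) if MAP_PATH.exists() else {}

def resolve(changed_files):
    targets = set()
    for f in changed_files:
        targets.update(IMPORT_MAP.get(f, []))
    return sorted(targets)  # sorted so the output is deterministic

if __name__ == "__main__":
    # Expects one comma-separated argument; an empty string means no changes.
    raw = sys.argv[1] if len(sys.argv) > 1 else ""
    changed = [f for f in raw.split(",") if f]
    print(json.dumps(resolve(changed)))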

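# TypeScript / Nx: print affected projects
# Older Nx releases (print-affected was deprecated in Nx 16 and later removed):
npx nx print-affected --type=app --select=projects \
  --base=origin/main --head=HEAD
# Newer Nx releases:
npx nx show projects --affected --base=origin/main --head=HEAD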
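The map that affected_tests.py reads has to come from somewhere. One way to generate it for a Python monorepo is sketched below, under stated assumptions: test files are named test_*.py, imports are absolute and resolve from the repository root, and dynamic imports are ignored. The function names are hypothetical, and a real repository would likely need extra handling for relative imports and namespace packages.

```python
import ast
import pathlib

def imported_modules(path):
    """Return dotted names a Python file may import (absolute imports only)."""
    tree = ast.parse(path.read_text(), filename=str(path))
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names.add(node.module)
            # `from pkg import mod` may refer to the module pkg.mod.
            names.update(f"{node.module}.{alias.name}" for alias in node.names)
    return names

def module_name(path, root):
    """Map a file path to a dotted module name, e.g. libs/auth.py -> libs.auth."""
    rel = path.relative_to(root).with_suffix("")
    parts = rel.parts[:-1] if rel.name == "__init__" else rel.parts
    return ".".join(parts)

def build_import_map(root):
    """Build {source file: [test files that import it, directly or transitively]}."""
    root = pathlib.Path(root)
    files = {module_name(p, root): p for p in root.rglob("*.py")}
    # Keep only edges between files in the repo; third-party imports drop out.
    deps = {
        p: {files[m] for m in imported_modules(p) if m in files}
        for p in files.values()
    }
    import_map = {}
    for test in (p for p in files.values() if p.name.startswith("test_")):
        stack, seen = [test], {test}
        while stack:  # depth-first walk over everything the test reaches
            for dep in deps[stack.pop()]:
                if dep not in seen:
                    seen.add(dep)
                    stack.append(dep)
        for src in seen:
            key = src.relative_to(root).as_posix()
            import_map.setdefault(key, []).append(test.relative_to(root).as_posix())
    return {key: sorted(tests) for key, tests in import_map.items()}
```

The values are test file paths, which Pytest accepts as file-level node IDs. Each test file also maps to itself, so editing a test reruns that test.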

Wire the resolver output into your test runner. For Pytest, pass the node IDs directly. For Cucumber-JVM 7, use tag expressions derived from a service-to-tag map. For Playwright, pass --project flags. A real example: a platform team at a mid-size fintech mapped 1,400 Pytest tests across 34 services. After implementing this pattern, the median PR CI run dropped from 18 minutes to 4 minutes. The key was including shared fixture files in the dependency graph: a conftest.py change triggers every test in its directory and below, because Pytest applies those fixtures to the whole subtree.

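# GitHub Actions: run only affected tests
- name: Run affected tests
  env:
    CHANGED: ${{ steps.changes.outputs.all_changed_files }}
  run: |
    TARGETS=$(python affected_tests.py "$CHANGED")
    NODE_IDS=$(echo "$TARGETS" | jq -r '.[]')
    # Bare pytest would run the whole suite, so skip when nothing is affected.
    if [ -z "$NODE_IDS" ]; then echo "No affected tests."; exit 0; fi
    pytest $NODE_IDS --tb=short -q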
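For Playwright, the same resolver output turns into repeated --project flags. The helper below is a small sketch of that translation; the function name and the project names are hypothetical, and it returns None for an empty list so the caller can skip the step rather than run every project.

```python
import shlex

def playwright_command(projects):
    """Build an `npx playwright test` invocation limited to the given projects."""
    if not projects:
        return None  # nothing affected; the caller should skip the step
    cmd = ["npx", "playwright", "test"]
    # De-duplicate and sort so the command is stable between runs.
    cmd += [f"--project={name}" for name in sorted(set(projects))]
    return cmd

if __name__ == "__main__":
    command = playwright_command(["web", "admin", "web"])
    print(shlex.join(command))
```

Returning a list rather than a string keeps the command safe to pass to subprocess.run without shell quoting concerns.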

For BDD suites using Behave or Cucumber-JVM 7, tag your feature files by owning service and map changed packages to tags. A change in services/payments/src/processor.py should trigger @payments scenarios, including any cross-cutting @contract scenarios that exercise the payments API boundary. Keep the tag map in a versioned YAML file — it becomes the explicit contract between your repo structure and your test suite.

# test_map.yaml
services/payments:
  tags: ["@payments", "@contract-payments"]
libs/auth:
  tags: ["@auth", "@payments", "@orders", "@notifications"]
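A tag map like this can be turned into a runner filter with a few lines of Python. The sketch below inlines the map as a dict so it needs no YAML parser; in practice you would load test_map.yaml instead. The function names are illustrative, and matching is by directory prefix, which is an assumption about how the map keys are meant to be read.

```python
# Mirrors test_map.yaml; inlined so the sketch has no third-party dependency.
TEST_MAP = {
    "services/payments": ["@payments", "@contract-payments"],
    "libs/auth": ["@auth", "@payments", "@orders", "@notifications"],
}

def tags_for(changed_files, test_map):
    """Collect the tags of every mapped directory that contains a changed file."""
    tags = set()
    for path in changed_files:
        for prefix, prefix_tags in test_map.items():
            # Match the directory itself or anything beneath it, not lookalike names.
            if path == prefix or path.startswith(prefix + "/"):
                tags.update(prefix_tags)
    return sorted(tags)

def tag_expression(tags):
    """Join tags into a Cucumber-style tag expression such as '@auth or @orders'."""
    return " or ".join(tags)

if __name__ == "__main__":
    changed = ["services/payments/src/processor.py"]
    print(tag_expression(tags_for(changed, TEST_MAP)))
```

Cucumber-JVM accepts an expression in this form through its cucumber.filter.tags property. Behave's tag syntax differs between versions, so check which form your version expects before reusing the same string.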

Where Affected-Test Pipelines Break Down

The most common failure is an incomplete dependency graph. Teams map source files to tests but omit infrastructure-as-code, Dockerfile changes, and shared test fixtures. A change to a base Docker image or a shared conftest.py silently invalidates the entire graph — tests pass in CI against a stale image and fail in staging. The fix is to include non-source files explicitly: any change to docker/, infra/, or root-level config files should trigger a full suite run as a fallback, not a partial one.
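The fallback rule can be expressed as a small guard that runs before the resolver. This is a sketch under one loud assumption: every file at the repository root is treated as configuration, which is stricter than necessary but errs on the safe side. The function name is hypothetical.

```python
# Directories named in the text whose changes should bypass selection entirely.
FULL_SUITE_PREFIXES = ("docker/", "infra/")

def needs_full_suite(changed_files):
    """Return True if any change falls outside what the import graph can see."""
    for path in changed_files:
        if path.startswith(FULL_SUITE_PREFIXES):
            return True
        # Assumption: any root-level file (no directory component) is config.
        if "/" not in path:
            return True
    return False

if __name__ == "__main__":
    print(needs_full_suite(["docker/base.Dockerfile"]))
    print(needs_full_suite(["services/payments/src/processor.py"]))
```

When the guard returns True, the pipeline should ignore the resolver output and run everything.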

The second failure is graph staleness. The dependency map is built once and committed; then engineers refactor imports, rename modules, or extract shared libraries without updating the map. Six months later the map is wrong and nobody knows it because CI is green. Automate map regeneration as a pre-commit hook or a nightly CI job, and add a lint step that fails if the committed map diverges from the generated one. This is a tooling discipline problem, not a hard engineering problem — it just requires making the wrong state visible.
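The drift lint can be as small as a comparison between the committed map and a freshly generated one. The sketch below assumes both are JSON objects mapping a source path to a list of test targets; the file names passed to check are hypothetical.

```python
import json

def map_drift(committed, generated):
    """Return {key: (committed_targets, generated_targets)} for entries that differ."""
    drift = {}
    for key in sorted(set(committed) | set(generated)):
        old = sorted(committed.get(key, []))
        new = sorted(generated.get(key, []))
        if old != new:
            drift[key] = (old, new)
    return drift

def check(committed_path, generated_path):
    """Return a process exit code: 0 if the maps agree, 1 if they have drifted."""
    with open(committed_path) as f:
        committed = json.load(f)
    with open(generated_path) as f:
        generated = json.load(f)
    drift = map_drift(committed, generated)
    for key, (old, new) in drift.items():
        print(f"{key}: committed={old} generated={new}")
    return 1 if drift else 0
```

Calling sys.exit(check("import_map.json", "import_map.generated.json")) from a CI step fails the build whenever the committed map is stale, which makes the wrong state visible.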

Myths That Slow Teams Down on Monorepo Test Selection

"We need Bazel to do this properly." Bazel's hermetic build graph is the gold standard, but it carries significant adoption cost: BUILD file maintenance, remote cache infrastructure, and a steep learning curve. For most teams, a language-native import graph plus a YAML service map gets you, as a rough rule of thumb, 80–90% of the benefit at 10% of the operational overhead. Use Bazel when you have a dedicated build-infra team and hundreds of engineers committing daily. Use Nx or Pants 2.x when you want graph-aware selection without rewriting your entire build system. Use a hand-maintained YAML map when your monorepo has fewer than 20 services and changes slowly.

"Affected-only selection means we can skip the nightly full suite." It does not. Affected selection is a PR-time optimization, not a coverage guarantee. Integration drift, environment config changes, and flaky tests that only surface under load all require periodic full-suite runs. Run the full suite on merge to main and on a nightly schedule. Treat the affected-only run as a fast feedback signal, not as the final safety net. Teams that collapse these two jobs into one eventually ship a regression that affected-only selection correctly skipped — because the dependency graph didn't know about a runtime coupling that wasn't visible in the import tree.

Affected-test selection is fundamentally a graph problem bolted onto a CI problem. Get the dependency graph right first — automate its generation, lint for drift, and include non-source files in the fallback rules. Once the graph is trustworthy, the CI wiring is straightforward. The next thing worth measuring after you ship this is false-negative rate: how often does a green affected-only run precede a failure on main? That number tells you where your graph has gaps. OpenTelemetry spans on your test executor, aggregated in Grafana, make that measurement tractable.
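The false-negative rate itself is simple arithmetic once the two outcomes are recorded per merged PR. A minimal sketch, assuming each record is a pair of booleans (the affected-only run was green, the full run on main then failed); how those records are collected is left open.

```python
def false_negative_rate(runs):
    """Share of green affected-only runs that were followed by a failure on main.

    `runs` is an iterable of (affected_run_green, main_run_failed) booleans.
    """
    followed_by_failure = [main_failed for pr_green, main_failed in runs if pr_green]
    if not followed_by_failure:
        return 0.0  # no green affected-only runs, so nothing to measure
    return sum(followed_by_failure) / len(followed_by_failure)

if __name__ == "__main__":
    history = [(True, False), (True, True), (False, True), (True, False)]
    print(f"{false_negative_rate(history):.2%}")
```

Red affected-only runs are excluded from the denominator because they never claimed the change was safe.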

