The Missing Metric in Software Testing: Spec Coverage

By Ben Houston, 2025-04-27

For decades, we’ve measured the quality of a test suite by how much code it touches. If a line of code runs during a test, we say it’s “covered.”

But as we shift toward intent-based programming -- where code is generated from declarative specifications -- this model breaks down. Why?

Because we no longer care how much of the code was exercised. We care how much of the intent was verified.

Enter: Spec Coverage.

What Is Spec Coverage?

Spec Coverage measures the percentage of declared application intent that has been verified by tests.

It’s not about lines of code. It’s about purpose.

# Example intent spec
feature: user-authentication
intent:
  - Users must log in with email/password
  - Passwords must meet complexity rules
  - Sessions expire after 24 hours
  - All endpoints require authentication

A test suite has full spec coverage if it verifies each of these declared behaviors -- regardless of how the underlying code is structured.
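To verify "each of these declared behaviors", every intent item needs a stable way to be referenced. The sketch below assumes a feature-name-plus-index scheme, matching references such as user-authentication/0 used in the test plan later in this post. The FeatureSpec and SpecItem types and the listSpecItems helper are hypothetical names for illustration, not part of any existing tool.

```typescript
// Hypothetical in-memory shape of one feature in an intent spec.
interface FeatureSpec {
  feature: string;
  intent: string[];
}

// An intent item paired with a stable reference such as "user-authentication/0".
interface SpecItem {
  ref: string;
  text: string;
}

// Assumed addressing scheme: "<feature>/<zero-based index of the intent item>".
function listSpecItems(specs: FeatureSpec[]): SpecItem[] {
  const items: SpecItem[] = [];
  for (const spec of specs) {
    spec.intent.forEach((text, index) => {
      items.push({ ref: `${spec.feature}/${index}`, text });
    });
  }
  return items;
}

const authSpec: FeatureSpec = {
  feature: 'user-authentication',
  intent: [
    'Users must log in with email/password',
    'Passwords must meet complexity rules',
    'Sessions expire after 24 hours',
    'All endpoints require authentication',
  ],
};

// Four items, referenced as user-authentication/0 through user-authentication/3.
const specItems = listSpecItems([authSpec]);
```

Index-based references are simple but shift when items are reordered. Giving each intent item an explicit id would likely be more durable, at the cost of a slightly noisier spec.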

Toward a Two-Phase Testing Process

Intent-based programming already separates specification from code generation. We should do the same for tests.

Here’s the improved workflow:

Phase 1: Generate a Declarative Test Plan

From the application spec, we generate a test plan in YAML. Each test case references an intent item through a specRef (the feature name plus the item's zero-based position in the intent list) and includes a human-readable description:

testPlan:
  - id: auth-001
    specRef: user-authentication/0
    description: 'Verify users can log in with valid email and password'
  - id: auth-002
    specRef: user-authentication/1
    description: 'Reject passwords that fail complexity rules'
  - id: auth-003
    specRef: user-authentication/2
    description: 'Ensure sessions expire after 24 hours'

These test plans are durable, portable, and reviewable -- even before code exists. They’re the source of testing truth.
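Because the plan is plain data, it can be checked against the spec before any test code exists. The following is a minimal sketch of such a review, assuming the plan has already been parsed into objects. The reviewTestPlan function and the PlanReview shape are hypothetical. The sample data mirrors the spec and plan shown in this post.

```typescript
// Only the plan fields this check needs; descriptions are omitted for brevity.
interface TestPlanItem {
  id: string;
  specRef: string;
}

interface PlanReview {
  danglingIds: string[];       // plan items whose specRef matches no spec item
  unplannedSpecRefs: string[]; // spec items that no plan item references
  duplicateIds: string[];      // plan ids that appear more than once
}

function reviewTestPlan(specRefs: string[], plan: TestPlanItem[]): PlanReview {
  const known = new Set(specRefs);
  const planned = new Set<string>();
  const seenIds = new Set<string>();
  const danglingIds: string[] = [];
  const duplicateIds: string[] = [];

  for (const item of plan) {
    if (seenIds.has(item.id)) duplicateIds.push(item.id);
    seenIds.add(item.id);
    if (known.has(item.specRef)) planned.add(item.specRef);
    else danglingIds.push(item.id);
  }

  const unplannedSpecRefs = specRefs.filter((ref) => !planned.has(ref));
  return { danglingIds, unplannedSpecRefs, duplicateIds };
}

const authSpecRefs = [
  'user-authentication/0',
  'user-authentication/1',
  'user-authentication/2',
  'user-authentication/3',
];

const authPlan: TestPlanItem[] = [
  { id: 'auth-001', specRef: 'user-authentication/0' },
  { id: 'auth-002', specRef: 'user-authentication/1' },
  { id: 'auth-003', specRef: 'user-authentication/2' },
];

// The fourth intent item ("All endpoints require authentication") has no plan entry yet.
const planReview = reviewTestPlan(authSpecRefs, authPlan);
```

A reviewer would see user-authentication/3 reported as unplanned, which is the kind of gap this phase is meant to surface early.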

Phase 2: Generate Test Code from the Plan

The second step generates implementation code for each test case, ensuring that the test suite maps back to explicit intent:

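// test/auth-001.test.ts
// linked to spec: user-authentication/0
// Assumes a Jest-style runner that provides `it` and `expect` as globals.
import { login } from '../src/auth'; // hypothetical module path for the login helper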
it('auth-001: users can log in with valid email and password', async () => {
  const res = await login({
    email: 'test@example.com',
    password: 'StrongPass123!'
  });
  expect(res.status).toBe(200);
});

You now have a full trace:
Intent → Test Plan → Test Code → System Verification
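The trace can also be walked backwards, from a test result to the intent it verifies. The sketch below assumes test titles start with the plan id followed by a colon, as in the auth-001 example, and that a reporter yields simple title/pass records. TestResult, planIdFromTitle, and verifiedSpecRefs are hypothetical names.

```typescript
// Hypothetical result record, as a test runner reporter might emit it.
interface TestResult {
  title: string;
  passed: boolean;
}

// Extract the plan id from a title such as "auth-001: users can log in ...".
function planIdFromTitle(title: string): string | null {
  const match = /^([a-z][a-z0-9]*-\d+):/.exec(title);
  return match ? match[1] : null;
}

// Test result -> plan id -> spec reference. Only passing tests count as verification.
function verifiedSpecRefs(results: TestResult[], planToSpec: Map<string, string>): string[] {
  const verified = new Set<string>();
  for (const result of results) {
    if (!result.passed) continue;
    const id = planIdFromTitle(result.title);
    const ref = id === null ? undefined : planToSpec.get(id);
    if (ref !== undefined) verified.add(ref);
  }
  return Array.from(verified);
}

const planToSpec = new Map<string, string>([
  ['auth-001', 'user-authentication/0'],
  ['auth-002', 'user-authentication/1'],
]);

const verifiedRefs = verifiedSpecRefs(
  [
    { title: 'auth-001: users can log in with valid email and password', passed: true },
    { title: 'auth-002: reject passwords that fail complexity rules', passed: false },
  ],
  planToSpec
);
```

Encoding the id in the title is only one option. Test metadata or file naming could carry the same link, depending on what the test runner exposes.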

Benefits of the Two-Stage Model

Declarative Traceability: Each test links back to its originating spec item, creating a verifiable audit trail.
Human-in-the-Loop Planning: Test plans are editable before code is generated, allowing engineers, QA, and even product managers to verify correctness.
Easy Regeneration: Changing the spec regenerates only the affected test plans and implementations. No brittle coupling to source code.
LLM-Friendly: Language models can more easily generate test code when given clear, structured test plans as intermediate artifacts.
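The regeneration benefit can be made concrete with a small diff over spec items. This is a sketch under assumptions: the spec is flattened into a map from reference to intent text, and a plan item is considered affected when the text behind its specRef changed or disappeared. The function names are hypothetical, and the 12-hour edit is an invented example change.

```typescript
type SpecIndex = Record<string, string>; // spec reference -> intent text

// References whose intent text was edited, removed, or newly added.
function changedSpecRefs(before: SpecIndex, after: SpecIndex): string[] {
  const changed: string[] = [];
  for (const ref of Object.keys(before)) {
    if (before[ref] !== after[ref]) changed.push(ref); // edited or removed
  }
  for (const ref of Object.keys(after)) {
    if (!(ref in before)) changed.push(ref); // newly added
  }
  return changed;
}

// Existing plan items that point at a changed spec item and need regeneration.
function planIdsToRegenerate(
  plan: { id: string; specRef: string }[],
  changed: string[]
): string[] {
  const changedSet = new Set(changed);
  return plan.filter((item) => changedSet.has(item.specRef)).map((item) => item.id);
}

const before: SpecIndex = {
  'user-authentication/0': 'Users must log in with email/password',
  'user-authentication/2': 'Sessions expire after 24 hours',
};

// Invented edit for illustration: the session lifetime is shortened.
const after: SpecIndex = {
  'user-authentication/0': 'Users must log in with email/password',
  'user-authentication/2': 'Sessions expire after 12 hours',
};

const stalePlanIds = planIdsToRegenerate(
  [
    { id: 'auth-001', specRef: 'user-authentication/0' },
    { id: 'auth-003', specRef: 'user-authentication/2' },
  ],
  changedSpecRefs(before, after)
);
```

Here only auth-003 is flagged, so auth-001 and its test code stay untouched. Newly added spec items have no plan entries yet, so they would surface as changed references that still need plan items generated.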

Rethinking Coverage

Under this model, we can report Spec Coverage like this:

specCoverage:
  totalSpecItems: 12
  totalTestPlanItems: 10
  totalTested: 9
  gaps:
    - specItem: user-authentication/1
      reason: no test yet

Forget whether a line of code was hit. Ask instead: Was each intent validated?
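A report of this shape could be computed from three inputs: the spec references, the test plan, and the ids of plan items whose tests passed. The sketch below is one possible reading, not a fixed definition. It assumes totalTested counts spec items with at least one passing test, and it adds a percent field that the report above does not have. All names are hypothetical.

```typescript
interface PlanItem {
  id: string;
  specRef: string;
}

interface Gap {
  specItem: string;
  reason: string;
}

interface SpecCoverageReport {
  totalSpecItems: number;
  totalTestPlanItems: number;
  totalTested: number; // assumption: spec items with at least one passing test
  percent: number;     // assumption: totalTested / totalSpecItems * 100
  gaps: Gap[];
}

function computeSpecCoverage(
  specRefs: string[],
  plan: PlanItem[],
  passingPlanIds: string[]
): SpecCoverageReport {
  const passing = new Set(passingPlanIds);
  const gaps: Gap[] = [];
  let totalTested = 0;

  for (const ref of specRefs) {
    const planned = plan.filter((item) => item.specRef === ref);
    if (planned.length === 0) {
      gaps.push({ specItem: ref, reason: 'no test plan item' });
    } else if (!planned.some((item) => passing.has(item.id))) {
      gaps.push({ specItem: ref, reason: 'no passing test' });
    } else {
      totalTested += 1;
    }
  }

  // An empty spec is reported as 0% here rather than 100%; either convention is defensible.
  const percent = specRefs.length === 0 ? 0 : (totalTested / specRefs.length) * 100;
  return {
    totalSpecItems: specRefs.length,
    totalTestPlanItems: plan.length,
    totalTested,
    percent,
    gaps,
  };
}

const report = computeSpecCoverage(
  ['user-authentication/0', 'user-authentication/1', 'user-authentication/2', 'user-authentication/3'],
  [
    { id: 'auth-001', specRef: 'user-authentication/0' },
    { id: 'auth-002', specRef: 'user-authentication/1' },
    { id: 'auth-003', specRef: 'user-authentication/2' },
  ],
  ['auth-001', 'auth-002']
);
```

With two of four intent items verified, this sample yields 50% spec coverage and two gaps: one item with a plan entry but no passing test, and one item with no plan entry at all.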

This is especially critical in AI-generated systems where:

Implementation is a black box
Code can be re-rendered anytime
Intent must be the invariant

Why Code Coverage Becomes a Liability

Relying on line-level code coverage in this new world is like checking whether your compiler touched every register. It’s irrelevant.

What you really want to know is:

Did we test everything we said this system is supposed to do?

That’s spec coverage.

The Future of Test Suites

This two-phase model transforms test suites into spec executors:

Specs define what the system should do.
Test plans enumerate the validations.
Test code operationalizes each test.
Spec coverage becomes the go-to metric.
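If spec coverage is the go-to metric, it can gate a build the way a code coverage threshold does today. A minimal sketch, assuming a summary with the two counts from the report format shown earlier; the function names and the threshold values are hypothetical.

```typescript
interface CoverageSummary {
  totalSpecItems: number;
  totalTested: number;
}

// Percentage of declared spec items that have been verified by tests.
function specCoveragePercent(summary: CoverageSummary): number {
  if (summary.totalSpecItems === 0) return 0;
  return (summary.totalTested / summary.totalSpecItems) * 100;
}

// True when coverage reaches the required minimum percentage.
function meetsSpecCoverage(summary: CoverageSummary, minPercent: number): boolean {
  return specCoveragePercent(summary) >= minPercent;
}

// Counts taken from the example report: 9 of 12 spec items tested.
const exampleSummary: CoverageSummary = { totalSpecItems: 12, totalTested: 9 };
const passesAt80 = meetsSpecCoverage(exampleSummary, 80);
const passesAt75 = meetsSpecCoverage(exampleSummary, 75);
```

With the example counts, coverage is 75%, so a threshold of 80 fails the gate while a threshold of 75 passes it.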

We’re moving from “did the code run?” to “did the system fulfill its intent?”

Final Thoughts

Spec-first code generation changes how we write applications. Spec-first test generation will change how we verify them.

By separating test planning from test implementation, and anchoring everything in declarative specs, we unlock a more robust, traceable, and efficient way to test software.

Spec Coverage isn’t just a better metric -- it’s a better model.