TENET — Documentation

Build evals extend the RL improvement loop to greenfield building. Instead of optimizing an existing metric, agents build new modules from specs and iterate until every assertion passes.

Quick Start

The fastest way to create a build agent:

# Generate eval + TOML from a spec file
tenet build --spec knowledge/MY_SPEC.md --name my-feature

# Or from an inline description
tenet build --name auth-module \
  --files src/lib/auth.ts \
  --desc "Create auth module with login(), logout(), session management"

# List all build agents and their scores
tenet build --list

# Run it
tenet build --run my-feature

The Pattern

spec → eval assertions → agent TOML → `tenet build --run {name}` → Karpathy loop → PR

1. Write a spec describing what to build (or pass inline with --desc)
2. tenet build generates the eval script with decomposed assertions
3. tenet build generates the agent TOML config
4. tenet build --run starts Peter Parker; the agent iterates from 0% → 100%
5. PR created automatically when score hits 1.0

You can also do steps 1-3 manually for full control. tenet build just automates the proven pattern.
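The control flow behind the pattern can be pictured roughly as follows. This is a minimal sketch, not TENET's actual implementation: `runBuildLoop`, `EvalFn`, `AgentStep`, `LoopResult`, and `maxRounds` are hypothetical names introduced for illustration. It shows only the shape of the loop: score the workspace, let the agent make changes, re-score, and stop as soon as the score reaches 1.0.

```typescript
// Hypothetical types: an eval returns a score in [0, 1]; an agent step changes the workspace.
type EvalFn = () => Promise<number>;
type AgentStep = (round: number, lastScore: number) => Promise<void>;

interface LoopResult {
  rounds: number;      // rounds actually executed
  finalScore: number;  // last score returned by the eval
  readyForPr: boolean; // true only when the score reached 1.0
}

// Sketch of an iterate-until-pass loop (assumed control flow, not TENET internals).
async function runBuildLoop(
  evaluate: EvalFn,
  step: AgentStep,
  maxRounds = 10,
): Promise<LoopResult> {
  let score = await evaluate(); // a greenfield build typically starts at 0
  let rounds = 0;
  while (score < 1.0 && rounds < maxRounds) {
    rounds += 1;
    await step(rounds, score); // the agent creates or edits files
    score = await evaluate();  // re-score against the same assertions
  }
  return { rounds, finalScore: score, readyForPr: score >= 1.0 };
}

// Usage: a simulated agent that satisfies one of three checks per round.
async function demo(): Promise<LoopResult> {
  let done = 0;
  const total = 3;
  return runBuildLoop(
    async () => done / total,
    async () => {
      done += 1;
    },
  );
}
```

The round cap is there so that a stalled build ends with `readyForPr: false` instead of looping forever.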

Writing a Build Eval

A build eval is a TypeScript file that checks spec compliance:

// eval/build/storage-adapter.ts
import { existsSync } from "node:fs";

// fileContains() and tscPasses() are assumed to be project helpers; they are not Node built-ins.

export async function evaluate(): Promise<number> {
  // Each check is one assertion derived from the spec.
  const checks = [
    { name: "interface-exists", pass: existsSync("src/lib/storage/interface.ts") },
    { name: "has-read-method", pass: fileContains("src/lib/storage/interface.ts", "read(") },
    { name: "has-write-method", pass: fileContains("src/lib/storage/interface.ts", "write(") },
    { name: "local-impl", pass: existsSync("src/lib/storage/local.ts") },
    { name: "cloud-impl", pass: existsSync("src/lib/storage/cloud.ts") },
    { name: "compiles", pass: tscPasses() },
  ];

  // The score is the fraction of passing checks, from 0.0 to 1.0.
  return checks.filter((c) => c.pass).length / checks.length;
}
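The eval above calls `fileContains` and `tscPasses`, which the source does not define. One plausible way to write them is sketched below. Both bodies are assumptions, not TENET's own helpers. The `tscPasses` sketch further assumes that `tsc` is available through `npx` and that the project has a `tsconfig.json`.

```typescript
import { execSync } from "node:child_process";
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: true when the file exists and contains the given substring.
function fileContains(path: string, needle: string): boolean {
  if (!existsSync(path)) return false;
  return readFileSync(path, "utf8").includes(needle);
}

// Hypothetical helper: true when the TypeScript compiler exits without errors.
function tscPasses(): boolean {
  try {
    execSync("npx tsc --noEmit", { stdio: "ignore" });
    return true;
  } catch {
    return false; // non-zero exit: type errors, or tsc could not be run
  }
}

// Usage: check a throwaway file in a temp directory.
const dir = mkdtempSync(join(tmpdir(), "build-eval-"));
const sample = join(dir, "interface.ts");
writeFileSync(sample, "export interface Adapter { read(key: string): string }");
const hasRead = fileContains(sample, "read(");   // the substring is present
const hasWrite = fileContains(sample, "write("); // the substring is absent
```

Substring checks like these are cheap but shallow: they confirm that a name appears in a file, not that the method behaves correctly, which is why the example pairs them with a compile check.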

Agent TOML Config

[agent]
name = "build-storage-adapter"
scope = "build"
metric = "spec_compliance"
direction = "maximize"
time_budget_seconds = 600

[eval]
script = "eval/build/storage-adapter.ts"
data = "eval/fixtures/build-baseline.jsonl"

[task]
description = """
Create the TenetStorage adapter with interface, 
LocalStorage, and CloudStorage implementations.
"""

Key Insight

“Granularity of feedback determines speed of convergence.”

A monolithic eval with 16 checks stalled at 7% for hours. When the same eval was decomposed into 6 page-level evals, each one hit 100% in one round. Same agent, same code, different gradient.
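One way to see why decomposition helps is to compare what each form of scoring tells the agent. The sketch below uses made-up checks and page names (`login`, `settings`, `billing`), not data from the experiment described above. `Check`, `monolithicScore`, and `groupScores` are hypothetical names introduced for illustration.

```typescript
interface Check {
  group: string; // e.g. the page the check belongs to
  name: string;
  pass: boolean;
}

// Monolithic score: a single number across every check.
function monolithicScore(checks: Check[]): number {
  return checks.filter((c) => c.pass).length / checks.length;
}

// Decomposed scores: one number per group, so progress in each area is visible.
function groupScores(checks: Check[]): Map<string, number> {
  const totals = new Map<string, { passed: number; total: number }>();
  for (const c of checks) {
    const t = totals.get(c.group) ?? { passed: 0, total: 0 };
    t.total += 1;
    if (c.pass) t.passed += 1;
    totals.set(c.group, t);
  }
  const scores = new Map<string, number>();
  totals.forEach((t, group) => scores.set(group, t.passed / t.total));
  return scores;
}

// Illustrative data: six checks spread over three pages.
const sampleChecks: Check[] = [
  { group: "login", name: "page-exists", pass: true },
  { group: "login", name: "has-form", pass: true },
  { group: "settings", name: "page-exists", pass: true },
  { group: "settings", name: "has-save-button", pass: false },
  { group: "billing", name: "page-exists", pass: false },
  { group: "billing", name: "has-invoice-list", pass: false },
];

const overall = monolithicScore(sampleChecks); // 3 of 6 checks pass
const perPage = groupScores(sampleChecks);     // login, settings, billing scored separately
```

The single overall number says the build is half done but not where the gaps are. The per-page scores show that `login` is finished, `settings` is partway, and `billing` has not been started, which gives each agent a narrower target to work toward.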

Build vs RL Agents

	RL Agent	Build Agent
Goal	Improve existing metric	Build new code from spec
Baseline	Current score (e.g., 0.43)	Zero (nothing exists)
Rounds	5-50, small changes	3-10, creates files
Worktree	From origin/main	From HEAD (inherits merged work)
Turns	15 per round	40 per round
Early stop	No (keep improving)	Yes (stops at 1.0)
