Build evals extend the RL improvement loop to greenfield building. Instead of optimizing an existing metric, agents build new modules from specs and iterate until every assertion passes.

Quick Start

The fastest way to create a build agent:
# Generate eval + TOML from a spec file
tenet build --spec knowledge/MY_SPEC.md --name my-feature

# Or from an inline description
tenet build --name auth-module \
  --files src/lib/auth.ts \
  --desc "Create auth module with login(), logout(), session management"

# List all build agents and their scores
tenet build --list

# Run it
tenet build --run my-feature

The Pattern

spec → eval assertions → agent TOML → `tenet build --run {name}` → Karpathy loop → PR
  1. Write a spec describing what to build (or pass inline with --desc)
  2. tenet build generates the eval script with decomposed assertions
  3. tenet build generates the agent TOML config
  4. tenet build --run starts Peter Parker — the agent iterates from 0% → 100%
  5. PR created automatically when score hits 1.0
You can also do steps 1-3 manually for full control. tenet build just automates the proven pattern.
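
The steps above boil down to an evaluate, edit, re-evaluate loop that stops once the score reaches 1.0. The following is a minimal sketch of that control flow, not tenet's actual implementation: `runBuildLoop`, `runAgentRound`, `LoopResult`, and the default of 10 rounds are all hypothetical names and values chosen for illustration. It also does not model how a round's changes are kept or discarded.

```typescript
// Hypothetical sketch of the build loop; none of these names come from tenet itself.
type Evaluate = () => Promise<number>
type AgentRound = (round: number, lastScore: number) => Promise<void>

interface LoopResult {
  score: number
  rounds: number
  done: boolean // true when the score reached 1.0
}

async function runBuildLoop(
  evaluate: Evaluate,
  runAgentRound: AgentRound,
  maxRounds = 10, // assumed cap, for illustration only
): Promise<LoopResult> {
  let score = await evaluate() // a greenfield build starts at 0
  let rounds = 0
  while (score < 1.0 && rounds < maxRounds) {
    rounds += 1
    await runAgentRound(rounds, score) // the agent edits files in its worktree
    score = await evaluate() // re-score after every round
  }
  return { score, rounds, done: score >= 1.0 }
}

// Usage with a stubbed agent that makes one more of four checks pass per round.
async function demoBuildLoop(): Promise<LoopResult> {
  let passing = 0
  const total = 4
  return runBuildLoop(
    async () => passing / total,
    async () => {
      passing = Math.min(total, passing + 1)
    },
  )
}
```

With the stubbed agent, the loop stops after four rounds because the score reaches 1.0 before the round cap is hit.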

Writing a Build Eval

A build eval is a TypeScript file that checks spec compliance:
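// eval/build/storage-adapter.ts
import { execSync } from "node:child_process"
import { existsSync, readFileSync } from "node:fs"

// Assumed helper (not shown in the original): does the file exist and contain `needle`?
const fileContains = (path: string, needle: string): boolean =>
  existsSync(path) && readFileSync(path, "utf8").includes(needle)

// Assumed helper: does the project type-check? Assumes `tsc` is available via npx.
function tscPasses(): boolean {
  try { execSync("npx tsc --noEmit", { stdio: "ignore" }); return true } catch { return false }
}

// Score is the fraction of passing checks, from 0 (nothing built) to 1 (spec met).
export async function evaluate(): Promise<number> {
  const checks = [
    { name: "interface-exists", pass: existsSync("src/lib/storage/interface.ts") },
    { name: "has-read-method", pass: fileContains("src/lib/storage/interface.ts", "read(") },
    { name: "has-write-method", pass: fileContains("src/lib/storage/interface.ts", "write(") },
    { name: "local-impl", pass: existsSync("src/lib/storage/local.ts") },
    { name: "cloud-impl", pass: existsSync("src/lib/storage/cloud.ts") },
    { name: "compiles", pass: tscPasses() },
  ]
  return checks.filter(c => c.pass).length / checks.length
}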
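
Because every check has a name, the same array can also say which assertions still fail, which is often more useful when debugging an eval than the bare score. The sketch below assumes a hypothetical `summarize` helper and `EvalReport` shape. Neither is part of tenet, and whether the agent ever sees such a report is not specified here.

```typescript
// Hypothetical helper: turn named checks into a score plus the failing names.
interface Check {
  name: string
  pass: boolean
}

interface EvalReport {
  score: number
  failing: string[]
}

function summarize(checks: Check[]): EvalReport {
  // An empty eval scores 0 rather than dividing by zero.
  if (checks.length === 0) return { score: 0, failing: [] }
  const failing = checks.filter((c) => !c.pass).map((c) => c.name)
  return { score: (checks.length - failing.length) / checks.length, failing }
}

// Example: two of four checks pass, so the score is 0.5.
const report = summarize([
  { name: "interface-exists", pass: true },
  { name: "has-read-method", pass: true },
  { name: "local-impl", pass: false },
  { name: "compiles", pass: false },
])
console.log(report.score, report.failing)
```

The score is the same fraction of passing checks that `evaluate()` returns. The `failing` list is additional output for a human reader.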

Agent TOML Config

[agent]
name = "build-storage-adapter"
scope = "build"
metric = "spec_compliance"
direction = "maximize"
time_budget_seconds = 600

[eval]
script = "eval/build/storage-adapter.ts"
data = "eval/fixtures/build-baseline.jsonl"

[task]
description = """
Create the TenetStorage adapter with interface, 
LocalStorage, and CloudStorage implementations.
"""

Key Insight

“Granularity of feedback determines speed of convergence.”

A monolithic eval with 16 checks stalled at 7% for hours. The same eval, decomposed into 6 page-level evals, reached 100% on each in one round. Same agent, same code, different gradient.
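
One way to read this: the fewer checks an eval contains, the more each fixed check moves its score, so progress becomes visible sooner. The sketch below illustrates that arithmetic only and makes no claim about how the evals in the anecdote were actually split. The `scoreStep` and `groupByPrefix` helpers, the `page:check` naming scheme, and the check names are all hypothetical.

```typescript
// Illustration only: how far one fixed check moves the score for a given eval size.
function scoreStep(totalChecks: number): number {
  if (totalChecks <= 0) throw new Error("totalChecks must be positive")
  return 1 / totalChecks
}

// Split a flat list of check names into per-page groups, keyed by the
// text before the separator (assumed naming scheme: "page:check").
function groupByPrefix(names: string[], sep = ":"): Map<string, string[]> {
  const groups = new Map<string, string[]>()
  for (const name of names) {
    const idx = name.indexOf(sep)
    const key = idx === -1 ? name : name.slice(0, idx)
    const bucket = groups.get(key) ?? []
    bucket.push(name)
    groups.set(key, bucket)
  }
  return groups
}

// Hypothetical check names for two pages.
const flat = ["home:renders", "home:has-nav", "docs:renders", "docs:has-search"]
const grouped = groupByPrefix(flat)

// Monolithic: one fix moves the score by 1/4. Per page: one fix moves it by 1/2.
console.log(scoreStep(flat.length), scoreStep(grouped.get("home")!.length))
```

For a 16-check eval, one fixed check moves the score by 1/16 (0.0625). In a small page-level eval, the same fix moves that eval's score much further.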

Build vs RL Agents

| | RL Agent | Build Agent |
| --- | --- | --- |
| Goal | Improve existing metric | Build new code from spec |
| Baseline | Current score (e.g., 0.43) | Zero (nothing exists) |
| Rounds | 5-50, small changes | 3-10, creates files |
| Worktree | From origin/main | From HEAD (inherits merged work) |
| Turns | 15 per round | 40 per round |
| Early stop | No (keep improving) | Yes (stops at 1.0) |