Build evals extend the RL improvement loop to greenfield building. Instead of optimizing an existing metric, agents build new modules from specs and iterate until every assertion passes.
Quick Start
The fastest way to create a build agent:
# Generate eval + TOML from a spec file
tenet build --spec knowledge/MY_SPEC.md --name my-feature
# Or from an inline description
tenet build --name auth-module \
  --files src/lib/auth.ts \
  --desc "Create auth module with login(), logout(), session management"
# List all build agents and their scores
tenet build --list
# Run it
tenet build --run my-feature
The Pattern
spec → eval assertions → agent TOML → `tenet build --run {name}` → Karpathy loop → PR
1. Write a spec describing what to build (or pass it inline with `--desc`).
2. `tenet build` generates the eval script with decomposed assertions.
3. `tenet build` generates the agent TOML config.
4. `tenet build --run` starts Peter Parker, and the agent iterates from 0% → 100%.
5. A PR is created automatically when the score hits 1.0.
You can also do steps 1-3 manually for full control. tenet build just automates the proven pattern.
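The iterate-until-pass part of the pattern can be sketched in a few lines. This is a minimal illustration and not the actual tenet implementation: `runBuildLoop`, `Evaluate`, `Improve`, and `LoopResult` are hypothetical names, the "agent" is a stub that satisfies one more assertion per round, and the loop is kept synchronous for brevity even though a real eval would likely be async.

```typescript
// Hypothetical sketch of an iterate-until-pass loop; not the tenet implementation.
type Evaluate = () => number;           // returns a score in [0, 1]
type Improve = (round: number) => void; // stands in for one agent round

interface LoopResult {
  rounds: number;
  score: number;
  done: boolean; // true once the score has reached 1.0
}

function runBuildLoop(evaluate: Evaluate, improve: Improve, maxRounds: number): LoopResult {
  let score = evaluate(); // a greenfield build starts at or near 0
  let rounds = 0;
  // Stop early at 1.0, or give up once the round budget is spent.
  while (score < 1.0 && rounds < maxRounds) {
    rounds += 1;
    improve(rounds);
    score = evaluate();
  }
  return { rounds, score, done: score >= 1.0 };
}

// Usage: a fake agent that makes one more of four assertions pass each round.
const total = 4;
let passing = 0;
const result = runBuildLoop(
  () => passing / total,
  () => { passing = Math.min(total, passing + 1); },
  10,
);
console.log(result); // { rounds: 4, score: 1, done: true }
```

In this sketch the `done` flag is the point at which a PR would be opened; how tenet actually triggers the PR is not covered here.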
Writing a Build Eval
A build eval is a TypeScript file that checks spec compliance:
// eval/build/storage-adapter.ts
import { existsSync } from "node:fs"

// fileContains() and tscPasses() are assumed to be helpers defined elsewhere in the project.
export async function evaluate(): Promise<number> {
  // Each entry is one decomposed assertion from the spec.
  const checks = [
    { name: "interface-exists", pass: existsSync("src/lib/storage/interface.ts") },
    { name: "has-read-method", pass: fileContains("src/lib/storage/interface.ts", "read(") },
    { name: "has-write-method", pass: fileContains("src/lib/storage/interface.ts", "write(") },
    { name: "local-impl", pass: existsSync("src/lib/storage/local.ts") },
    { name: "cloud-impl", pass: existsSync("src/lib/storage/cloud.ts") },
    { name: "compiles", pass: tscPasses() },
  ]
  // The score is the fraction of passing checks, from 0 to 1.
  return checks.filter(c => c.pass).length / checks.length
}
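The eval above calls `fileContains` and `tscPasses` without showing them. One plausible way to write them is sketched below. Both bodies are assumptions rather than tenet's own helpers, and `tscPasses` assumes that `npx tsc` is available and that a `tsconfig.json` exists in the working directory.

```typescript
import { existsSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { execSync } from "node:child_process";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: true if the file exists and contains the given substring.
function fileContains(path: string, needle: string): boolean {
  if (!existsSync(path)) return false;
  return readFileSync(path, "utf8").includes(needle);
}

// Hypothetical helper: true if the TypeScript compiler reports no errors.
function tscPasses(): boolean {
  try {
    execSync("npx tsc --noEmit", { stdio: "ignore" });
    return true;
  } catch {
    return false; // execSync throws when the command exits with a non-zero code
  }
}

// Usage: check a throwaway file in a temp directory.
const dir = mkdtempSync(join(tmpdir(), "build-eval-"));
const file = join(dir, "interface.ts");
writeFileSync(file, "export interface TenetStorage { read(key: string): string }");
const hasRead = fileContains(file, "read(");                 // true
const hasWrite = fileContains(file, "write(");               // false
const missing = fileContains(join(dir, "nope.ts"), "read("); // false
```

Substring checks like these are cheap but shallow: they confirm that `read(` appears somewhere in the file, not that the method has the right signature. The compile check is what catches type-level mistakes.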
Agent TOML Config
[agent]
name = "build-storage-adapter"
scope = "build"
metric = "spec_compliance"
direction = "maximize"
time_budget_seconds = 600
[eval]
script = "eval/build/storage-adapter.ts"
data = "eval/fixtures/build-baseline.jsonl"
[task]
description = """
Create the TenetStorage adapter with interface,
LocalStorage, and CloudStorage implementations.
"""
Key Insight
“Granularity of feedback determines speed of convergence.”
A monolithic eval with 16 checks stalled at 7% for hours. When the same eval was decomposed into 6 page-level evals, each one hit 100% in one round. Same agent, same code, different gradient.
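To make the difference concrete, the sketch below scores the same set of checks two ways: as one aggregate number and as one number per group. The `Check` shape, the `group` field, and both function names are hypothetical and chosen only for illustration; they are not part of the tenet API.

```typescript
// Hypothetical types and names for illustration; not part of the tenet API.
interface Check {
  name: string;
  group: string; // e.g. the page or module the check belongs to
  pass: boolean;
}

// One number for everything: it does not say which area is failing.
function monolithicScore(checks: Check[]): number {
  return checks.filter(c => c.pass).length / checks.length;
}

// One score per group: shows which areas are done and which are not.
function groupedScores(checks: Check[]): Record<string, number> {
  const totals: Record<string, { pass: number; total: number }> = {};
  for (const c of checks) {
    const t = totals[c.group] ?? { pass: 0, total: 0 };
    t.total += 1;
    if (c.pass) t.pass += 1;
    totals[c.group] = t;
  }
  const scores: Record<string, number> = {};
  for (const group of Object.keys(totals)) {
    scores[group] = totals[group].pass / totals[group].total;
  }
  return scores;
}

const checks: Check[] = [
  { name: "interface-exists", group: "interface", pass: true },
  { name: "has-read-method", group: "interface", pass: true },
  { name: "local-impl", group: "local", pass: false },
  { name: "cloud-impl", group: "cloud", pass: false },
];
const overall = monolithicScore(checks); // 0.5
const perGroup = groupedScores(checks);  // { interface: 1, local: 0, cloud: 0 }
```

The aggregate score of 0.5 hides the fact that one group is finished and two have not been started. The per-group view gives an agent a specific target for the next round.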
Build vs RL Agents
| | RL Agent | Build Agent |
|---|---|---|
| Goal | Improve existing metric | Build new code from spec |
| Baseline | Current score (e.g., 0.43) | Zero (nothing exists) |
| Rounds | 5-50, small changes | 3-10, creates files |
| Worktree | From origin/main | From HEAD (inherits merged work) |
| Turns | 15 per round | 40 per round |
| Early stop | No (keep improving) | Yes (stops at 1.0) |