TENET’s core innovation is a reinforcement learning loop for code. Agents don’t just make changes — they measure whether changes improved the codebase, keep what works, and learn from the results.
The RL Loop
State (world model) → Action (agent makes change) → Eval (measure result) → Keep or Revert → Training Buffer → Policy Head (learn what works) → back to State
1. State (World Model)
Before each round, TENET captures the current system state:
- Composite eval score
- Test pass rate and coverage
- Build health
- Code quality metrics
- Agent’s trajectory (what it tried before)
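As a rough illustration, the state listed above could be held in a small record like the one below. The class and field names are hypothetical, not TENET's actual schema; only composite_score is confirmed by the training tuple format on this page, and the coverage value is made up.

```python
from dataclasses import dataclass, field, asdict


@dataclass
class StateSnapshot:
    """Hypothetical per-round state record; field names are assumptions."""
    composite_score: float
    test_pass_rate: float
    coverage: float
    build_ok: bool
    quality_metrics: dict[str, float] = field(default_factory=dict)
    trajectory: list[str] = field(default_factory=list)  # what the agent tried before


snapshot = StateSnapshot(
    composite_score=0.1276,
    test_pass_rate=1.0,
    coverage=0.42,  # illustrative value, not from the source
    build_ok=True,
    trajectory=["add_tests"],
)
state_dict = asdict(snapshot)  # plain dict, ready to serialize
```

Keeping the snapshot as a plain dataclass makes it easy to serialize into the "state" field of a training tuple.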
2. Action (Agent)
The agent makes a focused code change. The policy head helps select what type of change to try based on what worked in the past.
3. Eval (Measure)
An eval script runs against the agent’s changes — not the main branch. The AGENT_WORKTREE mechanism ensures the eval tests the actual changes in an isolated git worktree.
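A minimal sketch of how an eval script might locate the code it should measure. It assumes AGENT_WORKTREE is exposed as an environment variable holding the worktree path; the page does not say how the value is passed, and resolve_eval_root is a hypothetical helper.

```python
import os
from pathlib import Path


def resolve_eval_root(env=None):
    """Return the directory the eval should measure.

    Assumption: AGENT_WORKTREE is an environment variable holding the path
    to the agent's isolated git worktree. If it is unset, fall back to the
    current directory.
    """
    env = os.environ if env is None else env
    worktree = env.get("AGENT_WORKTREE")
    return Path(worktree) if worktree else Path.cwd()
```

An eval that ignores the worktree and measures the main checkout would score every change identically, so the agent's work would never register.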
4. Reward (Keep or Revert)
- Score improved → change is kept, merged to the session branch
- Score stayed the same or regressed → git reset --hard HEAD~1, change reverted
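The keep-or-revert rule can be sketched as below. The function names are hypothetical, and the revert helper assumes the agent's change is exactly one commit on top of the previous state.

```python
import subprocess


def decide(score_before: float, score_after: float) -> str:
    """Keep only strict improvements; ties and regressions are reverted."""
    return "keep" if score_after > score_before else "revert"


def revert_last_commit(worktree: str) -> None:
    """Drop the agent's most recent commit in the given worktree."""
    subprocess.run(["git", "reset", "--hard", "HEAD~1"], cwd=worktree, check=True)


print(decide(0.1276, 0.1307))  # keep
print(decide(0.1276, 0.1276))  # revert
```

Treating a tie as a revert means a change has to earn its place: a neutral edit adds churn without moving the score.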
5. Training Buffer
Every round — kept or reverted — writes a training tuple:
{
"agent": "test-coverage",
"state": { "composite_score": 0.1276, ... },
"action": { "type": "add_tests", "description": "...", "files_affected": [...] },
"reward": { "composite_delta": 0.0031, "improved": true }
}
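A sketch of how a tuple in this shape might be built and stored. The helper names and the JSON Lines file format are assumptions; the source specifies only the fields of the tuple itself.

```python
import json
import os
import tempfile


def make_tuple(agent, state, action, score_before, score_after):
    """Build one training tuple in the shape shown above."""
    delta = round(score_after - score_before, 4)
    return {
        "agent": agent,
        "state": state,
        "action": action,
        "reward": {"composite_delta": delta, "improved": delta > 0},
    }


def append_tuple(path, record):
    """Append one record as a JSON line (the file format is an assumption)."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


record = make_tuple(
    agent="test-coverage",
    state={"composite_score": 0.1276},
    action={"type": "add_tests"},
    score_before=0.1276,
    score_after=0.1307,
)
buffer_path = os.path.join(tempfile.mkdtemp(), "buffer.jsonl")
append_tuple(buffer_path, record)
```

Reverted rounds are written the same way, with a zero or negative composite_delta and improved set to false, so the buffer holds negative examples as well as positive ones.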
6. Policy Head
A 14M-parameter transformer trained on the training buffer. Predicts which actions will produce positive reward given the current state. Retrained nightly when 50+ new tuples accumulate.
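One way to picture how such predictions could be consumed: rank candidate action types by predicted probability of positive reward. This is only a sketch. The dictionary stands in for model output, the probabilities are made up, and every action type except add_tests is hypothetical.

```python
def rank_actions(predicted_success: dict[str, float]) -> list[str]:
    """Order action types by predicted probability of positive reward."""
    return sorted(predicted_success, key=predicted_success.get, reverse=True)


# Stand-in for policy head output; these numbers are invented for illustration.
predictions = {"add_tests": 0.61, "refactor": 0.22, "fix_lint": 0.35}
print(rank_actions(predictions))  # ['add_tests', 'fix_lint', 'refactor']
```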
The Nightly Loop
Every night at 2 AM (configurable):
tenet peter daily
+-- Mine training tuples from journals
+-- Synthesize product context
+-- Strategic reasoning (which agents to run?)
+-- Run stale agents (5 rounds each)
+-- Retrain policy head (if 50+ new tuples)
+-- Pick up backlog issues → create PRs
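The ordering and the retrain gate above can be sketched as a planning function. The step labels and the plan_nightly helper are hypothetical; only the sequence, the 5 rounds per agent, and the 50-tuple threshold come from the list above.

```python
RETRAIN_THRESHOLD = 50  # "50+ new tuples" in the list above


def plan_nightly(new_tuples: int, stale_agents: list[str], rounds: int = 5) -> list[str]:
    """Return the ordered step list for one nightly run."""
    steps = ["mine_tuples", "synthesize_context", "strategic_reasoning"]
    steps += [f"run {agent} ({rounds} rounds)" for agent in stale_agents]
    if new_tuples >= RETRAIN_THRESHOLD:
        steps.append("retrain_policy_head")
    steps.append("pick_up_backlog")
    return steps


for step in plan_nightly(new_tuples=63, stale_agents=["test-coverage"]):
    print(step)
```

With fewer than 50 new tuples the retrain step is skipped and the run goes straight from the agents to the backlog.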
Self-Driving Pipeline
Issues flow through a kanban pipeline automatically:
Issue filed (Linear/GitHub)
→ tenet/backlog label
→ PP picks up (every 30 min)
→ Agent creates PR
→ CI runs eval
→ Score improves → auto-merge → close issue
→ Score regresses → request changes
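The final CI step can be sketched as a three-way decision. The function name and return labels are hypothetical. The pipeline above does not say what happens when the score is unchanged, so this sketch routes that case to a separate outcome instead of guessing.

```python
def ci_verdict(base_score: float, pr_score: float) -> str:
    """Map an eval result on a PR to the pipeline outcome."""
    if pr_score > base_score:
        return "auto-merge"  # then close the issue
    if pr_score < base_score:
        return "request-changes"
    return "needs-review"  # unchanged score: not specified in the source


print(ci_verdict(0.1276, 0.1307))  # auto-merge
print(ci_verdict(0.1307, 0.1276))  # request-changes
```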
Key Insight
The eval script is the reward function. If the eval measures the right thing, agents improve. If it doesn’t, they waste compute.
We learned this the hard way: 750 rounds with a 2.5% keep rate, because the eval scripts were already at ceiling (test pass rate was 100%), so no change could raise the score. The fix: eval scripts that measure metrics with a real gradient (actual coverage percentage, not just pass/fail).
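A small worked example of the ceiling problem. The coverage figures are invented for illustration; the 750 rounds and 2.5% keep rate are the numbers quoted above.

```python
def reward(before: float, after: float) -> float:
    """Score delta between two eval runs, rounded to 4 decimals."""
    return round(after - before, 4)


# Pass/fail metric already at ceiling: a useful change cannot raise it.
pass_rate_delta = reward(1.0, 1.0)

# Continuous metric with headroom: the same change shows up as a positive delta.
coverage_delta = reward(0.62, 0.65)  # illustrative coverage values

# 2.5% of 750 rounds is 18.75, so roughly 19 changes were kept.
kept_rounds = round(750 * 0.025)

print(pass_rate_delta, coverage_delta, kept_rounds)  # 0.0 0.03 19
```

Under the keep-or-revert rule, a delta of zero is a revert, so an eval stuck at its maximum rejects nearly every change regardless of quality.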