The RL Loop
1. State (World Model)
Before each round, TENET captures the current system state:
- Composite eval score
- Test pass rate and coverage
- Build health
- Code quality metrics
- Agent’s trajectory (what it tried before)
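As a rough illustration, the captured state could be modeled as an immutable record. This is only a sketch: the class name `StateSnapshot`, its field names, and the helper `record_attempt` are hypothetical and not taken from TENET itself.

```python
from __future__ import annotations

from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class StateSnapshot:
    """Hypothetical per-round state record; all field names are illustrative."""

    composite_score: float              # composite eval score
    test_pass_rate: float               # fraction of tests passing, 0.0 to 1.0
    coverage: float                     # test coverage, 0.0 to 1.0
    build_ok: bool                      # build health
    quality_metrics: dict[str, float] = field(default_factory=dict)
    trajectory: tuple[str, ...] = ()    # what the agent tried before


def record_attempt(state: StateSnapshot, attempt: str) -> StateSnapshot:
    """Return a new snapshot with one more entry appended to the trajectory."""
    return replace(state, trajectory=state.trajectory + (attempt,))
```

Keeping the snapshot frozen means each round produces a new record rather than mutating the previous one, so earlier rounds remain available for comparison.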
2. Action (Agent)
The agent makes a focused code change. The policy head helps select what type of change to try based on what worked in the past.
3. Eval (Measure)
An eval script runs against the agent’s changes, not the main branch. The AGENT_WORKTREE mechanism ensures the eval tests the actual changes in an isolated git worktree.
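One way to picture this step is a runner that launches the eval script from inside the worktree directory. The sketch below makes several assumptions that the text does not confirm: that `AGENT_WORKTREE` is exposed to the script as an environment variable, that the script prints its score as the last line of stdout, and that a helper named `run_eval` exists at all.

```python
import os
import subprocess
from pathlib import Path
from typing import Sequence


def run_eval(eval_cmd: Sequence[str], worktree: Path) -> float:
    """Run an eval command inside the agent's worktree and parse its score.

    Assumptions (not from TENET's docs): AGENT_WORKTREE is an environment
    variable, and the eval prints a numeric score as its last stdout line.
    """
    env = dict(os.environ, AGENT_WORKTREE=str(worktree))
    result = subprocess.run(
        list(eval_cmd),
        cwd=worktree,          # evaluate the agent's changes, not the main checkout
        env=env,
        capture_output=True,
        text=True,
        check=True,            # raise if the eval script itself fails
    )
    return float(result.stdout.strip().splitlines()[-1])
```

Running with `cwd` set to the worktree is what keeps the measurement tied to the agent's changes; the main branch checkout is never touched.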
4. Reward (Keep or Revert)
- Score improved → change is kept, merged to the session branch
- Score stayed the same or regressed → git reset --hard HEAD~1, change reverted
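The keep-or-revert rule above can be sketched as a single decision function. The function name `keep_or_revert` and the injectable `run` parameter are hypothetical; only the strict-improvement rule and the `git reset --hard HEAD~1` command come from the text. Merging a kept change into the session branch is left out of the sketch.

```python
import subprocess
from pathlib import Path
from typing import Callable


def keep_or_revert(
    prev_score: float,
    new_score: float,
    worktree: Path,
    run: Callable[..., object] = subprocess.run,
) -> bool:
    """Return True if the change is kept, False if it was reverted.

    A change is kept only on strict improvement; an equal or lower score
    drops the agent's last commit.
    """
    if new_score > prev_score:
        return True
    # Score stayed the same or regressed: discard the most recent commit.
    run(["git", "reset", "--hard", "HEAD~1"], cwd=worktree, check=True)
    return False
```

Passing `run` as a parameter lets the rule be exercised without touching a real repository, for example by substituting a function that only records the command it was given.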