.jfl/training-buffer.jsonl) captures every agent action and its outcome. This data trains the policy head and provides experiment history for future runs.
Format
Each line is a training tuple.

Data Sources

Tuples come from three sources:

| Source | When | What |
|---|---|---|
| Agent runs | Each round | State, action, reward from eval delta |
| Tuple miner | Nightly pre-flight | Extracts tuples from journal entries |
| Manual | `jfl_training_buffer` tool | Record observations during sessions |
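
To make the one-tuple-per-line layout concrete, here is a minimal Python sketch of appending to and reading from a JSONL buffer. It is an illustration under assumptions, not the actual schema: the field names (`state`, `action`, `reward`, `source`), the helper functions, and the reading of "reward from eval delta" as `eval_after - eval_before` are all hypothetical. The demo writes to a temporary directory, not a real project.

```python
import json
import tempfile
from pathlib import Path


def make_tuple(state, action, eval_before, eval_after, source="agent_run"):
    """Build one training tuple (hypothetical field names).

    Assumption: the reward is the change in eval score across the action.
    """
    return {
        "state": state,
        "action": action,
        "reward": eval_after - eval_before,
        "source": source,
    }


def append_tuple(path, record):
    """Append one record to the buffer as a single JSON line."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def read_tuples(path):
    """Read every non-empty line of the buffer back as a dict."""
    with Path(path).open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Demo against a temporary directory rather than a real project checkout.
with tempfile.TemporaryDirectory() as tmp:
    buffer_path = Path(tmp) / ".jfl" / "training-buffer.jsonl"
    append_tuple(buffer_path, make_tuple({"round": 1}, "edit_prompt", 0.5, 0.75))
    append_tuple(
        buffer_path,
        make_tuple({"round": 2}, "revert", 0.75, 0.5, source="manual"),
    )
    demo_tuples = read_tuples(buffer_path)
    print(len(demo_tuples), demo_tuples[0]["reward"])
```

Appending one JSON object per line keeps writes cheap and lets any of the three sources add tuples without rewriting the file; a reader can stream the buffer line by line.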