feat(verticals): distributed-training phases 1,3,4,5 — adapter, market, hierarchy, TEE#9

Merged: drewstone merged 2 commits into main from feat/distributed-training-phases on Jun 16, 2026

@drewstone

Builds out the distributed-training integration on the Phase-0 cluster seam (#8). Full local gate green: 242 cargo + 94 forge tests, clippy -D warnings clean, fmt clean.

Phase 1 — autoresearch-training-runtime crate (real adapter, feature-gated)

PrimeCluster (prime, MIT) and PsycheCluster (Psyche, Apache-2.0) implement TrainingCluster, mirroring how autoresearch-sandbox-runtime gates the real sandbox backend. The recipe → prime/Psyche config mapping is pure and unit-tested. With the features off, train() returns a named EngineError::Backend; with prime-backend or psyche-backend on, it builds the real tokio::process invocation. The code is real, but executing it requires the frameworks and GPUs.
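A minimal sketch of the gating pattern described above, not the crate's actual code. The `Recipe` fields, the flag names, and the free function `train` are hypothetical; only `EngineError::Backend` and the `prime-backend` feature name come from this PR. The real feature-on path would spawn a process, which is elided here.

```rust
/// Hypothetical recipe; the real type's fields are not shown in this PR.
#[derive(Debug, Clone, PartialEq)]
pub struct Recipe {
    pub learning_rate: f64,
    pub batch_size: u32,
    pub steps: u64,
}

/// Stand-in for the engine error; only the `Backend` variant is modeled.
#[derive(Debug, PartialEq)]
pub enum EngineError {
    Backend(String),
}

/// Pure mapping from a recipe to command-line arguments.
/// It performs no I/O, so it can be unit-tested without GPUs.
/// The flag names are illustrative, not prime's real CLI.
pub fn recipe_to_args(recipe: &Recipe) -> Vec<String> {
    vec![
        format!("--lr={}", recipe.learning_rate),
        format!("--batch-size={}", recipe.batch_size),
        format!("--steps={}", recipe.steps),
    ]
}

/// Feature off: fail fast with a named backend error.
#[cfg(not(feature = "prime-backend"))]
pub fn train(_recipe: &Recipe) -> Result<f64, EngineError> {
    Err(EngineError::Backend(
        "prime-backend feature is disabled".to_string(),
    ))
}

/// Feature on: the real adapter would build and run a process here.
/// This sketch only prepares the arguments and reports that the spawn is elided.
#[cfg(feature = "prime-backend")]
pub fn train(recipe: &Recipe) -> Result<f64, EngineError> {
    let _args = recipe_to_args(recipe);
    Err(EngineError::Backend(
        "process spawn elided in this sketch".to_string(),
    ))
}
```

Keeping the mapping pure means the default build can test it in full, while the feature flag isolates everything that needs external frameworks.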

Phase 3 — training market (training_market.rs)

  • ContinuousTrainingMarket: king-of-the-hill leaderboard paying marginal held-out loss reduction via settle_record_bounty — frontier bought exactly once, non-improving resubmission pays zero.
  • RescorePanel: m-of-n independent referees; majority rejects a divergent self-reported score.
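The king-of-the-hill payout rule can be sketched as follows. This is an illustration under assumptions: `MarketSketch`, its fields, and the linear `rate` are hypothetical, and the real market settles through settle_record_bounty, which is not modeled here.

```rust
/// Hypothetical stand-in for the market's frontier state.
pub struct MarketSketch {
    /// Best held-out loss seen so far (lower is better).
    best_loss: f64,
    /// Payout per unit of loss reduction (assumed linear).
    rate: f64,
}

impl MarketSketch {
    pub fn new(initial_loss: f64, rate: f64) -> Self {
        Self { best_loss: initial_loss, rate }
    }

    /// Pays only the marginal improvement over the current frontier.
    /// A submission that does not beat the frontier pays zero,
    /// so each slice of the frontier is bought exactly once.
    pub fn submit(&mut self, measured_loss: f64) -> f64 {
        if measured_loss < self.best_loss {
            let payout = (self.best_loss - measured_loss) * self.rate;
            self.best_loss = measured_loss;
            payout
        } else {
            0.0
        }
    }

    pub fn frontier(&self) -> f64 {
        self.best_loss
    }
}
```

Submitting the same measured loss twice pays on the first call and zero on the second, because the frontier has already moved.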

Phase 4 — hierarchical cluster (hierarchical.rs)

HierarchicalCluster<C> composes k inner clusters and is itself a TrainingCluster (nests), dropping into the market unchanged. Models the scale bonus net of a cross-cluster drift penalty. Real k-instance run still needs operator infra.
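A rough sketch of the nesting idea, under stated assumptions. The one-method trait, `FixedCluster`, averaging the inner losses, and the linear drift penalty are all hypothetical; the `-k_bonus * ln(k)` credit follows the closed form given in this PR's commit notes.

```rust
/// Simplified stand-in for the real trait: train and report held-out loss.
pub trait TrainingCluster {
    fn train(&self) -> f64;
}

/// Hypothetical leaf cluster that reports a fixed loss.
pub struct FixedCluster(pub f64);

impl TrainingCluster for FixedCluster {
    fn train(&self) -> f64 {
        self.0
    }
}

/// Composes k inner clusters and is itself a TrainingCluster, so it nests.
pub struct HierarchicalCluster<C: TrainingCluster> {
    pub inner: Vec<C>,
    pub k_bonus: f64,
    /// Assumed per-extra-cluster penalty; the real drift model is not shown in the PR.
    pub drift_penalty: f64,
}

impl<C: TrainingCluster> TrainingCluster for HierarchicalCluster<C> {
    fn train(&self) -> f64 {
        assert!(!self.inner.is_empty(), "needs at least one inner cluster");
        let k = self.inner.len() as f64;
        // Assumption: combine inner results by averaging their losses.
        let mean = self.inner.iter().map(|c| c.train()).sum::<f64>() / k;
        // Closed-form scale credit; it depends only on k, not on quality.
        let scale_credit = -self.k_bonus * k.ln();
        // Assumed linear drift cost for each cluster beyond the first.
        let drift = self.drift_penalty * (k - 1.0);
        mean + scale_credit + drift
    }
}
```

Because the wrapper implements the same trait as its children, a `HierarchicalCluster<HierarchicalCluster<FixedCluster>>` type-checks and can be used wherever a single cluster is expected.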

Phase 5 — TEE binding (tee_cluster.rs)

TeeSimCluster<C> + a test driving the real run_private_competitive: an unsealed training engine fails the tier→cluster binding with AttestationRequired; a sealed one clears it (then fails at the honest structural-attestation seam — the documented §12 gap). Attestation is structural-only (same honest gap as the existing TEE path).
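The tier→cluster binding check might look roughly like this. Everything except the names `TeeSimCluster` and `AttestationRequired` is hypothetical, including the `Tier` enum, the `is_sealed` method, and `check_tier_binding`. The sketch stops at the binding check and does not model the later structural-attestation seam.

```rust
/// Hypothetical privacy tiers.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Tier {
    Public,
    Private,
}

#[derive(Debug, PartialEq)]
pub enum BindingError {
    AttestationRequired,
}

/// Simplified trait: clusters are unsealed unless they say otherwise.
pub trait TrainingCluster {
    fn is_sealed(&self) -> bool {
        false
    }
}

/// Hypothetical ordinary cluster with no TEE.
pub struct PlainCluster;

impl TrainingCluster for PlainCluster {}

/// Simulated TEE wrapper: marks any inner cluster as sealed.
pub struct TeeSimCluster<C> {
    pub inner: C,
}

impl<C: TrainingCluster> TrainingCluster for TeeSimCluster<C> {
    fn is_sealed(&self) -> bool {
        true
    }
}

/// A private tier requires a sealed cluster; other tiers accept any cluster.
pub fn check_tier_binding<C: TrainingCluster>(
    tier: Tier,
    cluster: &C,
) -> Result<(), BindingError> {
    match tier {
        Tier::Private if !cluster.is_sealed() => Err(BindingError::AttestationRequired),
        _ => Ok(()),
    }
}
```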

Integration notes

  • DistributedTrainingScorer::measure is now pub (the sync scoring core). This removes the agents' unsafe no-op-waker shims and keeps #![forbid(unsafe_code)] intact; the two test helpers use a safe std::task::Wake implementation instead.
  • docs/DISTRIBUTED-TRAINING.md updated to mark shipped vs infra-gated honestly.
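For readers unfamiliar with the safe alternative to a hand-rolled waker, here is a minimal sketch of a no-op waker built on std::task::Wake. The names `NoopWake` and `poll_once` are hypothetical and are not the PR's test helpers.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing when woken. No unsafe code is needed:
/// the standard library builds the vtable from the `Wake` impl.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once and returns its output if it is already ready.
/// This suits futures that never actually suspend, such as a thin async
/// wrapper around synchronous work.
pub fn poll_once<F: Future>(fut: F) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // Pin on the heap so the future can be polled safely.
    let mut fut = Box::pin(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => Some(value),
        Poll::Pending => None,
    }
}
```

`Waker::from(Arc<W>)` is available for any `W: Wake + Send + Sync + 'static`, so this compiles under `#![forbid(unsafe_code)]`.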

Phase 2 (held-out gate in the sibling training-blueprint) ships separately as tangle-network/training-blueprint#10.

… market, hierarchy, TEE binding

Phase 1 (autoresearch-training-runtime crate): PrimeCluster (prime, MIT) +
PsycheCluster (Psyche, Apache-2.0) implement TrainingCluster behind
prime-backend/psyche-backend features (mirrors autoresearch-sandbox-runtime).
Pure, unit-tested recipe->config mapping; feature-off train() returns a named
EngineError::Backend; feature-on builds the real tokio::process invocation.

Phase 3 (training_market.rs): ContinuousTrainingMarket pays marginal held-out
loss reduction via settle_record_bounty (frontier bought once); RescorePanel
m-of-n referees reject a divergent self-reported score.

Phase 4 (hierarchical.rs): HierarchicalCluster<C> composes k inner clusters and
is itself a TrainingCluster (nests); models scale bonus net of cross-cluster
drift penalty.

Phase 5 (tee_cluster.rs): TeeSimCluster<C> + a test driving the real
run_private_competitive proving the tier->cluster binding (unsealed engine ->
AttestationRequired; sealed clears it).

Integration: pub DistributedTrainingScorer::measure (sync scoring core, removes
the unsafe-waker shims to keep #![forbid(unsafe_code)]); wire lib.rs exports +
workspace member. Full gate green: 242 cargo + 94 forge tests, clippy -D
warnings, fmt.

P3 (training_market.rs): the m-of-n panel was only ever tested with unanimous
votes (accepting in {0,n}). Add quorum_accepts_on_a_genuine_split_vote — a claim
that lands inside 2 of 3 referees' CIs (the n=64 referee dissents) so the panel
accepts via m, with a unanimous-panel cross-check proving acceptance is m-gated,
not all-or-nothing. Add majority_rejects_a_near_boundary_cheat (+0.005, ~1.7x the
CI half-width, not the old trivial 160x). Fix two wrong comments: a resubmission
pays zero on its MEASURED held-out marginal (per-seed train noise), not recipe
identity; the per-referee CI half-width is ~0.003, not ~0.01. Drop a tautological
telescoping-sum assertion.
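The split-vote case can be illustrated with a small m-of-n panel sketch. The types and the acceptance rule (a referee accepts when the claim lies within its own CI half-width of its own measurement) are assumptions for illustration and are not taken from training_market.rs.

```rust
/// Hypothetical referee: its own measurement plus a CI half-width.
pub struct Referee {
    pub measured: f64,
    pub ci_half_width: f64,
}

impl Referee {
    /// Accepts a claimed score that falls inside this referee's interval.
    pub fn accepts(&self, claimed: f64) -> bool {
        (claimed - self.measured).abs() <= self.ci_half_width
    }
}

/// Hypothetical m-of-n panel.
pub struct PanelSketch {
    pub referees: Vec<Referee>,
    /// Minimum number of accepting referees.
    pub m: usize,
}

impl PanelSketch {
    /// The panel accepts when at least m referees accept the claim.
    pub fn accepts(&self, claimed: f64) -> bool {
        let votes = self
            .referees
            .iter()
            .filter(|r| r.accepts(claimed))
            .count();
        votes >= self.m
    }
}
```

With three referees of which two accept, a panel with m = 2 accepts the claim while the same referees with m = 3 reject it, which is the property that separates an m-gated panel from an all-or-nothing one.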

P4 (hierarchical.rs): the scale bonus is a closed-form -k_bonus*ln(k) credit (the
cross-cluster analogue of Phase-0's ISLAND_GAIN) — quality-blind by construction.
Soften the docs that narrated it as emergent 'replicas seeing more data', and fix
the decorrelation test comment, to state honestly that a degenerate hierarchy
earns the same credit and a real backend is what would earn it in substance.

Full gate green: 244 cargo + 94 forge tests, clippy -D warnings, fmt.
drewstone merged commit ade55c5 into main on Jun 16, 2026
3 checks passed
drewstone deleted the feat/distributed-training-phases branch on June 16, 2026 00:20