feat(verticals): distributed-training phases 1,3,4,5 — adapter, market, hierarchy, TEE #9
Merged
… market, hierarchy, TEE binding

Phase 1 (autoresearch-training-runtime crate): PrimeCluster (prime, MIT) + PsycheCluster (Psyche, Apache-2.0) implement TrainingCluster behind the prime-backend/psyche-backend features (mirrors autoresearch-sandbox-runtime). Pure, unit-tested recipe->config mapping; feature-off train() returns a named EngineError::Backend; feature-on builds the real tokio::process invocation.

Phase 3 (training_market.rs): ContinuousTrainingMarket pays marginal held-out loss reduction via settle_record_bounty (frontier bought once); RescorePanel m-of-n referees reject a divergent self-reported score.

Phase 4 (hierarchical.rs): HierarchicalCluster<C> composes k inner clusters and is itself a TrainingCluster (nests); models scale bonus net of cross-cluster drift penalty.

Phase 5 (tee_cluster.rs): TeeSimCluster<C> + a test driving the real run_private_competitive proving the tier->cluster binding (unsealed engine -> AttestationRequired; sealed clears it).

Integration: pub DistributedTrainingScorer::measure (sync scoring core, removes the unsafe-waker shims to keep #![forbid(unsafe_code)]); wire lib.rs exports + workspace member.

Full gate green: 242 cargo + 94 forge tests, clippy -D warnings, fmt.
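The Phase 3 payout rule (pay only the marginal held-out loss reduction, so the frontier is bought once) can be sketched roughly as below. This is an illustration under stated assumptions, not the code in training_market.rs: `FrontierLedger`, `settle`, and the linear `rate` are hypothetical names, and the real `settle_record_bounty` signature is not shown in this PR.

```rust
/// Hypothetical king-of-the-hill ledger (not the PR's ContinuousTrainingMarket).
/// It pays only for improvement over the best held-out loss seen so far.
#[derive(Debug)]
struct FrontierLedger {
    /// Current frontier; lower loss is better.
    best_loss: f64,
    /// Assumed linear payout: units paid per unit of loss reduction.
    rate: f64,
}

impl FrontierLedger {
    fn new(baseline_loss: f64, rate: f64) -> Self {
        FrontierLedger { best_loss: baseline_loss, rate }
    }

    /// Returns the bounty for a measured held-out loss and advances the
    /// frontier only when the submission actually improves on it.
    fn settle(&mut self, measured_loss: f64) -> f64 {
        let marginal = self.best_loss - measured_loss;
        // A NaN measurement or a non-improving resubmission pays zero.
        if measured_loss.is_nan() || marginal <= 0.0 {
            return 0.0;
        }
        self.best_loss = measured_loss;
        marginal * self.rate
    }
}

fn main() {
    let mut ledger = FrontierLedger::new(2.0, 100.0);
    let first = ledger.settle(1.5); // improves the frontier by 0.5
    let repeat = ledger.settle(1.5); // same measured loss again
    let second = ledger.settle(1.25); // improves by a further 0.25
    println!("first={first} repeat={repeat} second={second}");
}
```

Because each payout is the gap to the current frontier, the payouts over any sequence sum to the total frontier movement times the rate, so no improvement is paid for twice.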
P3 (training_market.rs): the m-of-n panel was only ever tested with unanimous
votes (accepting in {0,n}). Add quorum_accepts_on_a_genuine_split_vote — a claim
that lands inside 2 of 3 referees' CIs (the n=64 referee dissents) so the panel
accepts via m, with a unanimous-panel cross-check proving acceptance is m-gated,
not all-or-nothing. Add majority_rejects_a_near_boundary_cheat (+0.005, ~1.7x the
CI half-width, not the old trivial 160x). Fix two wrong comments: a resubmission
pays zero on its MEASURED held-out marginal (per-seed train noise), not recipe
identity; the per-referee CI half-width is ~0.003, not ~0.01. Drop a tautological
telescoping-sum assertion.
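
A minimal sketch of the m-of-n acceptance rule these tests exercise, assuming each referee votes to accept when the claimed score falls inside its own confidence interval. `RefereeScore`, `supports`, and `panel_accepts` are hypothetical names, the two-sided interval check is an assumption, and the means are illustrative. Only the ~0.003 half-width and the 0.005 offset echo the figures above.

```rust
/// One referee's independent re-measurement of held-out loss (hypothetical type).
struct RefereeScore {
    mean: f64,
    ci_half_width: f64,
}

impl RefereeScore {
    /// Assumed voting rule: accept when the claim lies inside this referee's CI.
    fn supports(&self, claimed: f64) -> bool {
        (claimed - self.mean).abs() <= self.ci_half_width
    }
}

/// The panel accepts when at least `m` of the referees support the claim.
fn panel_accepts(claimed: f64, referees: &[RefereeScore], m: usize) -> bool {
    let votes = referees.iter().filter(|r| r.supports(claimed)).count();
    votes >= m
}

fn main() {
    let panel = [
        RefereeScore { mean: 1.001, ci_half_width: 0.003 },
        RefereeScore { mean: 0.999, ci_half_width: 0.003 },
        // This referee's measurement is far enough away that it dissents.
        RefereeScore { mean: 1.006, ci_half_width: 0.003 },
    ];

    // Genuine split vote: 2 of 3 referees support the honest claim.
    println!("m=2 accepts honest claim: {}", panel_accepts(1.000, &panel, 2));
    println!("m=3 accepts honest claim: {}", panel_accepts(1.000, &panel, 3));

    // Near-boundary cheat: a claim 0.005 better than the honest one.
    println!("m=2 accepts cheat: {}", panel_accepts(0.995, &panel, 2));
}
```

With m = 2 the split vote is accepted while requiring m = 3 rejects it, which is the distinction between m-gated and all-or-nothing acceptance.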
P4 (hierarchical.rs): the scale bonus is a closed-form -k_bonus*ln(k) credit (the
cross-cluster analogue of Phase-0's ISLAND_GAIN) — quality-blind by construction.
Soften the docs that narrated it as emergent 'replicas seeing more data', and fix
the decorrelation test comment, to state honestly that a degenerate hierarchy
earns the same credit and a real backend is what would earn it in substance.
Full gate green: 244 cargo + 94 forge tests, clippy -D warnings, fmt.
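
To make the "quality-blind by construction" point concrete, here is a rough sketch of the closed-form credit. Only the -k_bonus*ln(k) term comes from the commit message. Averaging the inner losses and adding a caller-supplied drift penalty are assumptions made for illustration, and `scale_credit` and `modeled_loss` are hypothetical names rather than the functions in hierarchical.rs.

```rust
/// Closed-form scale credit: it depends only on the number of inner
/// clusters `k`, never on how good any of them is.
fn scale_credit(k: usize, k_bonus: f64) -> f64 {
    assert!(k >= 1, "a hierarchy needs at least one inner cluster");
    -k_bonus * (k as f64).ln()
}

/// Assumed composition: mean inner loss, plus the (non-positive) scale
/// credit, plus a drift penalty supplied by the caller.
fn modeled_loss(inner_losses: &[f64], k_bonus: f64, drift_penalty: f64) -> f64 {
    let k = inner_losses.len();
    let credit = scale_credit(k, k_bonus);
    let mean = inner_losses.iter().sum::<f64>() / k as f64;
    mean + credit + drift_penalty
}

fn main() {
    let k_bonus = 0.05;
    // A degenerate hierarchy of four identical inner results...
    let degenerate = [1.0, 1.0, 1.0, 1.0];
    // ...and a hierarchy whose inner results actually differ.
    let diverse = [1.0, 0.9, 1.1, 1.0];

    // Both receive exactly the same credit because only k enters the formula.
    println!("credit for k=4: {}", scale_credit(4, k_bonus));
    println!("degenerate: {}", modeled_loss(&degenerate, k_bonus, 0.0));
    println!("diverse:    {}", modeled_loss(&diverse, k_bonus, 0.0));
}
```

Since k = 1 gives ln(1) = 0, a single cluster earns no credit, and any k > 1 earns the same credit regardless of what the inner clusters produced.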
Builds out the distributed-training integration on the Phase-0 cluster seam (#8). Full local gate green: 242 cargo + 94 forge tests, clippy -D warnings clean, fmt clean.
Phase 1: autoresearch-training-runtime crate (real adapter, feature-gated)
PrimeCluster (prime, MIT) + PsycheCluster (Psyche, Apache-2.0) implement TrainingCluster, mirroring how autoresearch-sandbox-runtime gates the real sandbox backend.
Pure unit-tested recipe → prime/Psyche config mapping; feature-off train() returns a named EngineError::Backend; prime-backend/psyche-backend build the real tokio::process invocation.
Code real; execution needs the frameworks + GPUs.

Phase 3: training market (training_market.rs)
ContinuousTrainingMarket: king-of-the-hill leaderboard paying marginal held-out loss reduction via settle_record_bounty. The frontier is bought exactly once; a non-improving resubmission pays zero.
RescorePanel: m-of-n independent referees; a majority rejects a divergent self-reported score.

Phase 4: hierarchical cluster (hierarchical.rs)
HierarchicalCluster<C> composes k inner clusters and is itself a TrainingCluster (nests), dropping into the market unchanged.
Models the scale bonus net of a cross-cluster drift penalty.
A real k-instance run still needs operator infra.

Phase 5: TEE binding (tee_cluster.rs)
TeeSimCluster<C> + a test driving the real run_private_competitive: an unsealed training engine fails the tier→cluster binding with AttestationRequired; a sealed one clears it (then fails at the honest structural-attestation seam, the documented §12 gap).
Attestation is structural-only (same honest gap as the existing TEE path).

Integration notes
DistributedTrainingScorer::measure made pub (sync scoring core); removed the agents' unsafe no-op-waker shims to keep #![forbid(unsafe_code)] intact (safe std::task::Wake in the two test helpers).
docs/DISTRIBUTED-TRAINING.md updated to mark shipped vs infra-gated honestly.

Phase 2 (held-out gate in the sibling training-blueprint) ships separately as tangle-network/training-blueprint#10.
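
For the integration note about replacing the unsafe no-op-waker shims, a safe equivalent built on std::task::Wake might look like the sketch below. The actual test helpers are not shown in this PR, so `NoopWake` and `poll_once` are hypothetical names and this is only one plausible shape.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// A waker that does nothing when woken. Implementing the safe `Wake`
/// trait avoids hand-building a `RawWaker` vtable, which needs `unsafe`.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once with the no-op waker.
/// Returns `Some(output)` if it was immediately ready, otherwise `None`.
fn poll_once<F: Future>(fut: F) -> Option<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    // Box::pin gives a pinned future without any unsafe code.
    let mut fut = Box::pin(fut);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(value) => Some(value),
        Poll::Pending => None,
    }
}

fn main() {
    // An already-completed future resolves on the first poll.
    println!("{:?}", poll_once(std::future::ready(42)));
    // A future that never completes stays pending.
    println!("{:?}", poll_once(std::future::pending::<()>()));
}
```

A helper like this compiles under #![forbid(unsafe_code)], which is the property the integration note is protecting.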