fix(grok): Harden ACP resume/completion and replay segment ordering by mwolson · Pull Request #3156 · pingdotgg/t3code

mwolson · 2026-06-19T03:13:19Z

Summary

Race session/prompt against xAI _x.ai/session/prompt_complete so Grok turns can finish when the standard ACP RPC stays stranded.
Harden Grok session resume: drop replay chunks, race session/load against a replay-idle waiter (instead of waiting 90s for a hung RPC or falling back to session/new), defer turn.started until prompt time, and keep late session/update chunks on the completed turn via streamingTurnId.
Fix replayed assistant segment ids sorting above the user message that started the turn; throttle streaming markdown repaints to one per animation frame.

This is a narrow bridge fix for Grok Composer 2.5 Fast hangs while the sturdier ACP/orchestrator work in #2829 continues.

Problem and Fix

| Problem and Why it Happened | Fix |
| --- | --- |
| Grok can emit `_x.ai/session/prompt_complete` and return the session to idle while the standard `session/prompt` RPC remains stranded. T3 Code waits for `session/prompt` before emitting `turn.completed`, so the composer can stay stuck on Stop even though the turn is done. | Register a per-session fallback before each ACP prompt and race the standard prompt RPC against `_x.ai/session/prompt_complete`, synthesizing the normal ACP `PromptResponse` shape when the xAI notification wins. Route xAI notifications through `handleExtNotification` so they do not clobber the single-slot unknown-notification handler. |
| Grok adapter session snapshots could retain `activeTurnId` after the prompt settled. Consumers reading adapter session state could still see the session as active. | Mark Grok sessions `running` while prompt work is active and restore `ready` while clearing `activeTurnId` when the final prompt in the turn settles (including user Stop via `interruptTurn` with `settleAllPrompts`). |
| The existing tests did not reproduce a Grok prompt that completes through xAI's private notification while the prompt RPC never returns. | Extend the ACP mock agent with a prompt-complete-then-hang mode and cover it at both `AcpSessionRuntime` and `GrokAdapter` levels. |
| Grok `session/load` replay marks historical `session/update` chunks with `_meta.isReplay: true`, but T3 Code projected them into the live turn and flooded the composer with old tool calls. | Drop replay notifications in `AcpSessionRuntime` before parsing or enqueueing parsed session events. |
| `session/load` can hang after replay finishes because Grok never returns the load RPC response (observed ~90s on a real long session). Falling back to `session/new` discards the native session and orphans pending user messages. | Race `session/load` against a replay-idle waiter: after replay `session/update` traffic goes quiet for a configurable gap (default 2s), synthesize a load response from the initialize metadata and proceed. Still accept a normal RPC response when Grok returns one (~1–2s in probes and in production retest). Fail fast if neither path completes within the load timeout. |
| `turn.started` fired during prompt preparation while replay chunks were still draining, so resumed history attached to the new turn id. | Emit `turn.started` immediately before `session/prompt` instead of during model/prompt preparation. |
| xAI `prompt_complete` can settle the prompt RPC before trailing `session/update` chunks reach the adapter, so `turn.completed` (and the notify bell) can fire before the response text is visible. | Yield the scheduler cooperatively before emitting `turn.completed` so trailing chunks can land first (best-effort until the broader ACP refactor). |
| Grok reuses assistant segment ids across turns while projection keeps the replayed `createdAt`, so timeline and feed sort responses above the user message that started the turn. Tools and final prose look missing below the prompt. | Centralize assistant segment updates in `@t3tools/shared/orchestrationMessages`: sort timeline/feed by stable `createdAt`, archive prior-turn rows when `turnId` changes, reset text/timeline anchors on rebound, and repoint checkpoints/`latestTurn.assistantMessageId`. |
| Grok emits thousands of tiny `agent_message_chunk` events per response; repainting the full `ReactMarkdown` tree on every chunk piles up into visible jumps near completion. | Throttle streaming markdown display to one repaint per animation frame via `useRafThrottledValue` in `ChatMarkdown`. |
| Trailing chunks from a settled turn could bind to the next turn when `streamingTurnId` was cleared at new-turn preparation time. | Track `streamingTurnId` on the adapter and prefer it when resolving notification turn ids; clear it when a new non-steer turn starts streaming. |
| User Stop left Grok sessions stuck in `running` because `interruptTurn` never settled the in-flight prompt. | Call `settlePromptInFlight` with `cancelled` after `session/cancel`; track `interruptedTurnIds` so late prompt RPCs cannot resurrect cancelled turns. |
| Multiple pending prompt fallbacks on one session are unlikely in the current composer flow but possible through steering paths. | Resolve fallback completions FIFO by session; when `promptId` is omitted and multiple prompts are in flight, settle the oldest pending prompt. |
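The FIFO rule in the last row can be sketched as a small per-session queue. This is only an illustration under stated assumptions: `PendingPromptQueue`, `register`, and `settle` are hypothetical names, not the adapter's real API, and a completion whose `promptId` is unknown or already settled is assumed to be ignored.

```typescript
// Hypothetical sketch; names and shapes are assumptions, not code from this PR.
type PromptId = string;

class PendingPromptQueue {
  // Pending prompt ids per session, oldest first.
  private readonly pendingBySession = new Map<string, PromptId[]>();

  register(sessionId: string, promptId: PromptId): void {
    const pending = this.pendingBySession.get(sessionId) ?? [];
    pending.push(promptId);
    this.pendingBySession.set(sessionId, pending);
  }

  // Settle the named prompt, or the oldest pending prompt when the completion
  // carries no promptId. Returns undefined for unknown or stale completions.
  settle(sessionId: string, promptId?: PromptId): PromptId | undefined {
    const pending = this.pendingBySession.get(sessionId);
    if (pending === undefined || pending.length === 0) {
      return undefined;
    }
    const index = promptId === undefined ? 0 : pending.indexOf(promptId);
    if (index === -1) {
      return undefined;
    }
    const [settled] = pending.splice(index, 1);
    return settled;
  }
}
```

With two prompts registered on one session, a completion without a `promptId` settles the first one registered, and a repeated completion for an id that has already settled returns `undefined`.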

Defensive Fixes

| Problem and Why it Happened | Fix |
| --- | --- |
| Grok extension notifications can be batched before the normal `session/prompt` response. | Add effect-acp client coverage that keeps the standard prompt response routable after Grok extension notifications in the same input batch. |
| Late `session/update` chunks after a turn settles could bind to the next turn when `activeTurnId` already moved forward. | Track `streamingTurnId` on the adapter and prefer it when resolving notification turn ids until the next turn starts. |
| Failed `session/prompt` RPCs could leave the session stuck in `running`. | Settle prompt in flight with a failed `turn.completed` and restore the session to `ready`. |
| First rebound assistant delta arrives with `turnId: null` before the provider knows the turn id. | Treat null→non-null turn binding as continuation, not a segment turn change, so the first bound chunk is not dropped. Reset stale replay text only when the completed row already has a known `turnId`. |
| Reused segment ids on update could leave stale attachments or append into the wrong row. | Remove and reappend on update; clear attachments on `turnChanged` when omitted; archive + reappend on `turnChanged` before cap. |
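The null-binding rule above can be illustrated with a tiny predicate. This is a sketch under assumptions, not the real `assistantSegmentTurnChanged`: `turnBindingChanged` is a hypothetical name, and it assumes a turn change is reported only when both sides carry a known, different turn id.

```typescript
// Hypothetical sketch; not the implementation from orchestrationMessages.ts.
type TurnId = string | null;

function turnBindingChanged(existing: TurnId, incoming: TurnId): boolean {
  // A null side means the turn id is not known yet, so the update is treated
  // as a continuation of the same segment rather than a turn change.
  if (existing === null || incoming === null) {
    return false;
  }
  return existing !== incoming;
}
```

Under this rule the first chunk that binds a previously unbound segment is kept, while a segment id reused across two known turns is flagged as a turn change.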

Production Retest (2026-06-25)

Validated on integration AppImage against Grok session 019efa67-b48b-7022-b4c0-0cba45dfa83d / T3 thread 0337b709-9f12-4a3f-8c2d-aa49de240e4c:

After restart, session/load succeeded in ~1.34s (normal Grok RPC; idle synthetic path not needed this run).
Pending "yeah let's do that" dispatched; turn completed (~68s, tool-heavy) despite session/prompt logging Interrupt (expected prompt_complete path).
Follow-up user message completed cleanly (~4.5s).
T3 DB: session ready, 5 completed turns, 0 streaming orphans. Grok native session turnCount: 5.

Known Limitations

During the brief session/load gate, non-replay session/update notifications are held back from the runtime event queue so replay traffic cannot flood the composer. Live tool/status updates during that window are dropped by design; the gate clears once load completes.
Removing session/new fallback on resume failure is intentional: creating a fresh ACP session would orphan the native Grok session and pending user messages. Load now fails fast with a clear transport error instead.
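The load gate is bounded by the replay-idle waiter described in the Problem and Fix table. A minimal sketch of that decision follows, with time passed in explicitly so it stays deterministic. `ReplayIdleDetector` and `decideLoadOutcome` are hypothetical names, and requiring at least one replay update before declaring idle is an assumption of this sketch rather than confirmed behavior.

```typescript
// Hypothetical sketch; the real waiter in AcpSessionRuntime may differ.
class ReplayIdleDetector {
  private readonly idleGapMs: number;
  private lastReplayAtMs: number | undefined = undefined;

  // Default gap of 2s matches the default quoted in the PR description.
  constructor(idleGapMs = 2_000) {
    this.idleGapMs = idleGapMs;
  }

  noteReplayUpdate(nowMs: number): void {
    this.lastReplayAtMs = nowMs;
  }

  // Assumption: idle only counts once some replay traffic has been seen.
  isIdle(nowMs: number): boolean {
    if (this.lastReplayAtMs === undefined) {
      return false;
    }
    return nowMs - this.lastReplayAtMs >= this.idleGapMs;
  }
}

type LoadOutcome = "rpc-response" | "synthetic-response" | "timeout" | "pending";

function decideLoadOutcome(input: {
  readonly rpcResponded: boolean;
  readonly replayIdle: boolean;
  readonly elapsedMs: number;
  readonly loadTimeoutMs: number;
}): LoadOutcome {
  if (input.rpcResponded) return "rpc-response"; // a normal RPC reply is still accepted
  if (input.replayIdle) return "synthetic-response"; // replay went quiet: synthesize
  if (input.elapsedMs >= input.loadTimeoutMs) return "timeout"; // fail fast
  return "pending";
}
```

Checking the RPC reply first mirrors the production retest above, where `session/load` returned normally and the synthetic path was not needed.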

Validation

vp test apps/server/src/provider/Layers/GrokAdapter.test.ts apps/server/src/provider/acp/AcpJsonRpcConnection.test.ts apps/server/src/provider/acp/AcpRuntimeModel.test.ts packages/effect-acp/src/client.test.ts packages/shared/src/orchestrationMessages.test.ts apps/web/src/session-logic.test.ts apps/web/src/hooks/useRafThrottledValue.test.ts
vp run typecheck
vp check
Branch tip cab3352a9

Note

High Risk
Touches Grok session lifecycle, prompt completion, resume/load, and shared message projection used on server and client; incorrect settling or archiving could mis-order threads or strand turns.

Overview
Hardens Grok Composer integration so turns finish, resumes do not flood the UI, and chat history stays in the right order when the provider reuses assistant segment ids.

ACP runtime and Grok adapter: session/prompt is raced against _x.ai/session/prompt_complete when the standard RPC hangs; prompts are serialized; replay session/update chunks (_meta.isReplay) are ignored on resume. session/load races the RPC against a replay-idle waiter (synthetic response after quiet period) instead of falling back to session/new. The Grok adapter tracks streamingTurnId and interruptedTurnIds, settles prompts into ready with turn.completed, and defers turn.started until prompt time.
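As a rough illustration of the replay filter mentioned above, the sketch below drops `session/update` notifications flagged as replay before anything else sees them. The notification shape is an assumption: it places `_meta` directly on `params`, which may not match where Grok actually puts it, and `isReplayNotification` and `dropReplay` are hypothetical names.

```typescript
// Hypothetical sketch; the payload shape is assumed, not taken from the PR.
interface SessionNotification {
  readonly method: string;
  readonly params?: {
    readonly _meta?: { readonly isReplay?: boolean };
  };
}

function isReplayNotification(notification: SessionNotification): boolean {
  return (
    notification.method === "session/update" &&
    notification.params?._meta?.isReplay === true
  );
}

// Filter a batch before parsing or enqueueing, so replayed history never
// reaches the live turn.
function dropReplay<T extends SessionNotification>(batch: readonly T[]): T[] {
  return batch.filter((notification) => !isReplayNotification(notification));
}
```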

Orchestration: New @t3tools/shared/orchestrationMessages centralizes assistant segment updates—archive prior-turn rows on rebind, ignore late chunks, repoint checkpoints/latestTurn—used by the SQL projector, in-memory projector, and client threadReducer. Timeline sorting uses assistant createdAt so replayed segments do not jump above the user message.
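To make the sort-key point concrete, here is a small sketch of ordering rows by `createdAt` so that streaming bumps to `updatedAt` cannot move an assistant row. `TimelineRow` and `sortByCreatedAt` are hypothetical, and the sketch assumes timestamps are ISO 8601 UTC strings of identical format, which compare correctly as plain strings.

```typescript
// Hypothetical sketch; row shape and timestamp format are assumptions.
interface TimelineRow {
  readonly id: string;
  readonly createdAt: string; // assumed ISO 8601 UTC, fixed width
  readonly updatedAt: string;
}

function sortByCreatedAt(rows: readonly TimelineRow[]): TimelineRow[] {
  // Copy first so the caller's array is not mutated; Array.prototype.sort is
  // stable, so rows with equal createdAt keep their relative order.
  return [...rows].sort((a, b) => {
    if (a.createdAt < b.createdAt) return -1;
    if (a.createdAt > b.createdAt) return 1;
    return 0;
  });
}
```

An assistant row created before a later user message stays above it even if its `updatedAt` is newer because chunks kept arriving.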

Web: useRafThrottledValue limits streaming markdown to one repaint per frame; sidebar shows Working while hasPendingLocalDispatch after send.
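The one-repaint-per-frame idea can be sketched without React. This is not the `useRafThrottledValue` hook from the PR: `createFrameThrottle` is a hypothetical helper that takes the scheduler as a parameter, so a browser could pass a wrapper around `requestAnimationFrame` while a test passes a manual queue.

```typescript
// Hypothetical sketch of frame-coalescing; not the hook from this PR.
type Schedule = (callback: () => void) => void;

function createFrameThrottle<T>(
  apply: (value: T) => void,
  schedule: Schedule,
): (value: T) => void {
  // Holds the latest value while a frame callback is outstanding.
  let pending: { value: T } | undefined;

  return (value: T) => {
    const alreadyScheduled = pending !== undefined;
    pending = { value };
    if (alreadyScheduled) {
      return; // a frame is already queued; it will pick up this newer value
    }
    schedule(() => {
      const latest = pending;
      pending = undefined;
      if (latest !== undefined) {
        apply(latest.value);
      }
    });
  };
}
```

Any number of values pushed between two frames results in a single `apply` call with the most recent value, which is the property that smooths out thousands of tiny chunk updates.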

The ACP mock agent gains env flags for load failures, replay, prompt hangs, and stale/out-of-order xAI completions for tests.

Reviewed by Cursor Bugbot for commit 5d3c301.

Note

Harden Grok ACP session resume, prompt completion fallback, and assistant segment replay ordering

ACP prompt completion: The runtime now races the session/prompt RPC against an _x.ai/session/prompt_complete notification; if the RPC hangs, the xAI notification resolves the turn. Stale completions (already-seen promptIds) are ignored.
ACP session resume: session/load is now raced against a replay-idle detector in AcpSessionRuntime.ts; session/update replay notifications are suppressed while the load gate is active.
Assistant segment ordering: A new applyAssistantSegmentMessageUpdate orchestrator in orchestrationMessages.ts archives prior-turn assistant rows on segment rebind, filters late streaming updates for completed segments, and repoints checkpoints/latestTurn to archived message ids.
GrokAdapter turn settlement: GrokAdapter.ts introduces settlePromptInFlight, streamingTurnId, and interruptedTurnIds to ensure turns settle exactly once, late notifications attach to the correct turn, and interrupts cancel in-flight prompts cleanly.
Sidebar dispatch pending: The UI now shows a 'Working' pulse on threads where a local composer dispatch is in flight, even before the server session transitions to running.
RAF-throttled markdown: ChatMarkdown.tsx throttles re-renders to one per animation frame while streaming.
Timeline sort fix: Assistant messages sort by createdAt instead of updatedAt to prevent streaming bumps from reordering them.
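The "settle exactly once" property from the prompt completion and turn settlement items above can be sketched as a first-writer-wins guard. `PromptSettlement` and the outcome labels are hypothetical; the real `settlePromptInFlight` also updates session state and emits `turn.completed`, which this sketch leaves out.

```typescript
// Hypothetical sketch; models only the once-only guard, not the adapter.
type PromptOutcome =
  | "rpc-response"
  | "prompt-complete-notification"
  | "cancelled"
  | "failed";

class PromptSettlement {
  private outcome: PromptOutcome | undefined = undefined;

  // Returns true for the first source to settle the prompt; any later source
  // (for example a late RPC reply after the xAI notification won) is ignored.
  settle(outcome: PromptOutcome): boolean {
    if (this.outcome !== undefined) {
      return false;
    }
    this.outcome = outcome;
    return true;
  }

  get settledBy(): PromptOutcome | undefined {
    return this.outcome;
  }
}
```

Whichever source reports first decides the outcome, so a cancelled prompt cannot be revived by an RPC response that arrives afterwards.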

Macroscope summarized 5d3c301.

coderabbitai · 2026-06-19T03:13:29Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

macroscopeapp · 2026-06-19T03:22:26Z

Approvability

Verdict: Needs human review

4 blocking correctness issues found. This PR introduces substantial new runtime behavior for Grok session resume/completion and replay segment ordering across multiple layers (adapter, projector, client reducer, UI). Three unresolved review comments identify potential bugs in the archiving and message update logic that could affect data integrity.

macroscopeapp · 2026-06-25T01:53:27Z

+  return !assistantSegmentTurnChanged(existing, input.incoming);
+}
+
+export function archivedAssistantSegmentMessageId(


🟡 Medium src/orchestrationMessages.ts:57

archivedAssistantSegmentMessageId assigns the same synthetic id suffix @turn:replay to every archived message with turnId === null. When the same messageId completes multiple replay-to-live cycles, each archived replay row gets the same id, creating duplicates in messages and making those rows indistinguishable. Consider incorporating a counter or timestamp to generate unique archived ids for null-turn rebinds.
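One way to act on the counter suggestion is sketched below. Only the `@turn:replay` suffix comes from the review comment; the `@turn:<turnId>` form for known turns, the `#n` counter suffix, and the `uniqueArchivedSegmentId` name are assumptions made for illustration.

```typescript
// Hypothetical sketch; id formats other than "@turn:replay" are assumed.
function uniqueArchivedSegmentId(
  messageId: string,
  turnId: string | null,
  takenIds: ReadonlySet<string>,
): string {
  const base = `${messageId}@turn:${turnId ?? "replay"}`;
  if (!takenIds.has(base)) {
    return base;
  }
  // Append the first free counter so repeated replay-to-live cycles for the
  // same messageId never collide.
  let counter = 2;
  while (takenIds.has(`${base}#${counter}`)) {
    counter += 1;
  }
  return `${base}#${counter}`;
}
```

A counter keeps ids deterministic for a given message list, which a timestamp would not; that matters if projections are rebuilt from the same events.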

macroscopeapp · 2026-06-25T02:34:02Z

+          payload.turnId !== null &&
+          thread.session?.status === "running" &&
+          thread.session.activeTurnId === payload.turnId;
+        const applied = applyAssistantSegmentMessageUpdate(thread.messages, message, {


🟡 Medium orchestration/projector.ts:421

When applyAssistantSegmentMessageUpdate() archives a rebound assistant message, the new code repoints checkpoints and latestTurn to the archived message ID before capping messages with slice(-MAX_THREAD_MESSAGES). If the archived message was already the oldest retained row, the cap drops it while checkpoints still reference it. This breaks checkpoint-to-message linking — after replay, checkpoints[*].assistantMessageId can reference a message ID that no longer exists in thread.messages.

Consider capping messages before repointing checkpoints and latestTurn, so the archived message is only referenced if it survives the cap.
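The suggested ordering can be sketched as follows. The types and `capThenRepoint` are hypothetical, `maxMessages` is assumed to be positive, and leaving a checkpoint untouched when the archived row does not survive the cap is a choice made for this sketch; the review comment does not say what should happen in that case.

```typescript
// Hypothetical sketch; not the projector code from this PR.
interface Message {
  readonly id: string;
}

interface Checkpoint {
  readonly assistantMessageId: string | null;
}

function capThenRepoint(
  messages: readonly Message[],
  checkpoints: readonly Checkpoint[],
  rebind: { readonly fromId: string; readonly toArchivedId: string },
  maxMessages: number, // assumed > 0
): { messages: Message[]; checkpoints: Checkpoint[] } {
  // Cap first, so the survival check reflects what is actually retained.
  const capped = messages.slice(-maxMessages);
  const archivedSurvives = capped.some((message) => message.id === rebind.toArchivedId);

  const repointed = checkpoints.map((checkpoint) =>
    archivedSurvives && checkpoint.assistantMessageId === rebind.fromId
      ? { ...checkpoint, assistantMessageId: rebind.toArchivedId }
      : checkpoint,
  );
  return { messages: capped, checkpoints: repointed };
}
```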

cursor

Cursor Bugbot has reviewed your changes using high effort and found 1 potential issue.

Reviewed by Cursor Bugbot for commit a8e7713.

Race session/load against replay-idle synthesis so Grok resume no longer falls through to session/new and orphans pending messages. Keep the prompt_complete fallback, streamingTurnId late-chunk binding, and replay segment projection fixes. Assistant segment rebinding now archives prior-turn rows instead of overwriting them, clears inherited attachments on turn change, reappends rebound messages before the thread cap, resets stale replay text on the first null-turn chunk, preserves text on empty completions, and sorts timeline rows by stable createdAt anchors.

macroscopeapp · 2026-06-25T02:56:06Z

+/**
+ * Grok can deliver a stale assistant segment for an older turn after the live
+ * provider message id has already advanced to a newer turn.
+ */
+export function isLateAssistantSegmentFromPriorTurn(input: {
+  readonly existing: AssistantSegmentMessage | undefined;
+  readonly incoming: AssistantSegmentMessage;
+  readonly providerMessageId?: string;
+  readonly archivedTurnIds?: ReadonlySet<string>;
+}): boolean {
+  if (input.incoming.role !== "assistant" || input.existing?.role !== "assistant") {
+    return false;
+  }
+  if (input.incoming.turnId === null || input.existing.turnId === null) {
+    return false;
+  }
+  if (input.existing.turnId === input.incoming.turnId) {
+    return false;
+  }
+  if (!input.incoming.streaming) {
+    return true;
+  }
+  const providerMessageId = input.providerMessageId;
+  if (providerMessageId === undefined) {
+    return false;
+  }
+  const archivedTurnIds = input.archivedTurnIds;
+  if (archivedTurnIds !== undefined) {
+    return archivedTurnIds.has(input.incoming.turnId);
+  }
+  return false;
+}


🟡 Medium src/orchestrationMessages.ts:57

isLateAssistantSegmentFromPriorTurn() returns true for every non-streaming assistant update whose turnId differs from the existing row, causing applyAssistantSegmentMessageUpdate() to drop the update entirely. When a provider reuses the same segment id for a new turn and delivers it as a completed message (streaming: false), the new response is silently discarded and the thread keeps showing the previous turn's text.

   if (input.incoming.turnId === null || input.existing.turnId === null) {
     return false;
   }
   if (input.existing.turnId === input.incoming.turnId) {
     return false;
   }
-  if (!input.incoming.streaming) {
-    return true;
-  }
   const providerMessageId = input.providerMessageId;
   if (providerMessageId === undefined) {
     return false;
   }

Also found in 1 other location(s)

apps/server/src/orchestration/Layers/ProjectionPipeline.ts:912

applyThreadMessagesProjection only archives the previous assistant row when turnChanged is true. For the rebound case covered by assistantSegmentStreamingTextResets—a completed assistant message with a known turnId followed by a new streaming chunk with the same messageId but turnId: null—assistantSegmentTurnChanged returns false, so lines 912-946 never archive the old row. The next upsert then overwrites the prior turn's completed assistant message with the new live segment, losing that turn's projected message history until a full rebuild (and even bootstrap will replay the same overwrite).

github-actions Bot added the vouch:unvouched (PR author is not yet trusted in the VOUCHED list) and size:L (100-499 changed lines, additions + deletions) labels Jun 19, 2026

mwolson marked this pull request as ready for review June 19, 2026 03:20

cursor Bot reviewed Jun 19, 2026

Comment thread apps/server/src/provider/Layers/GrokAdapter.ts

Comment thread apps/server/src/provider/acp/AcpSessionRuntime.ts Outdated

mwolson force-pushed the fix/grok-prompt-complete-fallback branch 4 times, most recently from 59e347f to 7ac3370 Compare June 19, 2026 03:44

cursor Bot reviewed Jun 19, 2026

Comment thread apps/server/src/provider/Layers/GrokAdapter.ts Outdated

macroscopeapp Bot reviewed Jun 19, 2026

Comment thread apps/server/src/provider/Layers/GrokAdapter.ts Outdated

mwolson force-pushed the fix/grok-prompt-complete-fallback branch 3 times, most recently from 1b2c893 to 3c38b27 Compare June 21, 2026 20:26

cursor Bot reviewed Jun 21, 2026

Comment thread apps/server/src/provider/Layers/GrokAdapter.ts

mwolson force-pushed the fix/grok-prompt-complete-fallback branch from 2037ac6 to 3859afb Compare June 21, 2026 21:01

cursor Bot reviewed Jun 21, 2026

Comment thread apps/server/src/provider/Layers/GrokAdapter.ts

mwolson force-pushed the fix/grok-prompt-complete-fallback branch 2 times, most recently from b6f559c to 17d881a Compare June 21, 2026 21:36

github-actions Bot added the size:XXL (1,000+ changed lines, additions + deletions) label and removed the size:L (100-499 changed lines, additions + deletions) label Jun 21, 2026

mwolson force-pushed the fix/grok-prompt-complete-fallback branch from 17d881a to ba54adf Compare June 21, 2026 21:37

cursor Bot reviewed Jun 21, 2026

Comment thread apps/server/src/provider/Layers/GrokAdapter.ts

mwolson force-pushed the fix/grok-prompt-complete-fallback branch from ba54adf to a2aaf61 Compare June 21, 2026 21:50

github-actions Bot added the size:L (100-499 changed lines, additions + deletions) label and removed the size:XXL (1,000+ changed lines, additions + deletions) label Jun 21, 2026

mwolson force-pushed the fix/grok-prompt-complete-fallback branch from a2aaf61 to 2a9ced0 Compare June 21, 2026 21:54

cursor Bot reviewed Jun 21, 2026

Comment thread apps/server/src/provider/Layers/GrokAdapter.ts Outdated

mwolson force-pushed the fix/grok-prompt-complete-fallback branch from 2a9ced0 to 1ce246d Compare June 21, 2026 23:04

mwolson force-pushed the fix/grok-prompt-complete-fallback branch 3 times, most recently from ee37626 to 3acc12b Compare June 25, 2026 01:08