fix(server): stop garbled escape sequences leaking into the terminal#3508
fix(server): stop garbled escape sequences leaking into the terminal#3508olafura wants to merge 1 commit into
Conversation
📝 WalkthroughWalkthroughReplaces the single-view terminal escape-sequence sanitizer with ChangesDual-view terminal sanitization
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
a522d32 to
91f9152
Compare
ApprovabilityVerdict: Needs human review 2 blocking correctness issues found. This bug fix introduces substantial new terminal sanitization logic (~550 lines) with complex regex-based escape sequence handling that modifies runtime behavior for terminal input/output processing. Additionally, there are two unresolved medium-severity review comments identifying potential bugs in the sanitization logic that could cause incomplete fixes or unintended side effects. You can customize Macroscope's approvability policy. Learn more. |
|
@coderabbitai review |
✅ Action performedReview finished.
|
91f9152 to
700a3e2
Compare
Dismissing prior approval to re-evaluate 700a3e2
700a3e2 to
70cbad4
Compare
|
Addressed the code-review findings (all in
Validated each fix against the real byte forms and the captured |
70cbad4 to
ab220ef
Compare
|
On the latest round: Cursor — "Flattened garble splits across chunks" (Medium). Confirmed the mechanism: I investigated a cross-chunk carry (hold back a trailing partial token, prepend to the next chunk) and validated it against the ~30 real captured terminal logs. It cleanly handles the common case but is not complete — a split before the first The finding's actual persistence concern — "the same garbled scrollback can return after restore" — is already mitigated by the load-time sanitize in this PR: the split halves land contiguously in the log, and
So the residue is transient live-only (and the input-side strip already prevents the runaway echo loop that previously made it flood en masse), and it does not survive a restart. The three findings from the prior round (DCS DECRPSS not stripped from input, bare DA query over-matched, flattened |
ab220ef to
207f31e
Compare
Self-review (multi-angle, max effort) of
|
| Sev | Finding | Fix |
|---|---|---|
| Critical | ReDoS — the (?:[0-9]+;)+rgb:[0-9a-fA-F/]+ alternative in FLATTENED_FRAGMENT/FLATTENED_REPLY_TOKEN backtracks catastrophically on program-controlled output. Measured ~40s on a 336 KB "<digits>;"-run that never reaches rgb:. It runs per output chunk on both views and on the whole history file at load, so a program printing such a run stalls the server event loop (DoS). |
Pinned the colour alternative to the real OSC numbers — (?:1[012];|4;[0-9]+;)rgb: — removing the unbounded run. 40s → 18ms. Added a ReDoS regression test. |
| High | FLATTENED_REPLY_TOKEN's trailing n? swallowed the next character ("1;2$ynext" → "ext"), corrupting both history and live. |
Removed the n?. |
| High | [01]$r[0-9;]*[a-zA-Z] greedily consumed a following digit/; run of legitimate text ("1$r0;12;34;Hello" ate through H). |
Length-bounded the payload to {0,8}. |
| High | (?:[0-9]+;)+rgb: matched ordinary "<n>;rgb:…" program text (e.g. a CSS/colour dump), deleting it from the live stream. |
Same OSC-number pin as the ReDoS fix. |
| High | The input filter stripped focus events (CSI I / CSI O), so a program that enabled focus reporting (DECSET ?1004 — vim, tmux) never received them. Focus events are user-action-driven and don't feed the redraw→requery loop, so stripping them was pure regression. |
Dropped CSI I/CSI O from INPUT_TERMINAL_RESPONSE; added a test asserting focus events are forwarded. |
| Low | The input OSC/DCS reply patterns rejected the 8-bit ST (0x9c) terminator the output sanitizer accepts, so a 0x9c-terminated auto-reply leaked to the PTY and could re-arm the echo loop. |
Added \x9c to the terminator alternation (and excluded it from the DCS body class). |
Re-validated against ~30 real captured terminal logs: residue still fully stripped, in bounded time. 73 unit tests pass; typecheck + lint clean.
Considered and deliberately not changed
- Live stream runs the flattened-residue heuristic. Applying
stripFlattenedModeReplyResiduetoliveTextis a heuristic over arbitrary program output; the fixes above shrink its false-positive surface to genuinely reply-shaped text. The deeper "strip only at the precise escape layer, since the input-side filter already breaks the loop" argument is valid, but removing the live strip would reintroduce the transient garbage this PR set out to remove. Kept it — happy to revisit if reviewers prefer the narrower approach. - Bracketed-paste corruption. The input filter has no
ESC[200~/ESC[201~awareness, so pasted text containing escape sequences can be edited mid-paste. Real but needs a paste-mode state machine — out of scope here. - Two-
c-token run bridging ("1;2c 3;4c"→ stripped): tightening risks under-stripping real DA-reply runs, and that literal shape in plain text is rare. Left as-is. - Pre-existing efficiency (O(n²) history concat, dual-buffer build for identical views, per-chunk closures): not introduced by this diff; left alone.
No CLAUDE.md/AGENTS.md convention violations found.
207f31e to
200b323
Compare
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes using high effort and found 3 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit 200b323. Configure here.
200b323 to
2fd9f6e
Compare
Addressed the latest review round (
|
Returning to a terminal — and live use — leaked garbage like "69;0$y2026;2$y", "1;2c", "11;rgb:…" onto the screen (web and TUI both render the server's sanitized stream), and a prompt that re-queries on redraw could amplify it into a runaway flood. Reported in pingdotgg#1238. Fixed from both directions: - Input: the browser emulator auto-answers the program's capability queries (DECRPM/DA/DSR/OSC-colour) and emits focus events, sending them as PTY input; at an idle prompt the shell echoes them and the loop runs away. Strip that whole terminal→host response class from client input at the source (terminal.write) so it never reaches the shell. Cursor-position reports and bare query forms are kept (programs block on those). - Output / scrollback: one single-pass sanitizer (sanitizeTerminalChunkDual) emits both the scrollback view (drops queries AND responses, so a replay can't re-trigger an echo) and the live view (drops only responses, relays queries). Covers CSI (DECRQM "$p"/DECRPM "$y", DA, DSR, CPR, 8-bit C1), OSC 10/11/12 colour, and DCS (DECRQSS/DECRPSS), leaving sixel/DECUDK alone. - History on load: readHistory now sanitizes the persisted log (older builds wrote it raw), and also drops the *flattened* residue a shell echoes once the ESC introducer is gone — runs of DECRPM "$y" / DA "<m>;<v>c" / OSC colour (incl. OSC 4 palette), plus those distinctive tokens when isolated — which the escape-aware strip can't see. Ambiguous lone tokens and ordinary words are preserved. Validated against a real 1.5 GB dataset (970 KB polluted log: $y→0, rgb: 9070→<50, prompt text intact). Tests cover every class for both views, 8-bit C1, split-across-chunks, within-chunk divergence, the input strip, the flattened residue, and load-time sanitize of a raw pingdotgg#1238-residue log. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
103cdbf to
cc465ea
Compare
| function shouldStripOscSequence(content: string, responsesOnly = false): boolean { | ||
| // OSC 10/11/12 colour: `rgb:…` is a response; `?` is a query. | ||
| return responsesOnly ? /^(10|11|12);rgb:/.test(content) : /^(10|11|12);(?:\?|rgb:)/.test(content); | ||
| } |
There was a problem hiding this comment.
🟡 Medium terminal/Manager.ts:914
shouldStripOscSequence only matches OSC 10/11/12, so OSC 4;<idx>;? palette queries leak into history scrollback. The code elsewhere strips OSC 4;<idx>;rgb:… replies from client input, so replaying saved history containing an OSC 4 query causes the emulator to re-answer and reintroduce garbled prompt output.
- // OSC 10/11/12 colour: `rgb:…` is a response; `?` is a query.
- return responsesOnly ? /^(10|11|12);rgb:/.test(content) : /^(10|11|12);(?:\?|rgb:)/.test(content);
+ // OSC 4/10/11/12 colour: `rgb:…` is a response; `?` is a query.
+ return responsesOnly
+ ? /^(4;\d+|10|11|12);rgb:/.test(content)
+ : /^(4;\d+|10|11|12);(?:\?|rgb:)/.test(content);🤖 Copy this AI Prompt to have your agent fix this:
In file @apps/server/src/terminal/Manager.ts around lines 914-917:
`shouldStripOscSequence` only matches OSC `10`/`11`/`12`, so OSC `4;<idx>;?` palette queries leak into history scrollback. The code elsewhere strips OSC `4;<idx>;rgb:…` replies from client input, so replaying saved history containing an OSC 4 query causes the emulator to re-answer and reintroduce garbled prompt output.
| // lone `<n>;<n>R` is too ambiguous to drop on its own. | ||
| const FLATTENED_CPR = "[0-9]*;[0-9]+R+"; | ||
| const FLATTENED_FRAGMENT = `(?:[0-9]+;[0-9]+\\$y|[0-9]+;[0-9]+c|${FLATTENED_CPR}|${FLATTENED_OSC_COLOUR}|${FLATTENED_DECRPSS})`; | ||
| const FLATTENED_REPLY_RUN = new RegExp( |
There was a problem hiding this comment.
🟡 Medium terminal/Manager.ts:1007
FLATTENED_REPLY_RUN uses [\x07\r n] as a separator, where the literal space allows matching across ordinary text like "see 1;2c 2026;2$y now". This incorrectly deletes the entire span as terminal residue, even though 1;2c alone is ambiguous and the comment says such lone tokens should be preserved. Consider removing the literal space from the separator class.
🤖 Copy this AI Prompt to have your agent fix this:
In file @apps/server/src/terminal/Manager.ts around line 1007:
`FLATTENED_REPLY_RUN` uses `[\x07\r n]` as a separator, where the literal space allows matching across ordinary text like `"see 1;2c 2026;2$y now"`. This incorrectly deletes the entire span as terminal residue, even though `1;2c` alone is ambiguous and the comment says such lone tokens should be preserved. Consider removing the literal space from the separator class.

What Changed
The server's terminal output now filters out the invisible terminal "control codes" that were leaking onto the screen as garbage (e.g.
2026;2$y2027;0$y2031;0$y2048;0$y1$r0m,1;2c,11;rgb:…) when you reopen or restore a terminal.It handles the three families these codes come in — CSI (
…$y/…$p), OSC (colourrgb:/?), and DCS ($r/$q, the1$r0mpiece in #1238) — always stripping the machine answers, stripping the questions from saved scrollback, and still passing the questions through during live use so the terminal can keep negotiating features normally. Both views are produced in a single parse of each output chunk.Why
Programs and the terminal constantly exchange invisible control codes. Some are questions the program asks ("do you support this mode?", "where is the cursor?"); the terminal replies with an answer that is meant to be machine-readable, not shown.
The bug: the server was saving those questions into the terminal's scrollback history. When you reopened or restored a terminal (toggling it with Cmd+J, switching projects — see the repro in #1238), the history was replayed, the questions got asked again, and the answers got printed as visible junk at the shell prompt.
The fix removes the questions from saved history (so a replay can't trigger fresh answers) and removes stray answers from the live stream (normal program output never contains them). The web app and the TUI both render this same server-sanitized stream, so this fixes it in both.
UI Changes
No UI code changed — this is a server-side output filter. The visible symptom and a video are in #1238; the exact gibberish string from that report is now covered by a regression test.
Checklist
Fixes the bug reported in #1238.
🤖 Generated with Claude Code
Note
Medium Risk
Changes core terminal byte handling for all sessions (history replay, live attach, and client→PTY writes); wrong stripping could break vim/tmux or drop legitimate output, though behavior is heavily tested.
Overview
Fixes #1238 by filtering terminal capability traffic so reopening or restoring a session no longer replays junk like
2026;2$y…at the shell prompt.Output path:
sanitizeTerminalChunkDualparses each PTY chunk once and builds two views—scrollback drops both host queries and terminal responses; the live stream drops only responses so the client emulator can still answer queries. Persisted history is sanitized on load (and on legacy migration), rewriting dirty logs from older builds. A flattened-residue pass strips echo fragments when the ESC introducer is already gone.Input path:
stripTerminalResponsesFromInputremoves browser auto-replies (DECRPM, DA, CPR, OSC/DCS colour, etc.) beforewritereaches the PTY, breaking the echo/re-query feedback loop; focus events and real keystrokes are kept.Most of the diff is regression tests for CSI/OSC/DCS, 8-bit C1, chunk boundaries, and ReDoS safety.
Reviewed by Cursor Bugbot for commit cc465ea. Bugbot is set up for automated code reviews on this repo. Configure here.
Note
Strip garbled terminal escape sequences from PTY output, history, and client input
sanitizeTerminalChunkDual) that produces both aliveText(responses stripped, queries kept) andhistoryText(both stripped) in a single pass over each PTY output chunk.stripTerminalResponsesFromInputto filter auto-generated terminal replies (CPR, DA, DSR, OSC color, DECRPSS) out of client writes before they reach the PTY; writes are skipped entirely if nothing remains.stripFlattenedModeReplyResidueto remove reply fragments that appear as plain text at prompts (no ESC introducer), covering patterns like2026;2$yandrgb:…runs.liveTextinstead of raw data; history logs are rewritten on load; client input containing only terminal responses produces no PTY write.Macroscope summarized cc465ea.
Summary by CodeRabbit
Tests
Refactor