test(client): fix three flaky tests around socket/connect lifecycle#3299
Open
nkaradzhov wants to merge 3 commits into
Open
test(client): fix three flaky tests around socket/connect lifecycle#3299nkaradzhov wants to merge 3 commits into
nkaradzhov wants to merge 3 commits into
Conversation
Attach a permanent error listener before connecting so the timeout event cannot surface as an uncaught error during the handshake or in the idle gap before the test body attached its listener. Drive the test off the first emitted error via a one-shot promise gate instead of a fixed-duration `setTimeout`, and use `reconnectStrategy: false` + `disableClientSetup` to make the run deterministic. Bump the timeout from 50ms to 200ms to give the localhost handshake slack on slow CI.
The test pairs `connectTimeout: 1` with a 1ms event-loop block to force the socket's idle timer to fire. On a fast host the kernel finishes the TCP handshake almost instantly and the two 1ms durations race: the block can exit just as (or before) the timer expires, so the `connect` event is processed before the timeout callback and the connection succeeds. The test then fails with `Missing expected rejection`. Block for 50ms instead so the timer is guaranteed to have expired — and its callback to be queued ahead of the `connect` event — by the time the event loop runs again.
The test forced the subscriber connection down with `CLIENT KILL` and used `once(subscriber, 'error')` to wait for the resulting error. A single forced reconnect can emit `error` more than once — the initial SocketClosedUnexpectedlyError, plus any transient connect error while the server tears the old connection down — and `once` removes its listener after the first emit, so the next one surfaced as "Uncaught Error: Socket closed unexpectedly". Swap in a permanent error listener and use a one-shot promise gate to await the first error deterministically. Subsequent errors during the reconnect cycle are caught (and ignored) by the permanent listener.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Three flaky tests in
packages/clientconsolidated into one branch, one commit each:Socket > socketTimeout > should timeout with positive socketTimeout values— listener attachment race let the socket-timeout error surface as uncaught on slow CI.Client > ConnectionTimeoutError—connectTimeout: 1raced a 1ms event-loop block; on fast hardware theconnectevent landed before the timeout callback and the connection succeeded.Client > PubSub > should resubscribe—once(subscriber, 'error')removed itself after the first emit, leaving the second (transient) error during the forced reconnect uncaught.All three are wired up via the same root-cause shape: relying on a one-shot
oncelistener for an event source that can fire more than once. The fix in each case is a permanent listener plus a one-shot promise gate to keep the test deterministic.The
ConnectionTimeoutErrorcase is purely a timing margin bump.Test plan
should timeout with positive socketTimeout values× 10 runs locally — all passConnectionTimeoutError× 5 runs locally — all passshould resubscribe× 10 runs locally — all pass🤖 Generated with Claude Code
Note
Low Risk
Only test files and timing/listener wiring change; no runtime client behavior is modified.
Overview
Stabilizes three flaky client tests around socket and reconnect timing; no production code changes.
PubSub
should resubscribereplacesonce(subscriber, 'error')with a permanenterrorlistener plus a one-shot promise so extra errors during forced reconnect do not become uncaught, while still waiting on the first error.ConnectionTimeoutErrorlengthens the deliberate event-loop block from ~1ms to ~50ms soconnectTimeout: 1reliably wins over a fast TCP connect on quick hosts.socketTimeoutpositive timeout raises the timeout to 200ms, usesdisableClientSetup, connects explicitly withreconnectStrategy: false, and attaches a permanenterrorlistener before connect so timeout/handshake errors stay handled and assertions run deterministically.Reviewed by Cursor Bugbot for commit 09c5416. Bugbot is set up for automated code reviews on this repo. Configure here.