Add opt-in rolling window auto-start #1481

Open. Quicksaver wants to merge 22 commits into steipete:main from Quicksaver:feature/reset-rolling-windows

Quicksaver (Contributor) commented Jun 13, 2026

Summary

Adds an opt-in Auto-start rolling window setting for Codex and Claude.

After a previously active five-hour window expires, a refresh that confirms no replacement window is active can send one tiny prompt through the matching provider CLI and then refresh usage again.

Safety boundaries

  • Disabled by default.
  • Supports Codex and Claude only.
  • Requires both the previous and current usage snapshots to come from matching CLI-backed sources.
  • Routes Codex through the selected managed account environment instead of the ambient account.
  • Selects Claude's five-hour window explicitly rather than treating a weekly window as the session window.
  • Sends at most one prompt per expired reset timestamp and account route, while allowing distinct managed accounts to proceed independently.
  • Captures the selected Codex account once so dedupe identity and CLI environment cannot diverge during the async ping.
  • Clears the matching account attempt marker after failure.
  • Uses Claude print mode with --no-session-persistence.
  • Does not support OpenCode because its displayed usage source cannot be proven to match the CLI credentials used for a prompt.
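
For readers skimming the dedupe and retry boundaries above, here is a minimal sketch of how "at most one prompt per expired reset timestamp and account route" could be modeled. `AttemptKey` and `AttemptLedger` are hypothetical names made up for illustration; the PR's actual scheduler state is not shown in this thread and may be structured differently.

```swift
import Foundation

// Hypothetical key: one attempt per provider, account route, and expired reset timestamp.
struct AttemptKey: Hashable {
    let provider: String
    let accountRoute: String
    let expiredReset: Date
}

// Hypothetical bookkeeping for attempt markers; not the PR's real type.
struct AttemptLedger {
    private var attempted: Set<AttemptKey> = []

    init() {}

    // Returns true only the first time a key is claimed,
    // so a repeated refresh cannot send a second prompt for the same key.
    mutating func claim(_ key: AttemptKey) -> Bool {
        attempted.insert(key).inserted
    }

    // Clears the marker after a failed ping so a later refresh may retry.
    mutating func release(_ key: AttemptKey) {
        attempted.remove(key)
    }
}
```

Because the account route is part of the key, two managed accounts whose windows expire at the same timestamp claim independently, and releasing one account's key after a failure leaves the other account's marker in place.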

Commands

  • Codex: codex exec --skip-git-repo-check -m gpt-5.4-mini -c model_reasoning_effort=low
  • Claude: claude -p --no-session-persistence --model haiku

The prompt, timeout, binary, model, and reasoning level remain locally overridable through the documented CODEXBAR_ROLLING_WINDOW_<PROVIDER>_* environment keys.
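
As a rough illustration of the default commands listed above, the argument vectors could be assembled like this. `PingProvider` and `defaultPingArguments` are hypothetical names, and passing the prompt as the final positional argument is an assumption, not something this description states. The environment overrides are deliberately not modeled because their exact key names are not spelled out here.

```swift
// Illustrative only: argument vectors for the two default ping commands.
enum PingProvider {
    case codex
    case claude
}

// Assumption: the prompt is appended as the last positional argument.
func defaultPingArguments(for provider: PingProvider, prompt: String) -> [String] {
    switch provider {
    case .codex:
        return ["codex", "exec", "--skip-git-repo-check",
                "-m", "gpt-5.4-mini",
                "-c", "model_reasoning_effort=low",
                prompt]
    case .claude:
        return ["claude", "-p", "--no-session-persistence",
                "--model", "haiku",
                prompt]
    }
}
```

Keeping the arguments as an array rather than a single shell string avoids quoting problems when the prompt contains spaces.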

Validation

  • swift test --filter RollingWindowAutoStartTests — 16 tests passed
  • make check — passed, 0 lint violations
  • claude --help — confirms --no-session-persistence is supported only with print mode
  • codex exec --help — confirms --skip-git-repo-check
  • Local delta autoreview — clean, no accepted/actionable findings, correctness confidence 0.86
  • Full branch autoreview against origin/main — clean, no accepted/actionable findings, correctness confidence 0.78
  • Current head — 5bd575468051b8905af48958bea172295e2b807d
  • Full CI — must be green on the final rebased head

Merge status

Prepared, not approved for merge. This intentionally consumes provider quota in the background and still needs maintainer product/auth sign-off plus redacted real-provider behavior proof from the current head.

clawsweeper (Bot) commented Jun 13, 2026

Codex review: needs real behavior proof before merge. Reviewed June 14, 2026, 12:06 PM ET / 16:06 UTC.

Summary
The PR adds an opt-in Codex and Claude rolling-window auto-start setting that can send a tiny provider CLI prompt after an expired five-hour window is detected.

Reproducibility: not applicable. This is a new opt-in feature rather than a broken current behavior. I did not run live provider validation because AGENTS.md cautions against unrequested live provider or Keychain-prompting checks.

Review metrics: 3 noteworthy metrics.

  • PR Surface: 15 files, +2320/-59. The change spans settings UI, persisted config, refresh scheduling, subprocess execution, logging, and tests.
  • Authenticated Prompt Paths: 2 added. Both Codex and Claude gain background CLI prompt paths that can consume provider quota when enabled.
  • Persisted Config Fields: 1 added. The new per-provider flag is disabled by default but still changes the stored provider configuration model.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🧂 unranked krab
Patch quality: 🐚 platinum hermit
Result: blocked until real behavior proof is added.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P1] Add redacted live Codex or Claude proof from the current head showing the intended account and a new rolling window.
  • Resolve the current-main conflict and refresh validation on the final head.
  • Get explicit maintainer product/auth sign-off for opt-in background quota-consuming prompts.

Proof guidance:

  • [P1] Needs real behavior proof before merge: The PR body and comments list tests/help checks but no redacted live Codex or Claude auto-start proof from the current head; contributor should add redacted logs, terminal output, screenshot/video, or copied live output and update the PR body for re-review.

Risk before merge

  • [P1] GitHub reports the PR as conflicting/dirty against main, so final review needs the actual resolved merge result.
  • [P1] No redacted real Codex or Claude provider proof from the current head is posted; tests and CLI help checks do not prove the live credential, quota, account-routing, and rolling-window path.
  • [P1] The feature intentionally adds opt-in background authenticated provider CLI prompts that can consume quota or start sessions, so maintainers still need to approve the product/auth tradeoff before release.

Maintainer options:

  1. Hold For Proof And Auth Sign-Off (recommended)
    Require a clean current-main rebase, redacted real-provider proof from the final head, and explicit maintainer product/auth approval before merge.
  2. Accept The Opt-In Quota Tradeoff
    After proof, maintainers may intentionally accept the disabled-by-default background prompt behavior with the current account/source gates.
  3. Pause Background Prompting
    If core should not initiate authenticated provider prompts in the background, pause or close this PR instead of shipping the setting.

Next step before merge

  • [P1] Human handling is needed for conflict resolution, real-provider proof, and maintainer product/auth sign-off; no narrow automated repair is identified yet.

Security
Cleared: No concrete supply-chain or security-boundary defect was found; the intentional authenticated CLI prompt behavior is tracked as an auth-provider merge risk.

Review details

Best possible solution:

Land only after conflict resolution, redacted live proof for the intended provider account path, green final-head validation, and explicit maintainer approval for opt-in background quota-consuming prompts.

Do we have a high-confidence way to reproduce the issue?

Not applicable: this is a new opt-in feature rather than a broken current behavior. I did not run live provider validation because AGENTS.md cautions against unrequested live provider or Keychain-prompting checks.

Is this the best way to solve the issue?

Unclear until maintainer product/auth approval and live proof are available; the patch uses existing provider refresh and account-routing seams with focused tests, but it intentionally starts provider windows in the background.

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against 06770308831e.

Label changes

Label justifications:

  • P2: This is a normal-priority opt-in feature with limited default blast radius but meaningful provider-auth and quota implications.
  • merge-risk: 🚨 auth-provider: Merging adds background authenticated Codex/Claude CLI prompts whose credential routing and quota effects require proof and maintainer approval.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🧂 unranked krab and patch quality is 🐚 platinum hermit.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask. Needs real behavior proof before merge: The PR body and comments list tests/help checks but no redacted live Codex or Claude auto-start proof from the current head; contributor should add redacted logs, terminal output, screenshot/video, or copied live output and update the PR body for re-review.

Evidence reviewed

What I checked:

  • AGENTS.md policy read: Repository policy was read fully and applied, especially the guidance to avoid unrequested live provider/Keychain validation and to keep provider data/auth routing siloed. (AGENTS.md:1, 06770308831e)
  • Current main lacks this feature: Focused source search found no RollingWindowAutoStart, rollingWindowAutoStart, Auto-start rolling window, or CODEXBAR_ROLLING_WINDOW symbols on current main. (06770308831e)
  • PR merge and proof state: GitHub reports head 677b5dd as mergeable CONFLICTING / mergeStateStatus DIRTY, and the PR body says it still needs maintainer product/auth sign-off plus redacted real-provider proof. (677b5dd22420)
  • Provider gating and command implementation: The branch limits support to Codex and Claude, gates source labels, selects Claude's five-hour window, and builds the Codex/Claude CLI ping commands with local environment overrides. (Sources/CodexBar/RollingWindowAutoStart.swift:8, 677b5dd22420)
  • Account-scoped scheduler path: The scheduler resolves the active Codex route, keys in-flight/attempted state by route, builds the ping environment through ProviderRegistry.makeEnvironment, and forces a provider refresh after a successful ping. (Sources/CodexBar/UsageStore+RollingWindowAutoStart.swift:31, 677b5dd22420)
  • Focused regression coverage: Added tests cover default disabled persistence, managed Codex account environment routing, per-account dedupe, selected token-account skips, uncoalesced post-ping refresh, and failure retry bookkeeping. (Tests/CodexBarTests/RollingWindowAutoStartTests.swift:617, 677b5dd22420)

Likely related people:

  • steipete: Local blame points the current provider refresh/config surface to the current base commit, and the PR branch includes steipete fixes for source gating, Claude session artifacts, and account-scoped Codex routing. (role: recent area contributor and PR follow-up owner; confidence: high; commits: 9d3c354df9ba, 097989a4acf7, 61e66dc3897c; files: Sources/CodexBar/UsageStore+Refresh.swift, Sources/CodexBar/UsageStore+TokenAccounts.swift, Sources/CodexBar/UsageStore+RollingWindowAutoStart.swift)
  • LeoLin990405: Recent Codex WHAM/Spark and Claude OAuth work touches the same provider quota/auth surfaces this feature extends. (role: recent adjacent Codex/Claude provider contributor; confidence: medium; commits: afe4e802239f, 0a0e8cabc7ed; files: Sources/CodexBarCore/Providers/Codex, Sources/CodexBarCore/Providers/Claude)
  • Yuxin Qiao: Recent Claude CLI subscription usage handling is adjacent to the Claude print-mode prompt path and session-window behavior this PR adds. (role: historical Claude CLI/session contributor; confidence: medium; commits: 126dc9238715; files: Sources/CodexBarCore/Providers/Claude, Tests/CodexBarTests/TTYIntegrationTests.swift)
  • soumikbhatta: Recent Codex web credits and OAuth source-routing work is adjacent to the Codex account/source checks needed by this PR. (role: recent adjacent Codex auth/provider contributor; confidence: medium; commits: 96745231187f, 2c8d83cb8128; files: Sources/CodexBarCore/Providers/Codex, Sources/CodexBar/UsageStore+OpenAIWeb.swift)

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b3b30e47b7

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread Sources/CodexBar/UsageStore+RollingWindowAutoStart.swift Outdated
clawsweeper (Bot) added labels on Jun 13, 2026: rating: 🧂 unranked krab; status: 📣 needs proof; P2; merge-risk: 🚨 auth-provider.
Quicksaver and others added 10 commits June 13, 2026 15:41
- Guard stale refresh generations before scheduling pings
- Add scheduler isolation, retry, and focused tests
- Replace the default ping text with a clearer stop instruction
- Keep the Codex rolling-window test aligned with the new prompt
- Build the ping environment with ProviderRegistry.makeEnvironment
- Assert managed Codex pings inherit the selected home path
steipete force-pushed the feature/reset-rolling-windows branch from 8437fd7 to 61e66dc on June 13, 2026 14:46
steipete changed the title from "Auto-start rolling window" to "Add opt-in rolling window auto-start" on Jun 13, 2026

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 61e66dc389

Comment thread Sources/CodexBar/UsageStore+RollingWindowAutoStart.swift Outdated
steipete (Owner) commented

Current head 5bd575468051b8905af48958bea172295e2b807d addresses both review blockers:

  • Codex pings use the selected managed-account environment.
  • Attempt and in-flight state are account-route scoped, including concurrent managed-account transitions with identical reset timestamps.
  • The selected Codex source is captured once for both route identity and CLI environment.

Proof:

  • swift test --filter RollingWindowAutoStartTests: 16 tests passed.
  • make check: passed with 0 format/lint violations.
  • Local delta autoreview: clean, confidence 0.86.
  • Full branch autoreview against origin/main: clean, confidence 0.78.
  • Exact-head CI run: https://github.com/steipete/CodexBar/actions/runs/27470939307 (running).

This remains prepared but not approved for merge. It intentionally consumes provider quota and still needs maintainer product/auth sign-off plus redacted real-provider proof from the current head.

@clawsweeper re-review

clawsweeper (Bot) commented Jun 13, 2026

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5bd5754680

environment: environment,
runner: runner)
self.rollingWindowAutoStartStatus[provider] = "Ping prompt sent."
await self.refreshProvider(provider, coalesceIfRefreshing: true)

[P2] Run a real refresh after the ping

When this task is launched from the refresh that detected the expired window, that same refresh generation can still be present in providerRefreshTasks until its async cleanup runs. If the ping command returns quickly (for example with an overridden binary/prompt or a cached local CLI response), coalesceIfRefreshing: true only waits on that already-scheduled generation and then returns, so no post-ping fetch happens and the menu can keep showing the expired/no-window state until the next timer. Start an uncoalesced refresh after the ping, or wait for the scheduling refresh to be removed before coalescing.

Quicksaver (Contributor, Author) replied:

Valid: the post-ping refresh was coalescing with a still-registered triggering refresh generation. I changed it to call refreshProvider(provider) uncoalesced so a successful ping always schedules a fresh provider fetch.

Added a test with a lingering completed refresh state in providerRefreshTasks; it verifies the scheduler still performs a new refresh after the ping.
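
To make the race concrete, here is a deliberately simplified, synchronous model of the coalescing behavior discussed in this thread. `RefreshModel` and its fields are hypothetical; the real store is asynchronous and tracks refresh generations in providerRefreshTasks, which this sketch only approximates with a set of provider names.

```swift
// Simplified model of the coalescing race; not the PR's real store.
struct RefreshModel {
    // Providers whose refresh task is still registered
    // (possibly finished, but not yet cleaned up).
    var registered: Set<String> = []
    // Number of fetches actually started per provider.
    var fetchCount: [String: Int] = [:]

    // Returns true when a new fetch is started,
    // false when the call only joins the already-registered task.
    @discardableResult
    mutating func refresh(_ provider: String, coalesceIfRefreshing: Bool) -> Bool {
        if coalesceIfRefreshing && registered.contains(provider) {
            return false
        }
        registered.insert(provider)
        fetchCount[provider, default: 0] += 1
        return true
    }
}
```

In this model, a lingering registration makes a coalesced post-ping refresh a no-op, while an uncoalesced call always starts a new fetch, which is the behavior the fix relies on.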

- Start a new refresh after a successful ping
- Cover the lingering refresh task race in tests
- Drop ephemeral mode from Codex rolling pings
- Assert the Codex ping command keeps session output
- Log each auto-start attempt with route metadata
- Log provider responses and verification outcome
- Include stable timestamps for inactive-window tracing
- Redact managed account route labels in auto-start logs

- Cover route labels and timestamp metadata formatting
- Allow openai-web and claude web or oauth routing checks
- Block auto-start when the prompt account cannot be verified
- Add coverage for scheduler routing and exhausted quota windows
- Schedule dashboard-owned Codex rolling window pings

- Ignore Claude model-specific exhaustion for Haiku pings
- Add precise skip messages for routing block reasons
- Document session-window handling and fail closed for unsupported routes
- Cover OpenAI web dashboard overwrite behavior
- Treat text-only rolling windows as active when appropriate
- Track one-shot auto-start attempts without cached resets
- Add coverage for dashboard and scheduler verification paths
- Treat expired reset timestamps as inactive

- Restrict no-history starts to OpenAI web snapshots
- Track reset source explicitly in auto-start logs

- Clear stale route retry state on source changes and verified starts
- Keep stale refreshed resets from clearing retry state

- Cover expired refreshed reset verification

Labels

  • merge-risk: 🚨 auth-provider: 🚨 Merging this PR could break OAuth, tokens, provider routing, model choice, or credentials.
  • P2: Normal priority bug or improvement with limited blast radius.
  • rating: 🧂 unranked krab: Not merge-ready due to missing proof or serious correctness/safety concerns.
  • status: 📣 needs proof: The PR needs real behavior proof before ClawSweeper can clear the contributor ask.

2 participants