Add token-protected dashboard snapshots to the CLI server #1026

Open
jethac wants to merge 4 commits into steipete:main from jethac:feat/dashboard-server

@jethac jethac commented May 18, 2026

Summary

  • Add a typed dashboard snapshot payload for CodexBar provider, session, account, and app state.
  • Extend codexbar serve with a token-protected dashboard snapshot endpoint.
  • Add configurable dashboard identity exposure modes:
    • none: omit account identity fields
    • redacted: redact sensitive identity values while preserving useful account context such as domains/plans
    • full: include full identity values for trusted local/private deployments
  • Add opt-in --dashboard-pairing for keyboard-light clients: the server prints a one-time 6-digit code that is entered on the client to claim a generated bearer token.
  • Document the dashboard API, authentication behavior, CLI flags, identity modes, pairing flow, and example requests.
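
The three identity exposure modes listed above can be pictured with a small Swift sketch. The names `IdentityExposure` and `exposedEmail` and the `[redacted]` placeholder are illustrative assumptions, not the PR's actual payload code. The only behavior taken from the summary is that `none` omits the value, `redacted` hides the sensitive part while keeping context such as the domain, and `full` passes the value through.

```swift
/// Hypothetical mirror of the three documented modes (not the PR's real type).
enum IdentityExposure: String {
    case omit = "none"   // omit account identity fields
    case redacted        // hide sensitive values, keep context such as the domain
    case full            // include full identity values
}

/// Hypothetical helper: returns the value a snapshot might carry for an email.
/// The "[redacted]" marker is an assumption; the real redaction format is not
/// stated in the PR text.
func exposedEmail(_ email: String, mode: IdentityExposure) -> String? {
    switch mode {
    case .omit:
        return nil
    case .full:
        return email
    case .redacted:
        // Keep everything from the last "@" onward so the domain survives.
        guard let at = email.lastIndex(of: "@") else { return "[redacted]" }
        return "[redacted]" + String(email[at...])
    }
}
```

With these assumptions, a client in `redacted` mode would still be able to group accounts by domain without ever seeing the local part of the address.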

Rationale

  • This lets CodexBar act as the local source of truth for lightweight dashboard hardware.
  • Low-powered clients can fetch a compact, already-normalized snapshot instead of reimplementing provider probing, account parsing, session tracking, privacy policy, and auth logic themselves.
  • Keeping the aggregation in CodexBar also avoids duplicating sensitive provider/account handling across small devices or local display projects.

Pairing security model (revised per review)

  • The previous three-choice pairing flow was correctly flagged as guessable (review: ~87% success probability within the attempt budget). It has been replaced.
  • The pairing code now never travels over the network. Discovery advertises only auth.type: "code" and codeLength; the 6-digit code is printed exclusively on the server console.
  • Online guessing budget: 5 failed claims lock pairing until the process restarts, so worst-case guess probability is 5 in 1,000,000.
  • A successful claim closes pairing entirely: discovery returns 404, further claims return 404, and only the claimed token authorizes data routes.
  • Pairing tokens never authorize data routes before they are claimed.
  • Pairing remains opt-in via --dashboard-pairing; token-only LAN serving via --dashboard-token is unchanged and remains the default requirement for non-loopback hosts.
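
The lifecycle in the bullets above (failed-claim lockout, closure after a successful claim, no authorization before a claim) can be sketched as a small state holder. This is a minimal illustration under stated assumptions: `PairingSession`, `ClaimResult`, and `worstCaseGuessProbability` are hypothetical names, the token is a stand-in generated with `UUID`, and the response for a locked-out claim is not specified in the PR text, so it is folded into a single `unavailable` case here.

```swift
import Foundation

/// Hypothetical result type for a claim attempt (not the PR's real type).
enum ClaimResult: Equatable {
    case token(String)   // correct code: a bearer token is issued and pairing closes
    case invalidCode     // wrong code: counts toward the failure budget
    case unavailable     // pairing is locked out or already claimed
}

/// Hypothetical model of one pairing session.
struct PairingSession {
    let code: String
    let maxFailures: Int
    private(set) var failures = 0
    private(set) var claimedToken: String?

    init(code: String, maxFailures: Int = 5) {
        self.code = code
        self.maxFailures = maxFailures
    }

    /// Pairing stays open only while unclaimed and under the failure budget.
    var isOpen: Bool { claimedToken == nil && failures < maxFailures }

    mutating func claim(code submitted: String) -> ClaimResult {
        guard isOpen else { return .unavailable }
        guard submitted == code else {
            failures += 1
            return .invalidCode
        }
        let token = UUID().uuidString  // placeholder token for illustration only
        claimedToken = token
        return .token(token)
    }

    /// A token authorizes data routes only after it has been claimed.
    func authorizes(bearer: String) -> Bool {
        claimedToken != nil && claimedToken == bearer
    }
}

/// Chance that at least one of `attempts` distinct guesses hits a uniformly
/// random numeric code of `codeLength` digits.
func worstCaseGuessProbability(attempts: Int, codeLength: Int) -> Double {
    Double(attempts) / pow(10.0, Double(codeLength))
}
```

For the numbers quoted above, `worstCaseGuessProbability(attempts: 5, codeLength: 6)` evaluates to 5 divided by 1,000,000, which matches the stated online guessing budget.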

Implementation notes

  • Adds dashboard payload models and snapshot aggregation logic.
  • Adds bearer-token authentication for dashboard API requests.
  • Keeps protected dashboard responses unavailable without a valid token.
  • Covers payload construction, identity redaction, routing, auth handling, pairing lifecycle (claim, lockout, closure, pre-claim denial), and endpoint behavior with tests.
  • Uses inert placeholder tokens in docs/examples instead of real or secret-looking credentials.
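
As a rough illustration of the bearer-token check mentioned in these notes, the following sketch parses an `Authorization` header and compares the presented token with the expected one. The function names are hypothetical, and the constant-time comparison is a common hardening choice assumed here, not something the PR text confirms.

```swift
/// Hypothetical helper: extracts the token from an "Authorization: Bearer <token>" value.
func bearerToken(fromAuthorizationHeader header: String?) -> String? {
    guard let header = header else { return nil }
    let prefix = "Bearer "
    guard header.hasPrefix(prefix) else { return nil }
    let token = String(header.dropFirst(prefix.count))
    return token.isEmpty ? nil : token
}

/// Compares two strings byte by byte without exiting early on the first mismatch.
/// (Assumed hardening step; lengths are still compared up front.)
func constantTimeEquals(_ a: String, _ b: String) -> Bool {
    let lhs = Array(a.utf8)
    let rhs = Array(b.utf8)
    guard lhs.count == rhs.count else { return false }
    var diff: UInt8 = 0
    for i in 0..<lhs.count {
        diff |= lhs[i] ^ rhs[i]
    }
    return diff == 0
}

/// Hypothetical gate for a protected route: no header or a wrong token is denied.
func isAuthorized(header: String?, expectedToken: String) -> Bool {
    guard let presented = bearerToken(fromAuthorizationHeader: header) else { return false }
    return constantTimeEquals(presented, expectedToken)
}
```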

Live behavior proof (redacted)

  • Built CodexBarCLI from this branch on Ubuntu 24.04 (Swift 6.2.1, same toolchain as CI) and exercised the full pairing lifecycle against the running server.

Server startup:

$ CodexBarCLI serve --host 0.0.0.0 --port 8085 --dashboard-pairing
CodexBar server listening on http://0.0.0.0:8085
Dashboard pairing code: 5•••••

Discovery does not expose the code:

$ curl http://127.0.0.1:8085/dashboard/v1/pairing
{"schemaVersion":1,"service":"codexbar-dashboard","auth":{"expiresInSeconds":0,"type":"code","codeLength":6,"pairingId":"CF23BC8C-…"}}

Wrong code is rejected; unauthorized snapshot is denied:

$ curl -w "\nHTTP %{http_code}\n" ".../pairing/claim?pairingId=CF23BC8C-…&code=000000"
{"error":"invalid pairing code"}
HTTP 401

$ curl -w "\nHTTP %{http_code}\n" http://127.0.0.1:8085/dashboard/v1/snapshot
{"error":"unauthorized"}
HTTP 401

Correct code claims a token, pairing closes, token authorizes:

$ curl ".../pairing/claim?pairingId=CF23BC8C-…&code=5•••••"
{"schemaVersion":1,"endpoint":"/dashboard/v1/snapshot","token":"80f93c5d…[redacted]"}

$ curl -w "\nHTTP %{http_code}\n" http://127.0.0.1:8085/dashboard/v1/pairing
{"error":"pairing unavailable"}
HTTP 404

$ curl -H "Authorization: Bearer 80f93c5d…[redacted]" http://127.0.0.1:8085/dashboard/v1/snapshot
{"schemaVersion":1,"providers":[{"updatedAt":"2026-06-10T01:02:33Z","name":"Codex","id":"codex", … }]}   (HTTP 200)

Server log across the session:

CodexBar server listening on http://0.0.0.0:8085
Dashboard pairing code: 5•••••
Pairing attempt rejected (1/5).
Dashboard paired. Pairing is now closed.
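
For readers building a client, the requests in the transcripts above can be sketched in Swift with Foundation. The helper names `claimURL` and `snapshotRequest` are hypothetical. The route paths, the `pairingId` and `code` query parameters, and the `Authorization: Bearer` header are taken from the transcripts and review on this page. The sketch only constructs the requests and does not send them.

```swift
import Foundation
#if canImport(FoundationNetworking)
import FoundationNetworking  // URLRequest lives here on Linux
#endif

/// Hypothetical helper: builds the GET claim URL shown in the transcript.
/// Note that the code is carried in the query string.
func claimURL(base: URL, pairingId: String, code: String) -> URL? {
    var components = URLComponents(url: base, resolvingAgainstBaseURL: false)
    components?.path = "/dashboard/v1/pairing/claim"
    components?.queryItems = [
        URLQueryItem(name: "pairingId", value: pairingId),
        URLQueryItem(name: "code", value: code),
    ]
    return components?.url
}

/// Hypothetical helper: builds an authorized snapshot request for a claimed token.
func snapshotRequest(base: URL, token: String) -> URLRequest? {
    var components = URLComponents(url: base, resolvingAgainstBaseURL: false)
    components?.path = "/dashboard/v1/snapshot"
    guard let url = components?.url else { return nil }
    var request = URLRequest(url: url)
    request.setValue("Bearer \(token)", forHTTPHeaderField: "Authorization")
    return request
}
```

A client would call `claimURL` once with the code read from the server console, store the returned token, and then reuse `snapshotRequest` for each poll.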

Commands run

  • swift build --product CodexBarCLI (Linux, Swift 6.2.1): builds clean.
  • swift test --parallel (Linux, via CI build-linux-cli jobs on both x64 and arm64): passed, including the new pairing lifecycle tests.
  • macOS lint-build-test CI job: SwiftFormat/SwiftLint clean; build and tests pass on the latest head.
  • git diff --check upstream/main...HEAD: passed.

Screenshots/GIFs

  • Not applicable; this is CLI/API behavior with no UI changes. Live request/response transcripts are included above.

Reference

  • No linked issue.

Contribution notes

  • No CONTRIBUTING.md or pull request template was found.
  • Followed repository guidance from AGENTS.md.
  • Branch is rebased onto upstream main at a4f278d9.
  • Published branch contains four scoped commits:
    • docs: add dashboard snapshot API reference
    • feat(cli): add dashboard snapshot payload builder
    • feat(cli): serve token-protected dashboard snapshots
    • feat(dashboard): add code-based pairing

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 08a598e122

Comment on lines +38 to +41
    staleAfterSeconds: max(180, Int(refreshInterval.rounded(.up)) * 3),
    host: DashboardHostPayload(
        codexBarVersion: codexBarVersion,
        refreshIntervalSeconds: max(0, Int(refreshInterval.rounded()))),

[P2] Reject huge refresh intervals before Int conversion

When a user passes a very large but finite --refresh-interval (for example 1e300), decodeServeRefreshInterval accepts it, but the first dashboard snapshot converts it with Int(refreshInterval.rounded(...)); Swift traps on out-of-range Double to Int conversion, killing the server instead of reporting an argument error. Clamp or reject values above the representable range before these conversions.
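
One way to avoid the trap described here is to validate the value before converting it. The sketch below is illustrative only: `safeWholeSeconds` is a hypothetical helper, and the one-day upper bound is an arbitrary assumption, not a limit taken from the PR.

```swift
/// Hypothetical helper: converts a refresh interval to whole seconds without trapping.
/// Returns nil for NaN, infinities, negative values, and values above `limit`,
/// so the caller can report an argument error instead of crashing.
func safeWholeSeconds(_ interval: Double, limit: Int = 86_400) -> Int? {
    guard interval.isFinite, interval >= 0 else { return nil }
    let rounded = interval.rounded(.up)
    // The bound check happens in Double space, before any Int conversion.
    guard rounded <= Double(limit) else { return nil }
    return Int(rounded)
}
```

Under these assumptions, `safeWholeSeconds(1e300)` returns `nil` rather than reaching `Int(...)`, and `safeWholeSeconds(59.2)` returns `60`.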

@jethac (Author) replied:

Noted and fixed!

@jethac jethac force-pushed the feat/dashboard-server branch 3 times, most recently from e3165ac to babb10f, on May 18, 2026 16:56
clawsweeper Bot commented May 29, 2026

Codex review: found issues before merge. Reviewed June 9, 2026, 11:10 PM ET / 03:10 UTC.

Summary
The PR adds dashboard snapshot payloads, codexbar serve host/auth/pairing flags and routes, documentation, and CLI-focused tests for a token-protected dashboard API.

Reproducibility: yes, for the review finding. Source inspection shows the claim route reads code from the request query string while the docs say the code is never sent over the network. The PR body also demonstrates that pairing is exercised through a claim URL containing the code.

Review metrics: 2 noteworthy metrics.

  • Changed files: 5 added, 6 modified, 0 removed. The PR is a broad feature addition touching server routing, auth, payload models, tests, and docs.
  • Dashboard routes: 3 added. The new snapshot, pairing discovery, and pairing claim routes define a new HTTP API surface that needs security review.

Merge readiness
Overall: 🧂 unranked krab
Proof: 🐚 platinum hermit
Patch quality: 🧂 unranked krab
Result: blocked by patch quality or review findings.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

  • [P2] Fix or restate the pairing-code confidentiality guarantee so it matches the claim route.
  • Update the CLI reference so token and pairing non-loopback modes are documented consistently.
  • Get explicit maintainer acceptance for the trusted-local-HTTP token/pairing boundary.

Risk before merge

  • [P1] The PR exposes provider usage, cost, credits, status, and identity-derived fields over an opt-in HTTP boundary; maintainers need to explicitly accept the no-TLS bearer-token/pairing threat model.
  • [P1] The pairing documentation currently makes a stronger confidentiality claim than the implementation provides because the code is submitted in the GET claim URL.
  • [P1] The CLI docs remain inconsistent about whether pairing can satisfy the non-loopback host requirement.

Maintainer options:

  1. Fix the pairing threat model first
    Update the implementation or documentation so the pairing-code guarantee matches the actual HTTP request path before merge.
  2. Accept trusted-LAN HTTP explicitly
    Maintainers can choose to accept bearer tokens and submitted pairing codes over non-TLS local HTTP if the docs clearly scope this to trusted networks and deployments.
  3. Defer short-code pairing
    Land a token-only dashboard snapshot first and pause pairing until the desired protocol and threat model are settled.

Next step before merge

  • [P2] This needs maintainer security-model review; automation should not choose whether to weaken the guarantee in docs or require a different pairing protocol.

Security
Needs attention: The proposed endpoint adds a security boundary, and the pairing documentation currently promises a confidentiality property the GET claim implementation does not provide.

Review findings

  • [P1] Stop claiming pairing codes stay off the network — docs/dashboard-api.md:19
  • [P3] Document pairing in the CLI reference — docs/cli.md:52-60

Review details

Best possible solution:

Merge only after the pairing threat model is truthful in docs and PR text, or after the implementation changes to satisfy the stronger guarantee, with maintainer acceptance of the new HTTP security boundary.

Do we have a high-confidence way to reproduce the issue?

Yes, for the review finding: source inspection shows the claim route reads code from the request query string while the docs say the code is never sent over the network. The PR body also demonstrates that pairing is exercised through a claim URL containing the code.

Is this the best way to solve the issue?

No. The dashboard endpoint direction is plausible, but the current pairing docs and threat model are not the narrowest safe landing path until they match the actual HTTP claim flow and maintainer security sign-off is explicit.

Full review comments:

  • [P1] Stop claiming pairing codes stay off the network — docs/dashboard-api.md:19
    The docs say the pairing code is never sent over the network, but the claim flow sends it as code= in GET /dashboard/v1/pairing/claim and the server reads it from request.queryItems. For non-TLS LAN serving, that puts the user-entered code in the HTTP request URL, so either implement a flow where this guarantee is true or document the actual trusted-local-HTTP model.
    Confidence: 0.94
  • [P3] Document pairing in the CLI reference — docs/cli.md:52-60
    The CLI docs still say non-loopback hosts require --dashboard-token and that LAN Host headers are enabled only when that token is configured, but the implementation accepts --dashboard-pairing as the alternate non-loopback gate. Please update this reference or point readers to the dashboard API pairing section so users do not think pairing cannot be used for LAN dashboards.
    Confidence: 0.87

Overall correctness: patch is incorrect
Overall confidence: 0.88

AGENTS.md: found and applied where relevant.

Codex review notes: model gpt-5.5, reasoning high; reviewed against a4f278d91fe4.

Label changes

Label changes:

  • add proof: sufficient: Contributor real behavior proof is sufficient. The PR body now provides redacted live terminal output for the server startup, pairing lifecycle, unauthorized denial, authorized snapshot, and server logs.
  • add status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Sufficient (terminal): The PR body now provides redacted live terminal output for the server startup, pairing lifecycle, unauthorized denial, authorized snapshot, and server logs.
  • remove status: 📣 needs proof: Current PR status label is status: ⏳ waiting on author.

Label justifications:

  • P1: The PR adds a LAN-capable HTTP API for sensitive provider/account data and still has a security-model blocker in the documented pairing flow.
  • merge-risk: 🚨 security-boundary: Merging would create a new token/pairing-protected boundary for data that previously stayed inside the local CLI/app process.
  • rating: 🧂 unranked krab: Overall readiness is 🧂 unranked krab; proof is 🐚 platinum hermit and patch quality is 🧂 unranked krab.
  • status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Sufficient (terminal): The PR body now provides redacted live terminal output for the server startup, pairing lifecycle, unauthorized denial, authorized snapshot, and server logs.
  • proof: sufficient: Contributor real behavior proof is sufficient. The PR body now provides redacted live terminal output for the server startup, pairing lifecycle, unauthorized denial, authorized snapshot, and server logs.

Evidence reviewed

Security concerns:

  • [medium] Pairing-code confidentiality claim is false — docs/dashboard-api.md:19
    The documentation says the code is never sent over the network, but the implemented claim route receives it from the HTTP query string; on the documented non-TLS local HTTP server this is a visible request credential.
    Confidence: 0.94

What I checked:

  • Repository policy applied: AGENTS.md was read fully; its CLI-focused testing guidance, provider data-siloing guidance, and auth/provider sensitivity notes apply to this review. (AGENTS.md:1, a4f278d91fe4)
  • Vision requires sign-off: VISION.md classifies new features and behavior affecting privacy/auth/data handling as sign-off work, which applies to the new token/pairing HTTP surface. (VISION.md:8, a4f278d91fe4)
  • Claim route accepts the code from the URL query: The proposed pairing claim handler passes request.queryItems["code"] into pairing.claim, so the entered pairing code is part of the HTTP request URL. (Sources/CodexBarCLI/CLIServeCommand.swift:467, 6b3863a26f38)
  • Docs overstate pairing confidentiality: The dashboard API docs say the pairing code is never sent over the network, but the same page documents a GET claim URL containing code=481273. (docs/dashboard-api.md:19, 6b3863a26f38)
  • CLI docs still describe token-only LAN serving: docs/cli.md says non-loopback hosts require --dashboard-token and Host headers are enabled only when that token is configured, while the implementation and help allow --dashboard-pairing as the alternate LAN gate. (docs/cli.md:52, 6b3863a26f38)
  • Real behavior proof updated: The PR body now includes redacted live terminal transcripts for server startup, discovery, wrong-code rejection, unauthorized snapshot denial, successful claim, closed pairing discovery, authorized snapshot, and server logs. (6b3863a26f38)

Likely related people:

  • steipete: Peter Steinberger has the dominant recent path history for CLI serve/docs files and touched host validation, formatting, release normalization, and current main release commits in this area. (role: recent area contributor; confidence: high; commits: 920997c6a365, d8b7619de439, e5eb944d8e88; files: Sources/CodexBarCLI/CLIServeCommand.swift, Sources/CodexBarCLI/CLILocalHTTPServer.swift, docs/cli.md)
  • Thiago Costa: The path history shows Thiago Costa introduced the localhost serve command that this PR extends. (role: introduced behavior; confidence: medium; commits: 74a019c4aa65; files: Sources/CodexBarCLI/CLIServeCommand.swift, Sources/CodexBarCLI/CLILocalHTTPServer.swift)
  • Ellis Nieuwpoort: Ellis Nieuwpoort recently changed the serve request timeout/cache behavior that the dashboard snapshot code builds alongside. (role: recent adjacent contributor; confidence: medium; commits: 06b7de126f1a; files: Sources/CodexBarCLI/CLIServeCommand.swift, Sources/CodexBarCLI/CLILocalHTTPServer.swift)

What the crustacean ranks mean
  • 🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
  • 🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
  • 🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
  • 🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
  • 🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
  • 🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
  • 🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works
  • ClawSweeper keeps one durable marker-backed review comment per issue or PR.
  • Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
  • A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
  • PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
  • Maintainers can also comment @clawsweeper review to request a fresh review only.
  • Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
  • Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
  • Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

@clawsweeper clawsweeper Bot added labels on May 29, 2026:
  • rating: 🧂 unranked krab (Not merge-ready due to missing proof or serious correctness/safety concerns.)
  • status: 📣 needs proof (The PR needs real behavior proof before ClawSweeper can clear the contributor ask.)
  • P1 (Urgent regression or broken agent/channel workflow affecting real users now.)
  • merge-risk: 🚨 security-boundary (🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data.)
@jethac jethac force-pushed the feat/dashboard-server branch from 950b4e7 to bdaf906 on May 29, 2026 00:23
@jethac jethac force-pushed the feat/dashboard-server branch 5 times, most recently from 082038b to c1824a1, on June 10, 2026 01:01
Add --dashboard-pairing for keyboard-light dashboard clients. The server
prints a one-time 6-digit code that never travels over the network;
discovery advertises only the code length. A client claims a generated
bearer token by submitting the code. Five failed claims lock pairing
until restart, a successful claim closes pairing, and pairing tokens
never authorize data routes before they are claimed.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@jethac jethac force-pushed the feat/dashboard-server branch from c1824a1 to 6b3863a on June 10, 2026 02:22
@clawsweeper clawsweeper Bot changed labels on Jun 10, 2026:
  • added proof: sufficient (Contributor real behavior proof is sufficient.)
  • added status: ⏳ waiting on author (ClawSweeper has contributor-facing work open and is waiting for author action.)
  • removed status: 📣 needs proof (The PR needs real behavior proof before ClawSweeper can clear the contributor ask.)
Labels

  • merge-risk: 🚨 security-boundary (🚨 Merging this PR could weaken sandboxing, authorization, credentials, or sensitive data.)
  • P1 (Urgent regression or broken agent/channel workflow affecting real users now.)
  • proof: sufficient (Contributor real behavior proof is sufficient.)
  • rating: 🧂 unranked krab (Not merge-ready due to missing proof or serious correctness/safety concerns.)
  • status: ⏳ waiting on author (ClawSweeper has contributor-facing work open and is waiting for author action.)

1 participant