Skip to content

fix: align oss readiness updates#67

Merged
bigint merged 1 commit into
mainfrom
sync/2026-05-29
May 29, 2026
Merged

fix: align oss readiness updates#67
bigint merged 1 commit into
mainfrom
sync/2026-05-29

Conversation

@bigint

@bigint bigint commented May 29, 2026

Copy link
Copy Markdown
Owner

No description provided.

@bigint bigint marked this pull request as ready for review May 29, 2026 15:55
@bigint bigint merged commit e90d3ab into main May 29, 2026
6 checks passed
@bigint bigint deleted the sync/2026-05-29 branch May 29, 2026 16:01
@greptile-apps

greptile-apps Bot commented May 29, 2026

Copy link
Copy Markdown

Greptile Summary

This PR is a broad OSS readiness sweep across the full monorepo — tightening security, correctness, and observability without adding major new features.

  • Backend security & correctness: Adds per-IP-per-email login/setup throttling, normalizes API key scopes to [\"*:*\"] on create/update, extends tenant isolation to document listing/delete/elements, fixes multi-collection query tenant filter mutation, adds idempotency one-time-response semantics for API key creation, and extracts URL credential redaction into a shared safe_url_value helper.
  • Queue & retrieval: Releases the ingestion queue admission slot during maintenance-mode delays; tightens stuck-job recovery to only re-queue pending/processing documents; normalizes RRF scores to [0, 1] for consistent min_score use across search modes.
  • Frontend: Adds QueryError display for previously silent fetch-failure paths, introduces ts-pattern for exhaustive branch matching, and fixes stale collection selection after deletion.

Confidence Score: 4/5

Safe to merge; the two findings are both non-blocking quality issues with no runtime impact on the happy path.

The changes are well-scoped and the critical paths (tenant isolation, auth throttling, idempotency) all look correct. Two issues stand out: audit.py imports a leading-underscore private function across module boundaries, and the new valueForSubmit helper in the frontend silently converts non-numeric input to null via NaN rather than surfacing a validation error client-side. Neither causes data loss or incorrect behavior in production, but both warrant a fix before the pattern spreads.

api/bigrag/services/audit.py (private cross-module import) and app/src/features/settings/instance-settings-helpers.ts (NaN to null coercion in int/float/int_list branches).

Important Files Changed

Filename Overview
api/bigrag/routers/auth.py Adds per-IP-per-email login and setup throttling backed by Redis; rate-limit counters use SHA-256-keyed Redis keys and fall back gracefully when Redis is unavailable.
api/bigrag/middleware/idempotency.py Adds one-time-response semantics for API-key creation and similar endpoints: a completed uncacheable request stores a sentinel so replays return 409 rather than the original plaintext key.
api/bigrag/logging_redaction.py Extracts safe_url_value from request_logging.py into the shared redaction module; now redacts all query-parameter values and is applied to any :// string in redact_log_value.
api/bigrag/services/documents/tenant.py Adds document_tenant_metadata_filter for pre-query tenant scoping on document listing; consistent with existing check_document_tenant enforcement on individual document operations.
api/bigrag/routers/query.py Fixes multi-collection tenant filter isolation: previously a mutable multi_filters was modified per collection causing the last collection's tenant filter to leak; now each collection gets its own filter via filters_by_collection.
api/bigrag/services/retrieval/fusion.py Normalizes RRF scores to [0, 1] by dividing by the theoretical maximum, enabling consistent min_score semantics across semantic and hybrid modes.
api/bigrag/services/audit.py Sanitizes audit log metadata through _safe_metadata before persistence, but imports that function by its private (underscore-prefixed) name from a sibling module.
app/src/features/settings/instance-settings-helpers.ts Adds valueForSubmit to coerce setting values to the correct API type; int/float/int_list branches can produce NaN (serialized as null) for non-numeric input without client-side feedback.
api/bigrag/services/queue_recovery.py Tightens stuck-job recovery: filters the UPDATE to only pending/processing documents and returns only IDs that were actually updated, preventing spurious re-queuing of terminal documents.
api/bigrag/services/queue.py Adds defer_admitted_job/restore_deferred_admission to release and re-acquire the ingestion queue admission slot around maintenance-mode delays, avoiding slot starvation.
app/src/features/collections/document-detail-route.tsx Adds explicit error state rendering for document and chunks queries using QueryError, replacing a bare loading check that silently spun forever on fetch failures.
app/src/features/chat/use-chat-page-controller.ts Extends collection auto-selection to also switch to the first available collection when the current one is no longer in the list, preventing stale selection after deletion.

Comments Outside Diff (1)

  1. app/src/features/settings/instance-settings-helpers.ts, line 1522-1539 (link)

    P2 Non-numeric input silently becomes null in the submitted payload

    Number.parseInt("abc", 10) returns NaN, and Number.parseFloat("abc") likewise. JSON.stringify(NaN) serializes as null, so if a user types a non-numeric string into an int, float, or int_list setting, the submitted value is null (or an array containing null entries) rather than a validation error. The server should reject the payload, but the UX gives no feedback before submission. A .filter(v => !Number.isNaN(v)) on the int_list branch (and an explicit NaN guard on the scalar int/float branches returning null only for empty strings) would prevent the silent corruption.

    Fix in Codex Fix in Claude Code

Fix All in Codex Fix All in Claude Code

Reviews (1): Last reviewed commit: "fix: align oss readiness updates" | Re-trigger Greptile

from bigrag.ids import uuid7
from bigrag.logging import get_logger
from bigrag.models.auth import AuditLogEntry, AuditLogListResponse
from bigrag.services.access_log.payload import _safe_metadata

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Private symbol imported across module boundary

_safe_metadata carries a leading underscore marking it as private to access_log/payload.py. Importing it from outside that module breaks the encapsulation convention and creates a fragile coupling — any internal rename or refactor of the function signature will silently break audit.py. The same pattern exists in access_log/context.py, which is within the same package and more defensible, but audit.py is a peer module. The function should be made public (remove the underscore) and exported via the access_log package's __init__.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Fix in Codex Fix in Claude Code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant