perf(index): cache small-index posting, doc, and vector-probe reads #7258

Closed
hamersaw wants to merge 1 commit into lance-format:main from hamersaw:perf/wal-cache-indicies

@hamersaw

Summary

Querying a small full-text index (e.g. a mem_wal flushed-generation index) re-paid object-store IO on every query for metadata that is never cached. This PR fixes three uncached read paths for small indexes and leaves large-index behavior unchanged:

  1. Posting metadata: posting_len_for_token / posting_metadata_for_token issued one tiny single-row read_range against the posting file per term per partition, and the result was never cached. For a small index (≤256Ki tokens), bulk-load the whole posting metadata once into the existing OnceCell (small_index_bulk_metadata); large indexes keep the O(1) single-row path.
  2. Doc row-ids: DeferredDocSet::resolve_row_ids did targeted (uncached) row-id reads on every query. For a small partition (≤256Ki docs), load and cache the whole ROW_ID column instead.
  3. Vector-index probe: the "is this a vector index?" file-existence check for indexes without files metadata issued a HEAD per generic open. Memoize the result per uuid in the session index cache (IsVectorIndexProbeKey).
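
As a rough illustration of the first item, the small-index bulk load can be sketched as below. This is not the code from the PR: `PostingFile`, `PostingReader`, `posting_len`, and `SMALL_INDEX_MAX_TOKENS` are hypothetical stand-ins, the read counter exists only to make the effect observable, and `std::sync::OnceLock` is used in place of whatever `OnceCell` type the real index holds.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Assumed threshold, taken from the "≤256Ki tokens" figure above.
const SMALL_INDEX_MAX_TOKENS: usize = 256 * 1024;

/// Hypothetical stand-in for the posting file; it counts range reads.
struct PostingFile {
    lengths: Vec<u32>,
    reads: AtomicUsize,
}

impl PostingFile {
    /// Simulates one object-store range read over rows `start..end`.
    fn read_range(&self, start: usize, end: usize) -> Vec<u32> {
        self.reads.fetch_add(1, Ordering::Relaxed);
        self.lengths[start..end].to_vec()
    }
}

/// Hypothetical reader holding the file and a fill-once bulk cache.
struct PostingReader {
    file: PostingFile,
    bulk: OnceLock<Vec<u32>>,
}

impl PostingReader {
    fn new(lengths: Vec<u32>) -> Self {
        Self {
            file: PostingFile { lengths, reads: AtomicUsize::new(0) },
            bulk: OnceLock::new(),
        }
    }

    /// Returns the posting length for `token_id`.
    fn posting_len(&self, token_id: usize) -> u32 {
        let num_tokens = self.file.lengths.len();
        if num_tokens <= SMALL_INDEX_MAX_TOKENS {
            // Small index: one bulk read on first use, then served from memory.
            let all = self.bulk.get_or_init(|| self.file.read_range(0, num_tokens));
            all[token_id]
        } else {
            // Large index: keep the single-row read on every lookup.
            self.file.read_range(token_id, token_id + 1)[0]
        }
    }

    /// Number of simulated range reads issued so far.
    fn reads(&self) -> usize {
        self.file.reads.load(Ordering::Relaxed)
    }
}
```

With a five-token reader, any number of `posting_len` calls leaves `reads()` at 1. A reader with more than 256Ki tokens issues one read per call, which matches the "large-index behavior unchanged" claim.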

Changes

  • lance-index/.../inverted/index.rs: small_index_bulk_metadata + updated test_bm25_stats_for_terms_is_lazy.
  • lance-index/.../inverted/lazy_docset.rs: small-partition row-id column caching in resolve_row_ids.
  • lance/src/index.rs: memoized is_vector_index existence probe.
  • lance/src/session/index_caches.rs: IsVectorIndexProbe cache key.
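
The memoized probe from the last two entries might look roughly like the following. It is a minimal sketch under stated assumptions: `ProbeKey` and `ProbeCache` are hypothetical simplifications (a plain `Mutex<HashMap>` rather than the session index cache), and the `probe` closure stands in for the HEAD request against the object store.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Hypothetical cache key: one entry per index uuid.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct ProbeKey(String);

/// Toy stand-in for a session-level cache of probe results.
struct ProbeCache {
    entries: Mutex<HashMap<ProbeKey, bool>>,
}

impl ProbeCache {
    fn new() -> Self {
        Self { entries: Mutex::new(HashMap::new()) }
    }

    /// Returns the cached answer for `uuid`, running `probe` only on a miss.
    fn is_vector_index<F: FnOnce() -> bool>(&self, uuid: &str, probe: F) -> bool {
        let key = ProbeKey(uuid.to_string());
        // Copy the cached value out so the lock is released before probing.
        let cached = self.entries.lock().unwrap().get(&key).copied();
        if let Some(hit) = cached {
            return hit;
        }
        // Miss: pay for the existence check once and remember the result.
        let result = probe();
        self.entries.lock().unwrap().insert(key, result);
        result
    }
}
```

Because the lock is not held while `probe` runs, two concurrent first opens of the same uuid could both issue the check. That keeps the sketch simple and costs at most a few redundant probes; whether the real cache deduplicates concurrent misses is not stated here.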

Validation

cargo test -p lance-index test_bm25_stats_for_terms_is_lazy passes (asserts the first stats lookup issues exactly one posting read and subsequent lookups issue none). Validated end-to-end against a WAL FTS benchmark: a warm query previously re-read per-term posting offsets, doc row-ids, and the vector probe on every query; with this change it issues zero such reads (all served from cache).

🤖 Generated with Claude Code

Querying a small FTS index (e.g. a mem_wal flushed generation) re-paid
object-store IO on every query for metadata that is never cached:

1. `posting_len_for_token` / `posting_metadata_for_token` issued one tiny
   single-row `read_range` per term per partition. For a small index, bulk-
   load the whole posting metadata once into the cached `OnceCell` instead
   (`small_index_bulk_metadata`, ≤256Ki tokens); large indexes keep the
   uncached single-row path.
2. `DeferredDocSet::resolve_row_ids` did targeted (uncached) row-id reads
   every query. For a small partition (≤256Ki docs) load and cache the whole
   ROW_ID column instead.
3. The "is this a vector index?" file-existence probe for indexes without
   `files` metadata issued a HEAD per generic open. Memoize it per uuid in
   the session index cache (`IsVectorIndexProbeKey`).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@github-actions (bot) added the performance and A-index (Vector index, linalg, tokenizer) labels on Jun 12, 2026
@hamersaw

This PR was fixing a performance issue on a path we should not have been on; closing accordingly.

@hamersaw closed this on Jun 15, 2026