
fix: return error instead of panicking on zero-dimension fixed-size-list columns #7247

Open
DanielMao1 wants to merge 1 commit into lance-format:main from DanielMao1:fix/zero-dim-fsl-panic

Conversation


@DanielMao1 DanielMao1 commented Jun 12, 2026


Closes #5102

Problem

A fixed-size-list column with dimension 0 panics with "attempt to divide by zero"
(rust/lance-encoding/src/data.rs, FixedSizeListBlock::num_values). As of pylance 7.0.0,
the panic fires on write for every storage version (stable/2.1/2.2), and reading
datasets persisted by older writers (which accepted such columns) panics as well.
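
To make the failure mode concrete, here is a minimal sketch of how a zero dimension turns into a divide-by-zero panic. It assumes the list count is derived by dividing the number of flattened child values by the dimension; `FslBlock` and both method names are hypothetical stand-ins, not the real `FixedSizeListBlock` code.

```rust
/// Hypothetical stand-in for a fixed-size-list block (not the Lance type).
struct FslBlock {
    /// Total number of flattened child values.
    child_values: u64,
    /// Number of child values per list.
    dimension: u64,
}

impl FslBlock {
    /// Unchecked integer division: panics with
    /// "attempt to divide by zero" when `dimension` is 0.
    fn num_values_unchecked(&self) -> u64 {
        self.child_values / self.dimension
    }

    /// Checked variant: yields `None` for a zero dimension instead of panicking.
    fn num_values_checked(&self) -> Option<u64> {
        self.child_values.checked_div(self.dimension)
    }
}
```

Because the unchecked version does not return a `Result`, the only way to surface the problem from inside it is a panic, which is why the guards in this PR sit at callers that already return `Result`.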

Reproduction details are in the issue comment:
#5102 (comment)

Approach

Following the maintainer guidance in #5102 (error, not panic), this adds two small guards at
boundaries that already return Result, instead of changing DataBlock::num_values() to
return Result (the approach that made #5159 balloon across the whole encoding crate):

  1. Write side: Schema::validate() rejects zero-dimension fixed-size-list fields
    (including nested ones). validate() runs inside Schema::try_from(&ArrowSchema),
    so every write entry point surfaces a clean schema error instead of a panic. Writes
    currently panic on every storage version, so no working flow changes behavior.
  2. Read side (defensive): the structural and legacy field-scheduler builders reject
    zero-dimension fixed-size lists with an invalid-input error, so datasets persisted by
    old writers fail cleanly at scheduling time instead of crashing the process.
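
The write-side guard can be pictured with the sketch below. It uses simplified, hypothetical types (`Ty`, `Field`, and a free function `validate`) rather than the Arrow and Lance ones, and only illustrates the idea of walking fields, including struct children, at a boundary that already returns `Result`. The signed `i32` dimension mirrors Arrow's `DataType::FixedSizeList`.

```rust
/// Hypothetical, simplified data types; the real code works on Arrow/Lance types.
#[derive(Debug)]
enum Ty {
    Float32,
    /// (item type, dimension); the dimension is signed, as in Arrow.
    FixedSizeList(Box<Ty>, i32),
    Struct(Vec<Field>),
}

#[derive(Debug)]
struct Field {
    name: String,
    ty: Ty,
}

/// Returns an error for the first field whose fixed-size-list dimension is
/// not positive, descending into struct children along the way.
fn validate(fields: &[Field]) -> Result<(), String> {
    for field in fields {
        match &field.ty {
            Ty::FixedSizeList(_, dim) if *dim <= 0 => {
                return Err(format!(
                    "field '{}': fixed-size-list dimension must be positive, got {}",
                    field.name, dim
                ));
            }
            // Nested fields inside a struct get the same check.
            Ty::Struct(children) => validate(children)?,
            _ => {}
        }
    }
    Ok(())
}
```

A caller that converts a schema before writing would propagate this error with `?`, so the zero-dimension column is reported as a schema error before any encoder code runs.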

How the guards sit in the data flow

[Figure: guards]

Two facts that shape the design:

  • Schema::try_from(&ArrowSchema) calls validate() internally and every write path performs
    this conversion, so guard 1 in one place covers all write entry points.
  • Guard 2 exists because writers up to ~2026-04 could still persist zero-dimension columns
    under the stable (2.0) storage version; reading those files must not crash the process.

Tests

  • lance-core: Schema::try_from rejects zero-dim FSL at top level and nested in a struct;
    positive dimensions still validate.
  • lance-encoding: the scheduler guard rejects zero-dim FSL, including FSL-nested-in-FSL,
    and accepts positive dimensions.
  • Python: parametrized over legacy/stable/2.1, write_dataset now raises a clean
    OSError (same mapping as other schema validation errors) instead of PanicException.

@github-actions github-actions Bot added A-python Python bindings A-encoding Encoding, IO, file reader/writer bug Something isn't working labels Jun 12, 2026
@DanielMao1 DanielMao1 force-pushed the fix/zero-dim-fsl-panic branch 2 times, most recently from 143f1e1 to 240a45a Compare June 12, 2026 09:26

codecov Bot commented Jun 12, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.


@yanghua yanghua left a comment

Collaborator

Left two comments.



@pytest.mark.parametrize("data_storage_version", ["legacy", "stable", "2.1"])
def test_write_zero_dimension_fixed_size_list(

Collaborator

Can we at least add one test case that reads an existing zero-dimension column written by the old writer logic?

@DanielMao1 DanielMao1 Jun 15, 2026

Author

Added as a Rust test rather than a Python one: once Schema::validate() rejects zero-dim FSL, no write path can produce such a dataset (and the encoder panics on zero-dim), so current code can't generate the fixture.
Instead, test_read_zero_dimension_fsl_errors_instead_of_panicking (in decoder.rs) drives create_structural_field_scheduler and create_legacy_field_scheduler, the read-plan builders the file reader calls against a stored schema, with a zero-dim FSL field and asserts a clean error instead of a panic.

Comment thread rust/lance-core/src/datatypes/schema.rs Outdated
Comment on lines +352 to +353
if let DataType::FixedSizeList(_, dimension) = field.data_type()
&& dimension <= 0

Collaborator

Does it work for the FSL-of-FSL case, e.g. FixedSizeList(FixedSizeList(Float32, 0), 4)?

Author

Good catch. In my previous revision it did not: a FixedSizeList of a FixedSizeList over a primitive collapses into a single leaf field, so the pre-order field walk never visits the inner list, and the original if let DataType::FixedSizeList(_, dimension) only matched the outermost dimension. As a result, FixedSizeList(FixedSizeList(Float32, 0), 4) slipped through (outer dim 4 passed, inner dim 0 ignored).

I confirmed this with a probe test, then fixed it: the check now lives in a helper validate_fixed_size_list_dimensions() that recurses through nested list types, so an inner zero dimension is rejected too. Added FixedSizeList(FixedSizeList(Float32, 0), 4) (and a positive-dimension nested counterpart) to the schema test.
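
The recursion described in this reply might look roughly like the sketch below. Only the helper name validate_fixed_size_list_dimensions comes from the comment above; the `Ty` enum, the signature, and the error text are assumptions made for illustration and work on a simplified type tree instead of Arrow's `DataType`.

```rust
/// Hypothetical, simplified type tree standing in for Arrow's DataType.
enum Ty {
    Float32,
    /// Variable-length list of the item type.
    List(Box<Ty>),
    /// (item type, dimension); the dimension is signed, as in Arrow.
    FixedSizeList(Box<Ty>, i32),
}

/// Checks this type and every nested list item type, so an inner zero
/// dimension is rejected even when the outer dimension is valid.
fn validate_fixed_size_list_dimensions(ty: &Ty) -> Result<(), String> {
    match ty {
        Ty::FixedSizeList(item, dim) => {
            if *dim <= 0 {
                return Err(format!(
                    "fixed-size-list dimension must be positive, got {}",
                    dim
                ));
            }
            // Keep descending: the item may itself be a fixed-size list.
            validate_fixed_size_list_dimensions(item)
        }
        Ty::List(item) => validate_fixed_size_list_dimensions(item),
        Ty::Float32 => Ok(()),
    }
}
```

Checking only the outermost variant corresponds to the earlier behavior, where FixedSizeList(FixedSizeList(Float32, 0), 4) passed because 4 is positive; the recursive call on the item type is what catches the inner 0.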

@DanielMao1 DanielMao1 force-pushed the fix/zero-dim-fsl-panic branch from 240a45a to 7827869 Compare June 15, 2026 11:00
…ist columns

A fixed-size-list column with dimension 0 used to panic with 'attempt to
divide by zero' (rust/lance-encoding/src/data.rs) on every write path and
when reading datasets persisted by older writers.

Two guards, following the maintainer guidance in lance-format#5102 (error, not panic):

- Schema::validate() rejects fixed-size-list fields whose dimension is not a
  positive integer, turning every write into a clean schema error.
  validate() is invoked from Schema::try_from(&ArrowSchema), so all write
  entry points are covered. A FixedSizeList of a FixedSizeList over a
  primitive collapses into a single leaf field, so the check recurses through
  nested list types to also reject an inner zero dimension, e.g.
  FixedSizeList(FixedSizeList(Float32, 0), 4).
- The decoder field-scheduler builders reject zero-dimension fixed-size
  lists with an invalid-input error, so datasets persisted by old writers
  fail cleanly instead of panicking at scheduling time.

Closes lance-format#5102
@DanielMao1 DanielMao1 force-pushed the fix/zero-dim-fsl-panic branch from 7827869 to 4d0d4b5 Compare June 15, 2026 11:02
@DanielMao1 DanielMao1 requested a review from yanghua June 15, 2026 11:13


Development

Successfully merging this pull request may close these issues.

Zero dimension vectors cause Lance to panic

2 participants