feat(multiplex-quant): expose --sample-bc-ori as a CLI override #199

Open
an-altosian wants to merge 2 commits into COMBINE-lab:main from an-altosian:feat/multiplexquant-sample-bc-ori-cli

an-altosian commented May 29, 2026

Summary

Adds --sample-bc-ori {forward,reverse} to simpleaf multiplex-quant, completing the precedence pattern that --expected-ori and --geometry already follow:

  1. user-supplied CLI value (this PR)
  2. chemistry preset's declared sample_bc_ori
  3. omit the flag (alevin-fry default = forward)

Resolves the use case in #198: cycle-plan variants like 10x Flex Configuration B (R1=28 / R2=90) where the sample BC is read from the opposite strand. Users can now run such variants with --chemistry 10x-flexv2-gex-3p --geometry '...' --sample-bc-ori forward without adding a new preset entry.
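
The three-level precedence above can be sketched in a few lines of Rust. This is an illustrative sketch only, not the code in this PR: the function names `resolve_sample_bc_ori` and `permit_list_args` are hypothetical, and the argument vector is reduced to the subcommand name and the one flag under discussion.

```rust
/// Hypothetical helper (not simpleaf's actual code): pick the effective
/// sample barcode orientation. A CLI value wins over the preset value;
/// `None` means neither source supplied one.
fn resolve_sample_bc_ori(cli: Option<&str>, preset: Option<&str>) -> Option<String> {
    cli.or(preset).map(str::to_owned)
}

/// Hypothetical helper: build the part of the alevin-fry argument list that
/// concerns orientation. When nothing was resolved, the flag is omitted so
/// that alevin-fry falls back to its own default.
fn permit_list_args(resolved: Option<&str>) -> Vec<String> {
    let mut args = vec!["generate-permit-list".to_string()];
    if let Some(ori) = resolved {
        args.push("--sample-bc-ori".to_string());
        args.push(ori.to_string());
    }
    args
}
```

With a preset that declares `reverse` and a CLI value of `forward`, `resolve_sample_bc_ori(Some("forward"), Some("reverse"))` yields `forward`; with neither source set, no `--sample-bc-ori` argument is emitted at all.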

Implementation notes

  • CLI vocabulary: forward / reverse — matches the chemistry preset JSON ("sample_bc_ori": "forward" | "reverse") and alevin-fry's own --sample-bc-ori flag (verified in alevin-fry main.rs).
  • No translation layer: The CLI value is forwarded verbatim to alevin-fry's generate-permit-list --sample-bc-ori.
  • Preset JSON untouched: When no CLI value is given and the orientation comes from the chemistry preset's Option<String> field, it still passes through verbatim. No breaking change to existing chemistries.json.
  • No "both" value: alevin-fry's --sample-bc-ori only accepts forward/reverse, and no current preset declares both either. Adding both would mean inventing semantics that don't exist downstream.
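
One way to picture the two-value vocabulary described in these notes is a small enum with strict parsing. This is a sketch under assumptions: the type name `SampleBcOri` and its error message are hypothetical, and the PR may validate the flag differently (for example through the argument parser's own value restrictions).

```rust
use std::fmt;
use std::str::FromStr;

/// Hypothetical type modelling the only two accepted orientations.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SampleBcOri {
    Forward,
    Reverse,
}

impl FromStr for SampleBcOri {
    type Err = String;

    /// Accept exactly "forward" or "reverse"; anything else (including
    /// "both" or the dropped "fw"/"rev" shorthand) is rejected.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "forward" => Ok(Self::Forward),
            "reverse" => Ok(Self::Reverse),
            other => Err(format!(
                "invalid sample barcode orientation '{other}': expected 'forward' or 'reverse'"
            )),
        }
    }
}

impl fmt::Display for SampleBcOri {
    /// Render the same spelling that was parsed, so no translation step is
    /// needed before the value is handed to a downstream tool.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(match self {
            Self::Forward => "forward",
            Self::Reverse => "reverse",
        })
    }
}
```

Because parsing and display use the same spelling, a value read from the CLI or from a preset round-trips unchanged, which is the point of dropping the translation layer.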

Validation

End-to-end verified by running the same internal Config B Flex library two ways and confirming byte-identical results:

  • Path A (preset-driven): --chemistry 10x-flexv2-gex-3p-config-b (from the companion preset PR #201, "feat(chemistries): add 10x-flexv2-gex-3p-config-b preset", which declares sample_bc_ori: "forward").
  • Path B (CLI-override): --chemistry 10x-flexv2-gex-3p (which declares sample_bc_ori: "reverse" in its JSON) + --sample-bc-ori forward + matching --geometry override at the CLI.

Both runs produced identical per-sample-well cell counts and read counts, identical wall-clock time, and sample_bc_ori: Forward in the resulting sample_info.json. That demonstrates:

  1. The CLI value reaches alevin-fry generate-permit-list correctly (the demultiplexing wells light up, instead of producing the noise-floor result the preset's declared reverse would yield on Config B reads).
  2. The precedence rule is correct: when CLI and preset disagree, the CLI value wins.
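
A byte-identity check like the one used in this validation can be sketched as follows. The PR does not say which tool performed the comparison, so this is only an assumed approach: `files_identical` is a hypothetical helper, and the caller would supply the paths of the two output files being compared.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: returns Ok(true) when both files contain exactly
/// the same bytes.
fn files_identical(a: &Path, b: &Path) -> io::Result<bool> {
    // Compare sizes first so files of different length are rejected
    // without reading their contents.
    if fs::metadata(a)?.len() != fs::metadata(b)?.len() {
        return Ok(false);
    }
    // Same length: read both files fully and compare the bytes.
    Ok(fs::read(a)? == fs::read(b)?)
}
```

Reading whole files into memory keeps the sketch short; for very large outputs a chunked comparison or a checksum would be the more practical choice.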

CI status note

The check_formatting job fails on this PR with a SIGILL from alevin-fry --version during test_simpleaf.sh. This is a pre-existing failure on upstream main since commit 52ccc73 (the 0.25.0 release): the bioconda alevin-fry 0.15.0 binary appears to use CPU instructions not supported by the GHA runner. My PR does not touch any code path involved in the failure (set-paths runs before any --sample-bc-ori code is exercised). Suggested fixes (cargo install from source in CI, or a bioconda rebuild) are out of scope for this PR; happy to send them separately if desired.

Test plan

  • cargo check passes locally on the patched source.
  • cargo build --release produces a working binary that runs through multiplex-quant.
  • End-to-end run with --sample-bc-ori forward on a real Config B Flex library yields realistic per-sample-well cell counts (matches cellranger multi ground truth at the well level).
  • Result is byte-identical to the companion preset path in PR #201, "feat(chemistries): add 10x-flexv2-gex-3p-config-b preset" (rules out silent path divergence).
  • sample_bc_ori is recorded in sample_info.json as the value passed at the CLI, confirming it flowed through to alevin-fry generate-permit-list.
  • check_formatting CI job: not passing (pre-existing SIGILL on upstream main; see CI status note above).

Refs

Add `--sample-bc-ori {fw,rev}` to `simpleaf multiplex-quant` so users can
override the chemistry preset's sample barcode orientation without forking
the preset. Useful for cycle-plan variants (e.g. 10x Flex Configuration B)
where the sample BC is read from the opposite strand vs the canonical preset.

Precedence (matches existing `--expected-ori` / `--geometry` patterns):
  1. user-supplied `--sample-bc-ori` CLI value
  2. chemistry preset's declared `sample_bc_ori` field
  3. omit the flag (alevin-fry default = forward)

The CLI uses `fw`/`rev` shorthand mirroring `--expected-ori`; these are
translated to alevin-fry's `forward`/`reverse` vocabulary before forwarding.
Preset JSON values pass through untouched, so chemistries.json is unaffected.

Refs: COMBINE-lab#198
an-altosian added a commit to an-altosian/simpleaf that referenced this pull request May 29, 2026
The published RTD docs were stuck at 0.19.0 in the page title. Bumps
conf.py:release to match Cargo.toml (0.25.0).

While in the file, address several long-standing gaps in
flex-quant-command.rst's `-h` snippet (and its surrounding prose) that
accumulated between 0.19.0 and 0.25.0:

- Add missing CLI flags to the help snippet: --geometry, --dict,
  --sample-correction-mode.
- Move --sample-bc-list from "Probe Set Options" to "Reference Options"
  to match the actual help_heading in the source.
- Carve --resolution out into a dedicated "Quantification Options"
  section, matching its help_heading.
- Soften the Overview "needs" list to reflect that --chemistry is now
  optional when --geometry + --cell-bc-list are supplied (cycle-plan
  variants like 10x Flex Configuration B).
- Rewrite the intro paragraph to call out the chemistry-vs-manual-override
  modes explicitly.

Refs: COMBINE-lab#199 (--sample-bc-ori companion code PR)

Per design discussion: keep the CLI vocabulary consistent with the
chemistry preset JSON (`sample_bc_ori: "forward" | "reverse"`) and with
alevin-fry's own `--sample-bc-ori` flag (also `forward` / `reverse`).
Drop the `fw`/`rev` shorthand and the in-process translation layer.
an-altosian added a commit to an-altosian/simpleaf that referenced this pull request May 29, 2026
…here

Per design discussion on PR COMBINE-lab#199: keep the CLI vocabulary consistent with
the chemistry preset JSON and with alevin-fry's --sample-bc-ori. Update
the -h snippet, "Chemistry preset structure" bullet, "Sample barcode
orientation" prose, and both example commands.

rob-p commented May 29, 2026

The CI failure is strange. It suggests an illegal instruction on the CI runner (which is pulling alevin-fry from bioconda). I'm not sure what to do about that.
