feat(dspark): add Ascend NPU support for Qwen3.5-4B DSpark training by curnane-lab · Pull Request #617 · sgl-project/SpecForge

curnane-lab · 2026-06-29T06:44:28Z

Summary

This PR adds Ascend NPU training support for DSpark on Qwen3.5-4B.

Note on scope: The branch currently contains two commits. The first commit (a2f18ea, "feat: DSpark trainer") is borrowed from the preceding DSpark trainer PR and provides the base DSpark implementation. This PR's own incremental change is the second commit (5776d51), which adds the NPU example script and the flex_attention -> SDPA fallback in the trainer.

What is added (incremental)

1. NPU training launcher

examples/run_qwen3.5_4b_dspark_online_npu.sh
- Sets ASCEND_RT_VISIBLE_DEVICES and PYTORCH_NPU_ALLOC_CONF.
- Uses --attention-backend sdpa and --target-model-backend hf (the HF backend always surfaces last_hidden_states, which DSpark's L1 and confidence losses require).
- Uses HCCL via torchrun --standalone.
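The launcher steps above can be sketched in Python as a small command builder. This is a hypothetical illustration, not the contents of examples/run_qwen3.5_4b_dspark_online_npu.sh. The helper name build_launch and the choice to derive --nproc_per_node from the device list are assumptions. PYTORCH_NPU_ALLOC_CONF is left out because the description does not state its value. Only ASCEND_RT_VISIBLE_DEVICES, torchrun --standalone, scripts/train_dspark.py, and the two backend flags come from the description.

```python
import os
import shlex


def build_launch(devices, base_env=None):
    """Return (env, argv) for a single-node torchrun launch on the given NPUs.

    `devices` is a comma-separated id list such as "0,1,2,3".
    Hypothetical helper: the real launcher is a shell script.
    """
    ids = [d.strip() for d in devices.split(",") if d.strip()]
    if not ids:
        raise ValueError("expected at least one device id")

    # Copy the environment so the caller's mapping is not mutated.
    env = dict(os.environ if base_env is None else base_env)
    env["ASCEND_RT_VISIBLE_DEVICES"] = ",".join(ids)

    argv = [
        "torchrun",
        "--standalone",
        f"--nproc_per_node={len(ids)}",  # assumption: one process per NPU
        "scripts/train_dspark.py",
        "--attention-backend", "sdpa",
        "--target-model-backend", "hf",
    ]
    return env, argv


env, argv = build_launch("0,1,2,3,4,5,6,7", base_env={})
print(shlex.join(argv))
```

The real script presumably also forwards the model and data paths taken from TARGET_MODEL_PATH and TRAIN_DATA_PATH. Those trainer flag names are not given in the description, so the sketch omits them.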

2. Trainer NPU fallback

scripts/train_dspark.py
- Auto-detects Ascend NPU and falls back from flex_attention to sdpa when the default backend would fail on NPU.
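A minimal sketch of the fallback described above, assuming detection goes through the torch_npu package and that only a flex_attention request gets rewritten. The function names npu_available and resolve_attention_backend are hypothetical. The actual logic in scripts/train_dspark.py may detect the device or choose the backend differently.

```python
import importlib.util


def npu_available():
    """Best-effort Ascend NPU check; returns False when torch_npu is absent."""
    if importlib.util.find_spec("torch") is None:
        return False
    if importlib.util.find_spec("torch_npu") is None:
        return False
    try:
        import torch
        import torch_npu  # noqa: F401  (importing registers torch.npu)
        return bool(torch.npu.is_available())
    except Exception:
        # A broken or partially installed NPU runtime counts as "no NPU".
        return False


def resolve_attention_backend(requested, on_npu):
    """Swap flex_attention for sdpa on NPU; leave other choices untouched."""
    if on_npu and requested == "flex_attention":
        return "sdpa"
    return requested


# Explicit flags make the behaviour visible without NPU hardware.
print(resolve_attention_backend("flex_attention", on_npu=True))
print(resolve_attention_backend("flex_attention", on_npu=False))
```

In a trainer this would be called once after argument parsing, for example resolve_attention_backend(args.attention_backend, npu_available()), where args.attention_backend is an assumed attribute name.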

DSpark background (for context)

DSpark = SpecForge's DFlash block-diffusion drafter + EAGLE-style Markov & confidence heads, trained with:

- Cross-entropy against ground-truth next tokens.
- L1 distribution distillation using the target model's final hidden state.
- Confidence-head BCE against the empirical per-token accept rate.
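The three loss terms above can be illustrated for a single token position in plain Python. This is a sketch under stated assumptions, not the code in specforge/core/dspark.py. It assumes the target distribution has already been obtained from the target model's final hidden state, that the accept rate is a value in [0, 1] used as a soft BCE label, and that the terms are summed without weights. All function names and numbers are hypothetical.

```python
import math


def cross_entropy(draft_probs, gt_token):
    """Negative log-likelihood of the ground-truth next token."""
    return -math.log(draft_probs[gt_token])


def l1_distill(draft_probs, target_probs):
    """L1 distance between draft and target next-token distributions."""
    return sum(abs(p - q) for p, q in zip(draft_probs, target_probs))


def confidence_bce(pred, accept_rate, eps=1e-7):
    """Binary cross-entropy with the accept rate as a soft label."""
    pred = min(max(pred, eps), 1.0 - eps)  # clamp to keep log() finite
    return -(accept_rate * math.log(pred)
             + (1.0 - accept_rate) * math.log(1.0 - pred))


# Toy three-token vocabulary; values are illustrative only.
draft = [0.7, 0.2, 0.1]
target = [0.6, 0.3, 0.1]

ce = cross_entropy(draft, gt_token=0)
l1 = l1_distill(draft, target)
bce = confidence_bce(pred=0.8, accept_rate=0.75)

# Assumption: an unweighted sum; the real trainer may weight the terms.
total = ce + l1 + bce
print(ce, l1, bce, total)
```

With a soft label, the BCE term is smallest when the predicted confidence equals the accept rate, so the confidence head is pushed toward a calibrated estimate rather than toward 0 or 1.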

The base trainer implementation is in the preceding commit (a2f18ea). This PR only layers the NPU enablement on top.

Usage

export TARGET_MODEL_PATH=/path/to/Qwen3.5-4B
export TRAIN_DATA_PATH=/path/to/train.jsonl
bash examples/run_qwen3.5_4b_dspark_online_npu.sh 0,1,2,3,4,5,6,7

Checklist

Format your code according to the Code Formatting with Pre-Commit.
Add unit tests as outlined in the Running Unit Tests.
Update documentation / docstrings / example tutorials as needed, according to Writing Documentation.
Provide throughput / latency benchmark results and accuracy evaluation results as needed, according to Benchmark and Profiling and Accuracy Results.
For reviewers: If you haven't made any contributions to this PR and are only assisting with merging the main branch, please remove yourself as a co-author when merging the PR.
Please feel free to join our Slack channel at https://sgl-fru7574.slack.com/archives/C09784E3EN6 to discuss your PR.

…ion)

Port of TorchSpec PR sgl-project#129 to SpecForge. Adds:
- specforge/modeling/draft/dspark.py: DSparkConfig, VanillaMarkov, AcceptRatePredictor, DSparkDraftModel (subclass of DFlashDraftModel)
- specforge/core/dspark.py: OnlineDSparkModel (subclass of OnlineDFlashModel) with Markov-biased logits + CE + L1 distribution distillation + confidence BCE and a pooled global-mean loss
- scripts/train_dspark.py: training driver (clone of train_dflash.py)
- configs/qwen3-8b-dspark.json, examples/run_qwen3_8b_dspark_online.sh
- last_hidden_states surfaced from the DFlash target backends (HF + sglang)
- tests/test_utils/test_dspark.py: 11 CPU unit tests

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

gemini-code-assist · 2026-06-29T06:44:45Z

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

maocheng23 and others added 2 commits June 29, 2026 10:41

feat(dspark): add Ascend NPU support for Qwen3.5-4B DSpark training

5776d51

curnane-lab requested review from FlamingoPg, FrankLeeeee, shuaills and sleepcoo as code owners June 29, 2026 06:44

curnane-lab changed the title ~~Dspark npu~~ add npu support for Dspark Jun 29, 2026

curnane-lab changed the title ~~add npu support for Dspark~~ feat(dspark): add Ascend NPU support for Qwen3.5-4B DSpark training Jun 29, 2026

curnane-lab force-pushed the dspark_npu branch 3 times, most recently from f6ed937 to d72ded8, June 29, 2026 08:13

fix lint

a3fa6fc

curnane-lab force-pushed the dspark_npu branch from d72ded8 to a3fa6fc, June 29, 2026 08:20

curnane-lab wants to merge 3 commits into sgl-project:main from curnane-lab:dspark_npu

3 participants
