Skip to content

refactor: extract shard allocator into a dedicated crate#3578

Open
jayakasadev wants to merge 6 commits into
apache:masterfrom
jayakasadev:move_shard_allocator_to_server_common
Open

refactor: extract shard allocator into a dedicated crate#3578
jayakasadev wants to merge 6 commits into
apache:masterfrom
jayakasadev:move_shard_allocator_to_server_common

Conversation

@jayakasadev

@jayakasadev jayakasadev commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

Which issue does this PR address?

Relates to #3315

Rationale

  • decoupling core/server-ng from the legacy core/server crate

What changed?

  • allocator moved to shard_allocator crate that only server and server-ng depend on

    • when I moved allocator to server_common, hwlocality leaked everywhere and broke the aarch64-musl build
      • simulator links with -nodefaultlibs
      • hwloc drags in glibc's libm.a, failing on an undefined reference to errno
    • server-ng had to import server crate to use shard allocator (server::shard_allocator) before this
  • CpuAllocation/NumaConfig moved to a dedicated cpu_allocation leaf crate

    • depends only on serde, no hwloc
    • shard_allocator depends on cpu_allocation (not configs or server_common), so the allocator stays light and hwloc never leaks back into configs
    • configs::sharding re-exports them to keep the configs::sharding::* path stable for existing callers
  • allocator lives in its own shard_allocator crate that only server and server-ng depend on

    • when I moved allocator to server_common, hwlocality leaked everywhere and broke the aarch64-musl build
      • simulator links with -nodefaultlibs
      • vendored hwloc drags in glibc's libm.a, failing on undefined references to glibc-internal math symbols
    • server-ng previously had to depend on the server crate just to reach the allocator (server::shard_allocator)
  • CpuAllocation/NumaConfig moved to a dedicated cpu_allocation leaf crate

    • depends only on serde, no hwloc
    • shard_allocator depends on cpu_allocation (not configs or server_common), so the allocator stays light and hwloc never leaks back into configs
    • configs::sharding re-exports them to keep the configs::sharding::* path stable for existing callers
  • shard_allocator carries a tiny, musl-gated build script that drops an empty libm.a stub on the link search path

    • vendored hwloc references cbrt, which forces the linker to resolve -lm; musl folds math into libc, so rust's musl sysroot ships no libm.a and -lm would otherwise fall through to glibc's libm.a (whose cbrt needs glibc-only __frexp/__ldexp, absent on musl)
    • the stub satisfies -lm harmlessly so cbrt resolves from musl's own libc, fixing the link independent of crate ordering

Local Execution

  • Passed (macos); Linux/musl builds left to CI
  • Pre-commit hooks ran

AI Usage

  1. Tools: Cursor (Claude)
  2. Scope: planned and performed the full relocation
  3. Verification:
  • ran fmt/sort/clippy + unit tests locally
  • confirmed rg 'server::shard_allocator' core/server-ng/ is empty and that nothing was broken by the move
  1. Yes, I can explain every line

@github-actions

Copy link
Copy Markdown

Thanks for the PR. It is labeled S-waiting-on-review and queued for review.

Slash commands (own line, regular comment) move it around the queue:

  • /ready - back to S-waiting-on-review after addressing feedback
  • /author - flip to S-waiting-on-author while you finish changes
  • /request-review @user-or-team - request a reviewer

See CONTRIBUTING.md for details.

@github-actions github-actions Bot added the S-waiting-on-review PR is waiting on a reviewer label Jun 28, 2026
@codecov

codecov Bot commented Jun 28, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 93.60731% with 14 lines in your changes missing coverage. Please review.
✅ Project coverage is 60.15%. Comparing base (688d32a) to head (785f528).

Files with missing lines Patch % Lines
core/shard_allocator/src/lib.rs 8.33% 10 Missing and 1 partial ⚠️
core/cpu_allocation/src/lib.rs 98.55% 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@              Coverage Diff              @@
##             master    #3578       +/-   ##
=============================================
- Coverage     74.35%   60.15%   -14.20%     
  Complexity      937      937               
=============================================
  Files          1257     1256        -1     
  Lines        130671   118917    -11754     
  Branches     106537    94829    -11708     
=============================================
- Hits          97156    71532    -25624     
- Misses        30434    44193    +13759     
- Partials       3081     3192      +111     
Components Coverage Δ
Rust Core 57.18% <93.60%> (-17.88%) ⬇️
Java SDK 62.44% <ø> (ø)
C# SDK 71.40% <ø> (-0.71%) ⬇️
Python SDK 88.88% <ø> (ø)
PHP SDK 84.29% <ø> (ø)
Node SDK 91.26% <ø> (+0.14%) ⬆️
Go SDK 40.14% <ø> (ø)
Files with missing lines Coverage Δ
core/configs/src/server_config/sharding.rs 100.00% <ø> (+14.55%) ⬆️
core/configs/src/server_config/validators.rs 78.39% <ø> (ø)
core/server-ng/src/bootstrap.rs 10.86% <ø> (ø)
core/server-ng/src/server_error.rs 92.10% <ø> (ø)
core/server/src/bootstrap.rs 82.23% <ø> (ø)
core/server/src/main.rs 63.25% <ø> (ø)
core/server/src/server_error.rs 0.00% <ø> (ø)
core/cpu_allocation/src/lib.rs 98.55% <98.55%> (ø)
core/shard_allocator/src/lib.rs 15.38% <8.33%> (ø)

... and 241 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@jayakasadev jayakasadev changed the title Move shard allocator to server common refactor : Move shard allocator to server common Jun 28, 2026
@jayakasadev jayakasadev changed the title refactor : Move shard allocator to server common refactor: Move shard allocator to server common Jun 28, 2026
Moving the allocator into server_common pulled hwlocality into nearly the
entire workspace, since almost every crate depends on server_common. That
broke the aarch64-musl build: simulator links with -nodefaultlibs and the
vendored hwloc drags in glibc's libm.a, failing on an undefined reference
to errno.

Keep CpuAllocation and NumaConfig in server_common::sharding, which configs
re-exports and which carry no hwloc, and move the hwloc-backed ShardAllocator
into a new shard_allocator crate that only server and server-ng depend on.
This restores the confinement of hwloc to those two binaries that master had.

Co-authored-by: Cursor <cursoragent@cursor.com>
@jayakasadev jayakasadev changed the title refactor: Move shard allocator to server common refactor: extract shard allocator into a dedicated crate Jun 28, 2026
@jayakasadev

Copy link
Copy Markdown
Contributor Author

/request-review @hubcio

@github-actions github-actions Bot requested a review from hubcio June 29, 2026 20:13
jayakasadev and others added 4 commits June 29, 2026 22:02
The shard allocator extraction parked CpuAllocation/NumaConfig in
server_common, which forced a serde dependency onto a runtime crate and
coupled the lightweight allocator to server_common's heavy tree. Move
the two pure config types into a dedicated cpu_allocation leaf crate
that depends only on serde, re-export them from configs to keep the
configs::sharding path stable, and point shard_allocator at the leaf
crate. Also drop the whole-crate re-export shim in server::lib in favor
of direct paths, and document the new crate's public surface.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Vendored hwloc references cbrt, forcing the linker to resolve -lm.
musl folds math into libc and rust's musl sysroot ships no libm.a, so
-lm fell through to the host glibc libm.a, whose cbrt needs glibc-only
__frexp/__ldexp symbols absent on musl, breaking the aarch64-musl build.

Drop an empty libm.a on the search path (musl-gated build script) so -lm
resolves to nothing and cbrt is satisfied later by musl's own libc. This
fixes the root cause independent of crate link ordering.

Co-authored-by: Cursor <cursoragent@cursor.com>
@jayakasadev jayakasadev force-pushed the move_shard_allocator_to_server_common branch from 785f528 to 9c53a6f Compare June 30, 2026 14:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-review PR is waiting on a reviewer

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant