perf/Optimize neighbor-list construction with OpenMP #7527

Open
Audrey-777 wants to merge 2 commits into deepmodeling:develop from Audrey-777:perf/neighlist-build-openmp

@Audrey-777

Summary

This PR optimizes MD neighbor-list construction in BinManager::build_atom_neighbors().

For larger OpenMP runs, the neighbor-list build now uses a conservative count / allocate / fill flow:

  • Count per-atom neighbor sizes in parallel.
  • Allocate per-atom neighbor segments through the existing PageAllocator.
  • Fill neighbor IDs in parallel while preserving the existing bin scan order.
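The three steps above can be sketched as follows. This is a minimal illustration, not the code in this PR: `NeighborList`, `build_count_allocate_fill`, and the 1D distance test are hypothetical, and a single contiguous `std::vector` stands in for the existing PageAllocator. The point it shows is that the count and fill passes write only to per-atom slots, so they can run in parallel without locks, while the allocation pass stays serial.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container; the PR keeps storage in the existing PageAllocator.
// Copying this struct would leave firstneigh pointing at the old storage.
struct NeighborList {
    std::vector<int> storage;      // all neighbor IDs, one contiguous segment per atom
    std::vector<int*> firstneigh;  // start of each atom's segment, nullptr if it has none
    std::vector<int> numneigh;     // number of neighbors per atom
};

// Toy 1D criterion standing in for the real bin scan.
inline bool is_neighbor(const std::vector<double>& x, int i, int j, double cutoff) {
    return i != j && std::fabs(x[i] - x[j]) < cutoff;
}

void build_count_allocate_fill(const std::vector<double>& x, double cutoff, NeighborList& nl) {
    assert(cutoff > 0.0);
    const int n = static_cast<int>(x.size());
    nl.numneigh.assign(n, 0);
    nl.firstneigh.assign(n, nullptr);

    // Pass 1 (parallel): each iteration writes only numneigh[i].
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; ++i) {
        int count = 0;
        for (int j = 0; j < n; ++j) {
            if (is_neighbor(x, i, j, cutoff)) { ++count; }
        }
        nl.numneigh[i] = count;
    }

    // Pass 2 (serial): an exclusive prefix sum assigns one segment per atom.
    std::vector<std::size_t> offset(n, 0);
    std::size_t total = 0;
    for (int i = 0; i < n; ++i) {
        offset[i] = total;
        total += static_cast<std::size_t>(nl.numneigh[i]);
    }
    nl.storage.assign(total, 0);
    for (int i = 0; i < n; ++i) {
        if (nl.numneigh[i] > 0) { nl.firstneigh[i] = nl.storage.data() + offset[i]; }
    }

    // Pass 3 (parallel): same scan order as pass 1, so per-atom order is unchanged.
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < n; ++i) {
        int* out = nl.firstneigh[i];
        int k = 0;
        for (int j = 0; j < n; ++j) {
            if (is_neighbor(x, i, j, cutoff)) { out[k++] = j; }
        }
    }
}
```

Compiled without OpenMP, the pragmas are ignored and the same function runs serially. The cost of this flow is that the neighbor scan runs twice, once to count and once to fill.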

Small systems, single-thread runs, and OpenMP-disabled builds keep the serial path.
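One way to express this path selection is sketched below. The names and the threshold value are assumptions for illustration; the PR text does not state the actual atom-count cutoff or how the check is written.

```cpp
#include <cassert>

#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical threshold; the real cutoff used by the PR is not given in the description.
const int kMinAtomsForParallel = 1024;

// Number of threads OpenMP would use, or 1 when OpenMP is disabled at build time.
inline int available_threads() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}

// True only when the run is both multi-threaded and large enough to amortize the extra pass.
inline bool use_parallel_build(int natoms, int nthreads) {
    assert(natoms >= 0 && nthreads >= 1);
    return nthreads > 1 && natoms >= kMinAtomsForParallel;
}
```

A caller would branch on `use_parallel_build(natoms, available_threads())` and fall through to the unchanged serial loop otherwise.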

Correctness

The implementation preserves:

  • Per-atom neighbor order.
  • Existing PageAllocator ownership.
  • firstneigh_[i] == nullptr for zero-neighbor atoms.
  • C++11 compatibility.
  • OpenMP OFF behavior.
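The first and third invariants can be checked mechanically by building the list through both paths and comparing the results. The helper below is a hypothetical sketch of such a check, written against plain pointer and count arrays rather than the actual BinManager members.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true when both lists hold the same neighbors in the same order for
// every atom, and both use nullptr for atoms that have no neighbors.
bool same_neighbor_lists(const std::vector<int*>& first_a, const std::vector<int>& num_a,
                         const std::vector<int*>& first_b, const std::vector<int>& num_b) {
    assert(first_a.size() == num_a.size() && first_b.size() == num_b.size());
    if (num_a.size() != num_b.size()) { return false; }
    for (std::size_t i = 0; i < num_a.size(); ++i) {
        if (num_a[i] != num_b[i]) { return false; }
        if (num_a[i] == 0) {
            // Zero-neighbor atoms must keep a null segment pointer in both lists.
            if (first_a[i] != nullptr || first_b[i] != nullptr) { return false; }
            continue;
        }
        for (int k = 0; k < num_a[i]; ++k) {
            // Element-wise comparison, so a reordered segment counts as a mismatch.
            if (first_a[i][k] != first_b[i][k]) { return false; }
        }
    }
    return true;
}
```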

Performance

Environment: 8 physical cores / 16 logical CPUs, Intel Xeon Platinum 8163, np=1.

2048-atom LJ NVE, 200 steps

  • Wall time: 10.30 s -> 9.35 s, about 1.10x.
  • BinManager::build_atom_neighbors: 1.26 s -> 0.31 s, about 4.06x.

8192-atom LJ NVE, 100 steps

  • Wall time: 24.36 s -> 18.87 s, about 1.29x.
  • BinManager::build_atom_neighbors: 2.67 s -> 0.62 s, about 4.31x.

Repeat check

  • 2048 atoms, wall time: 11.16 s -> 10.13 s, about 1.10x.
  • 8192 atoms, wall time: 24.01 s -> 19.95 s, about 1.20x.
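The speedup factors quoted above are the ratio of the time before to the time after, rounded to two decimals. A small helper (hypothetical names, not part of the PR) reproduces them from the reported timings:

```cpp
#include <cassert>
#include <cmath>

// Speedup is the ratio of the old time to the new time, both in seconds.
inline double speedup(double before_s, double after_s) {
    assert(before_s > 0.0 && after_s > 0.0);
    return before_s / after_s;
}

// Round a positive value to two decimal places, as in the figures quoted above.
inline double round2(double v) {
    return std::floor(v * 100.0 + 0.5) / 100.0;
}
```

For example, `round2(speedup(1.26, 0.31))` gives 4.06 and `round2(speedup(24.36, 18.87))` gives 1.29, matching the 2048-atom build time and the 8192-atom wall time.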

Final energy, potential energy, kinetic energy, temperature, and pressure matched across the tested thread counts.

Tests

  • git diff --check
  • OpenMP ON focused build
  • OpenMP ON focused ctest: 4/4 passed
  • OpenMP OFF focused build
  • OpenMP OFF focused ctest: 4/4 passed
  • GitHub check on 6dba4951c: passed

Copilot AI review requested due to automatic review settings June 26, 2026 13:33

Copilot AI left a comment

Pull request overview

This PR improves molecular-dynamics neighbor-list construction performance by introducing an OpenMP-enabled count/allocate/fill path in BinManager::build_atom_neighbors(), while keeping serial behavior for small/single-thread/OpenMP-OFF cases. It also adds lightweight timing instrumentation around key neighbor-search phases to make the runtime impact visible in profiling.

Changes:

  • Add an OpenMP parallel neighbor-list build path using per-atom counting, serial allocation via PageAllocator, then parallel fill while preserving neighbor order.
  • Factor out the bin-scan logic into a reusable BinManager::visit_neighbors() helper to avoid duplicating traversal code between serial and parallel paths.
  • Add ModuleBase::timer start/end instrumentation in NeighborSearch and BinManager hot-path methods.
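The shared-traversal idea in the second bullet can be illustrated with a visitor template. The signature below is an assumption, since the real BinManager::visit_neighbors() declaration is not shown in this conversation, and a 1D distance test stands in for the bin scan. Because counting and filling call the same traversal, they see neighbors in the same order.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical traversal helper: calls visit(j) for every neighbor j of atom i,
// always in ascending j order.
template <typename Visitor>
void visit_neighbors(const std::vector<double>& x, int i, double cutoff, Visitor visit) {
    assert(i >= 0 && i < static_cast<int>(x.size()));
    for (int j = 0; j < static_cast<int>(x.size()); ++j) {
        if (j != i && std::fabs(x[i] - x[j]) < cutoff) { visit(j); }
    }
}

// Count pass: the visitor only increments a counter.
int count_neighbors(const std::vector<double>& x, int i, double cutoff) {
    int count = 0;
    visit_neighbors(x, i, cutoff, [&count](int) { ++count; });
    return count;
}

// Fill pass: the visitor appends each ID to a pre-sized segment.
// The caller must provide room for count_neighbors(x, i, cutoff) entries.
void fill_neighbors(const std::vector<double>& x, int i, double cutoff, int* out) {
    int k = 0;
    visit_neighbors(x, i, cutoff, [&k, out](int j) { out[k++] = j; });
}
```

The lambdas are non-generic, so the sketch stays within C++11.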

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| source/source_cell/module_neighlist/neighbor_search.cpp | Adds timers around set_member_variables(), init(), and build_neighbors() to profile neighbor-search stages. |
| source/source_cell/module_neighlist/bin_manager.h | Declares visit_neighbors() helper to share bin traversal logic used by both serial and OpenMP neighbor building. |
| source/source_cell/module_neighlist/bin_manager.cpp | Implements the OpenMP count/allocate/fill neighbor build path, the shared visit_neighbors() traversal, and timers around binning/build steps. |
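For readers unfamiliar with start/end instrumentation of this kind, the sketch below shows the general pattern using only `std::chrono`. `PhaseTimers` is a hypothetical class written for illustration; it is not the ModuleBase::timer API, whose exact signatures are not shown here.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

// Hypothetical registry that accumulates wall time and call counts per named phase.
class PhaseTimers {
    typedef std::chrono::steady_clock Clock;

public:
    // Record the start time of a phase.
    void start(const std::string& name) { started_[name] = Clock::now(); }

    // Add the elapsed time since the matching start() to the phase total.
    void end(const std::string& name) {
        std::map<std::string, Clock::time_point>::iterator it = started_.find(name);
        assert(it != started_.end());
        if (it == started_.end()) { return; }  // unmatched end() is ignored in release builds
        total_[name] += std::chrono::duration<double>(Clock::now() - it->second).count();
        ++calls_[name];
        started_.erase(it);
    }

    // Accumulated seconds for a phase, or 0 if it never ran.
    double seconds(const std::string& name) const {
        std::map<std::string, double>::const_iterator it = total_.find(name);
        return it == total_.end() ? 0.0 : it->second;
    }

    // Number of completed start/end pairs for a phase.
    int calls(const std::string& name) const {
        std::map<std::string, int>::const_iterator it = calls_.find(name);
        return it == calls_.end() ? 0 : it->second;
    }

private:
    std::map<std::string, Clock::time_point> started_;
    std::map<std::string, double> total_;
    std::map<std::string, int> calls_;
};
```

This sketch is not thread-safe, so start/end pairs would sit outside any parallel region, around the whole build step rather than inside the per-atom loops.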


Audrey-777 changed the title from "Perf/Optimize neighbor-list construction with OpenMP" to "perf/Optimize neighbor-list construction with OpenMP" on Jun 26, 2026

2 participants