Skip to content

Refactor: optimize MD and NEP hot paths with OpenMP-ready loops#7518

Open
Audrey-777 wants to merge 4 commits into
deepmodeling:developfrom
Audrey-777:refactor/md-openmp-pr-v2
Open

Refactor: optimize MD and NEP hot paths with OpenMP-ready loops#7518
Audrey-777 wants to merge 4 commits into
deepmodeling:developfrom
Audrey-777:refactor/md-openmp-pr-v2

Conversation

@Audrey-777

Copy link
Copy Markdown

Summary

This PR refactors selected MD and NEP/DPMD hot paths and adds low-risk OpenMP-ready loop optimizations.

Main changes:

  • Parallelize independent per-atom MD update/helper loops.
  • Refactor MD kinetic/stress state handling into reusable helper APIs.
  • Reuse NEP/DPMD interface buffers to reduce repeated allocations.
  • Parallelize NEP coordinate fill, energy reduction, force copy-back, and virial reduction loops.
  • Refactor MD unit tests with a shared fixture and add direct coverage for MD state helpers.
  • Use std::unique_ptr for MD runner lifetime management.

This PR intentionally does not include LJ runner parallelization, MPI domain decomposition, CUDA work, benchmark scripts, or unrelated PW/Hamilt changes.

Validation

Local checks passed:

  • git diff --check upstream/develop...HEAD
  • OpenMP ON/OFF MD + ESolver builds
  • Selected MD unit tests:
    • MODULE_MD_func
    • MODULE_MD_LJ_pot
    • MODULE_MD_fire
    • MODULE_MD_verlet
    • MODULE_MD_nhc
    • MODULE_MD_msst

Performance Notes

A 2048-atom LJ NVE case on a 16-logical-CPU machine did not show stable end-to-end speedup:

  • 1 thread: 10.73s
  • 2 threads: 10.66s
  • 4 threads: 10.63s
  • 8 threads: 11.28s
  • 16 threads: 13.65s

This is expected because this PR does not parallelize the LJ runner, which remains the main full-MD bottleneck.

MPI-only and MPI+OpenMP hybrid tests were also slower than serial/OpenMP-only because the current MD/LJ path does not partition LJ force work across MPI ranks.

Copilot AI review requested due to automatic review settings June 25, 2026 18:39

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors several Molecular Dynamics (MD) hot paths and NEP/DPMD solver loops to be OpenMP-ready, while also consolidating MD unit test setup/teardown into shared fixtures and introducing small helper APIs for MD kinetic/stress state computation.

Changes:

  • Added OpenMP-ready reductions/loops in MD helpers and per-atom update paths, and introduced MDKineticState / MDStressState helper APIs.
  • Reduced per-step allocations in NEP/DPMD by reusing interface buffers and parallelized several NEP loops (coord fill, energy/virial reductions, force copy-back).
  • Refactored MD unit tests to use shared fixtures (md_test_fixture.h) and migrated runner ownership to std::unique_ptr in tests and run_md.cpp.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
source/source_md/test/verlet_test.cpp Switches Verlet tests to shared integrator fixture.
source/source_md/test/nhchain_test.cpp Uses shared MD test base + unique_ptr runner ownership.
source/source_md/test/msst_test.cpp Switches MSST tests to shared integrator fixture and updates casts for unique_ptr.
source/source_md/test/md_test_fixture.h Adds shared gtest fixtures for MD integrators, MD_func, and LJ pot tests.
source/source_md/test/md_func_test.cpp Uses shared fixture; adds direct tests for new kinetic/stress state helpers.
source/source_md/test/lj_pot_test.cpp Uses shared fixture and migrates esolver ownership to unique_ptr.
source/source_md/test/langevin_test.cpp Switches Langevin tests to shared integrator fixture.
source/source_md/test/fire_test.cpp Switches FIRE tests to shared integrator fixture and updates casts for unique_ptr.
source/source_md/test/CMakeLists.txt Updates MD test build dependency list (per diff).
source/source_md/run_md.cpp Introduces unique_ptr MD runner creation and removes manual delete.
source/source_md/md_statistics.h Adds MDKineticState / MDStressState structs.
source/source_md/md_func.h Declares new state helper APIs and includes md_statistics.h.
source/source_md/md_func.cpp Adds OpenMP-ready loops/reductions; implements state helpers; refactors stress/temp computations.
source/source_md/md_base.cpp Adds OpenMP-ready per-atom loops for update_pos / update_vel.
source/source_esolver/esolver_nep.h Adds persistent NEP buffers and atom index mapping fields.
source/source_esolver/esolver_nep.cpp Reuses NEP buffers; adds OpenMP-ready reductions and loop parallelism.
source/source_esolver/esolver_dp.h Adds persistent DP buffers for cell/coord and raw model outputs.
source/source_esolver/esolver_dp.cpp Reuses DP buffers and avoids repeated temporary vector allocations.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 99 to 103
{
if (cal_stress)
{
ModuleBase::matrix t_vector;

temp_vector(unit_in.nat, vel, allmass, t_vector);

for (int i = 0; i < 3; ++i)
{
for (int j = 0; j < 3; ++j)
{
stress(i, j) = virial(i, j) + t_vector(i, j) / unit_in.omega;
}
}
stress = calc_stress_state(unit_in.nat, unit_in.omega, vel, allmass, virial).stress;
}

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. compute_stress() now fills the caller-provided stress matrix in place again instead of move-assigning from calc_stress_state(...).stress, so the hot path keeps reusing the existing 3x3 storage. calc_stress_state() remains available for tests/value-returning callers.

Comment thread source/source_md/test/md_test_fixture.h Outdated
Comment on lines +10 to +11
#include <memory>
#include <vector>

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. md_test_fixture.h now temporarily undefines private / protected before including gtest, , and , then restores the macros before including project headers, keeping the access-control hack scoped to project code.

Comment on lines +79 to +81
temp_vector(natom, vel, allmass, state.temperature_tensor);
state.stress.create(3, 3);

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. calc_stress_state() now uses state.stress.create(3, 3, false) because all 9 entries are filled immediately.

Comment thread source/source_md/md_func.cpp Outdated
@@ -475,17 +506,45 @@
{
t_vector.create(3, 3);

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. temp_vector() now uses t_vector.create(3, 3, false) because all 9 entries are overwritten before return.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants