Fix rma proxy connect cleanup#2205

Open
wanglei875 wants to merge 1 commit into NVIDIA:dev from wanglei875:fix-rma-proxy-connect-cleanup

@wanglei875

Contributor

Description

Summary

Clean up partially initialized RMA proxy resources when ncclRmaProxyConnectOnce() fails.

The failure path now closes:

  • the current listen communicator if listen() succeeded
  • any created RMA proxy contexts
  • any established RMA collective communicators

Also routes rmaProxyCtxs allocation through the existing failure cleanup path.
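
For readers unfamiliar with the pattern, the cleanup described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code in src/rma/rma_proxy.cc: FakeNet, ProxyState, connectOnce, allocFlags, and CHECK_GOTO are all hypothetical stand-ins, and the real function's signature and plugin calls may differ. It only shows the idea that every fallible step, including the context array allocation, jumps to one failure label that releases whatever was set up so far.

```cpp
#include <cassert>
#include <cstdlib>

enum Result { kSuccess = 0, kError = 1 };

// Hypothetical stand-in for the network plugin: counts open resources and
// can be told to fail at the Nth fallible call.
struct FakeNet {
  int failAtStep = -1;  // index of the fallible call that fails; -1 = never
  int step = 0;
  int openListen = 0, openCtxs = 0, openColls = 0;

  bool shouldFail() { return step++ == failAtStep; }

  bool* allocFlags(int n) {
    if (shouldFail()) return nullptr;
    return static_cast<bool*>(std::calloc(n, sizeof(bool)));
  }
  Result open(bool* handle, int* counter) {
    if (shouldFail()) return kError;
    *handle = true;
    ++*counter;
    return kSuccess;
  }
  // Closing is a no-op for handles that were never opened.
  void close(bool* handle, int* counter) {
    if (*handle) { *handle = false; --*counter; }
  }
};

// Hypothetical owner of the resources once the connect succeeds.
struct ProxyState {
  int nCtxs = 0;
  bool* ctxs = nullptr;
  bool* colls = nullptr;
};

#define CHECK_GOTO(call, ret, label) \
  do { ret = (call); if (ret != kSuccess) goto label; } while (0)

Result connectOnce(FakeNet& net, int nCtxs, ProxyState* state) {
  assert(nCtxs > 0 && state != nullptr);
  Result ret = kSuccess;
  bool listening = false;
  bool* ctxs = nullptr;
  bool* colls = nullptr;

  // Allocation failures take the same cleanup path as every other step.
  ctxs = net.allocFlags(nCtxs);
  if (ctxs == nullptr) { ret = kError; goto fail; }
  colls = net.allocFlags(nCtxs);
  if (colls == nullptr) { ret = kError; goto fail; }

  for (int i = 0; i < nCtxs; i++) {
    CHECK_GOTO(net.open(&listening, &net.openListen), ret, fail);
    CHECK_GOTO(net.open(&ctxs[i], &net.openCtxs), ret, fail);
    CHECK_GOTO(net.open(&colls[i], &net.openColls), ret, fail);
    net.close(&listening, &net.openListen);
  }
  state->nCtxs = nCtxs;
  state->ctxs = ctxs;
  state->colls = colls;
  return kSuccess;

fail:
  // Close the current listen comm if listen() succeeded, then every
  // collective comm and context created before the failure.
  net.close(&listening, &net.openListen);
  for (int i = 0; i < nCtxs; i++) {
    if (colls != nullptr) net.close(&colls[i], &net.openColls);
    if (ctxs != nullptr) net.close(&ctxs[i], &net.openCtxs);
  }
  std::free(colls);
  std::free(ctxs);
  return ret;
}
```

In this sketch, setting failAtStep to any fallible step and calling connectOnce(net, 2, &state) should leave all three FakeNet counters at zero and state untouched, which is the property the failure path in this PR is meant to guarantee.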

Testing

  • git diff --check -- src/rma/rma_proxy.cc
  • make -j$(nproc) src.build: compiled rma/rma_proxy.cc. The full build later failed on a pre-existing missing plugin_cleanup.h include in plugin/rma/rma_v13.cc.

Related Issues

Changes & Impact

Performance Impact

Copilot AI review requested due to automatic review settings June 4, 2026 07:22

Copilot AI left a comment

Copilot wasn't able to review this pull request because it exceeds the maximum number of files (300). Try reducing the number of changed files and requesting a review from Copilot again.

@wanglei875 wanglei875 changed the base branch from master to dev June 4, 2026 07:27
@wanglei875 wanglei875 force-pushed the fix-rma-proxy-connect-cleanup branch from edf9459 to 5a4fcb1 Compare June 4, 2026 07:32
Signed-off-by: WangLei <1539790288@qq.com>
@xiaofanl-nvidia

Collaborator

++ @jynv to review and mirror if it looks good.

@xiaofanl-nvidia xiaofanl-nvidia requested a review from jynv June 7, 2026 00:12

3 participants