[Question]: Potential ncclIntruQueueTransfer bug + memory leak? #2204

@miEsMar

Question

I am working with NCCL to test some new hardware functionality, and came across the ncclIntruQueueTransfer function:

template<typename T, T *T::*next>
void ncclIntruQueueTransfer(ncclIntruQueue<T,next> *dst, ncclIntruQueue<T,next> *src) {
  (dst->tail ? dst->tail->next : dst->head) = src->head;
  if (src->tail) dst->tail = src->tail;
  src->head = nullptr;
  src->tail = nullptr;
}

I noticed a potential bug and memory leak in it.
Specifically, the second statement, if (src->tail) dst->tail = src->tail;, overwrites dst->tail whenever src->tail is not NULL, regardless of the assignment the first statement may have made through dst->tail. If dst->tail is not a NULL pointer, this appears to lose src->head entirely when src->head is not NULL.

If this is the case, let me know if you welcome a PR suggesting a potential fix for it.
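
One way to check this concern is a small standalone harness that runs the function on two non-empty queues and then walks dst from its head. The sketch below is only an assumption-laden reproduction: Node, enqueue, values, TransferResult and transferDemo are hypothetical names made up for this test, and the ncclIntruQueue struct is a simplified stand-in with just head and tail (the real NCCL definition may differ). The body of ncclIntruQueueTransfer is copied from the snippet above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical node type for this sketch; its link field is named `next`.
struct Node {
  int value;
  Node *next;
};

// Simplified stand-in for NCCL's queue type (assumption: head and tail only).
template<typename T, T *T::*next>
struct ncclIntruQueue {
  T *head = nullptr;
  T *tail = nullptr;
};

// Hypothetical helper: append one node at the tail of the queue.
template<typename T, T *T::*next>
void enqueue(ncclIntruQueue<T,next> *q, T *x) {
  x->*next = nullptr;
  (q->tail ? q->tail->*next : q->head) = x;
  q->tail = x;
}

// Copied from the snippet in this issue. `->next` resolves to Node::next here
// because the hypothetical Node's link field is literally named `next`.
template<typename T, T *T::*next>
void ncclIntruQueueTransfer(ncclIntruQueue<T,next> *dst, ncclIntruQueue<T,next> *src) {
  (dst->tail ? dst->tail->next : dst->head) = src->head;
  if (src->tail) dst->tail = src->tail;
  src->head = nullptr;
  src->tail = nullptr;
}

using Queue = ncclIntruQueue<Node, &Node::next>;

// Walk a queue from head and collect the node values in order.
std::vector<int> values(const Queue &q) {
  std::vector<int> out;
  for (Node *n = q.head; n != nullptr; n = n->next) out.push_back(n->value);
  return out;
}

struct TransferResult {
  std::vector<int> dst;  // values reachable from dst->head after the transfer
  std::vector<int> src;  // values reachable from src->head after the transfer
  bool tailIsLast;       // whether dst->tail points at the last node of src
};

// Build dst = [1, 2] and src = [3, 4], transfer src into dst, report the result.
TransferResult transferDemo() {
  Node a{1, nullptr}, b{2, nullptr}, c{3, nullptr}, d{4, nullptr};
  Queue dst, src;
  enqueue(&dst, &a);
  enqueue(&dst, &b);
  enqueue(&src, &c);
  enqueue(&src, &d);

  ncclIntruQueueTransfer(&dst, &src);
  assert(src.head == nullptr && src.tail == nullptr);

  return TransferResult{values(dst), values(src), dst.tail == &d};
}
```

Calling transferDemo() from a main function and inspecting the returned struct shows which nodes are still reachable. In this sketch the first statement stores src->head into the next field of the old dst tail before dst->tail is reassigned, so the walk from dst->head still reaches the nodes that came from src. Whether this matches the behaviour inside NCCL depends on the real queue and node definitions.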
