Fix ServerMaintenance transitions to Pending when referenced Server is deleted#951
Fix ServerMaintenance transitions to Pending when referenced Server is deleted#951stefanhipfel wants to merge 1 commit into
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthrough
ChangesServerMaintenance Pending status on missing Server
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
internal/controller/servermaintenance_controller.go (1)
76-76: ⚡ Quick winAdd structured context fields to the orphan-deletion log.
This message should include object keys (for example,
ServerMaintenanceand referencedServer) to match controller logging conventions.Suggested change
- log.V(1).Info("Referenced Server not found, deleting ServerMaintenance") + log.V(1).Info( + "Referenced Server not found, deleting ServerMaintenance", + "ServerMaintenance", client.ObjectKeyFromObject(maintenance), + "Server", maintenance.Spec.ServerRef.Name, + )As per coding guidelines:
Use structured logging with key-value pairs following Kubernetes conventions - use log := log.FromContext(ctx); log.Info("msg", "key", val).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/controller/servermaintenance_controller.go` at line 76, The log statement at the orphan-deletion check lacks structured context fields required by Kubernetes logging conventions. Modify the log.V(1).Info() call to include key-value pairs for the ServerMaintenance object key and the referenced Server object key (for example, using "namespace" and "name" or similar identifying fields). Keep the message text the same but add the structured fields as additional arguments to the Info() method following the pattern log.Info("msg", "key", val, "key2", val2).Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/controller/servermaintenance_controller_test.go`:
- Around line 575-578: The test assertions using
Eventually(Get(server)).ShouldNot(Succeed()) and
Eventually(Get(serverMaintenance)).ShouldNot(Succeed()) are too loose and can
pass for unrelated API failures. Replace these checks to explicitly verify that
the resources are not found by using apierrors.IsNotFound to confirm the
intended deletion outcome instead of only checking that the Get operation
failed. This change needs to be applied at lines 575-578 (for the server
resource check) and at lines 605-611 (for the serverMaintenance resource check)
in the servermaintenance_controller_test.go file.
---
Nitpick comments:
In `@internal/controller/servermaintenance_controller.go`:
- Line 76: The log statement at the orphan-deletion check lacks structured
context fields required by Kubernetes logging conventions. Modify the
log.V(1).Info() call to include key-value pairs for the ServerMaintenance object
key and the referenced Server object key (for example, using "namespace" and
"name" or similar identifying fields). Keep the message text the same but add
the structured fields as additional arguments to the Info() method following the
pattern log.Info("msg", "key", val, "key2", val2).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a3338e7a-d4e1-4ca1-8282-8f65eab0548c
📒 Files selected for processing (2)
internal/controller/servermaintenance_controller.gointernal/controller/servermaintenance_controller_test.go
| Eventually(Get(server)).ShouldNot(Succeed()) | ||
|
|
||
| By("Expecting the ServerMaintenance to be deleted automatically") | ||
| Eventually(Get(serverMaintenance)).ShouldNot(Succeed()) |
There was a problem hiding this comment.
Assert NotFound explicitly instead of only “not succeed”.
These checks can pass on unrelated API failures; assert apierrors.IsNotFound to verify the intended deletion outcome.
Suggested change
+ apierrors "k8s.io/apimachinery/pkg/api/errors"
...
- Eventually(Get(server)).ShouldNot(Succeed())
+ Eventually(Get(server)).Should(Satisfy(apierrors.IsNotFound))
...
- Eventually(Get(serverMaintenance)).ShouldNot(Succeed())
+ Eventually(Get(serverMaintenance)).Should(Satisfy(apierrors.IsNotFound))
...
- Eventually(Get(server)).ShouldNot(Succeed())
+ Eventually(Get(server)).Should(Satisfy(apierrors.IsNotFound))
...
- Eventually(Get(serverMaintenance)).ShouldNot(Succeed())
+ Eventually(Get(serverMaintenance)).Should(Satisfy(apierrors.IsNotFound))Also applies to: 605-611
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/controller/servermaintenance_controller_test.go` around lines 575 -
578, The test assertions using Eventually(Get(server)).ShouldNot(Succeed()) and
Eventually(Get(serverMaintenance)).ShouldNot(Succeed()) are too loose and can
pass for unrelated API failures. Replace these checks to explicitly verify that
the resources are not found by using apierrors.IsNotFound to confirm the
intended deletion outcome instead of only checking that the Get operation
failed. This change needs to be applied at lines 575-578 (for the server
resource check) and at lines 605-611 (for the serverMaintenance resource check)
in the servermaintenance_controller_test.go file.
nagadeesh-nagaraja
left a comment
There was a problem hiding this comment.
seems like we both kind of working on similar topic:
I believe that the deletion should not be forced if in case its caused by flaky issue.
the deletion should be handled by the creator of the maintenance? let me know what you think.
| log.V(1).Info("Referenced Server not found, deleting ServerMaintenance") | ||
| if err := r.Delete(ctx, maintenance); err != nil { | ||
| return ctrl.Result{}, client.IgnoreNotFound(err) | ||
| } |
There was a problem hiding this comment.
dont you think this might cause any flaky issue to delete the maintenances?
I think the owner of the servermaintenance should be responsible for deleting the servermaintenance.
There was a problem hiding this comment.
why should we leave orphaned maintenances? Same way we don't leave orphaned servers around after their bmc got deleted.
No one will clean them up!
There was a problem hiding this comment.
maintenances are mostly created by someone..
example when BMCSettings or other similar operators creates it, it needs to be deleted by them as they are the owner of it. they should detect the ref server or BMC is gone and delete themself and hence delete the maintenance as well. (this is already done when the SET CRD detects the server is gone)
if its created by the user (human), then we can delete them. as its easier for user. may be we can check if it has a owner ref, and if so do not delete it.. let the owner handle it?
afritzler
left a comment
There was a problem hiding this comment.
I would suggest not to delete resources in the reconciliation flow: Following analogy: if you create a Pod with spec.nodeName and the Node is not there, nobody will in k8s delete the Pod. It will transition into a Pending state. For our ServerMaintenance I would suggest that whoever creates the ServerMaintenance should be responsible deleting it.
I don't fully agree, since people just don't delete their servermaintenances or just forget it. Can we at least put a different state to the serverMaintenance object? Otherwise it would keep |
c866f34 to
50e70b0
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (3)
internal/controller/servermaintenance_controller_test.go (3)
563-563:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAssert
NotFoundexplicitly instead of only "not succeed".The check can pass on unrelated API failures; assert
apierrors.IsNotFoundto verify the intended deletion outcome.🔍 Proposed fix
+ apierrors "k8s.io/apimachinery/pkg/api/errors" ... - Eventually(Get(server)).ShouldNot(Succeed()) + Eventually(Get(server)).Should(Satisfy(apierrors.IsNotFound))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/controller/servermaintenance_controller_test.go` at line 563, The assertion using `Eventually(Get(server)).ShouldNot(Succeed())` is too broad and will pass when the Get operation fails for any reason (network errors, permissions issues, etc.), not just when the resource is deleted. Replace this generic success check with an explicit assertion using `apierrors.IsNotFound` to verify that the Get operation fails specifically because the resource was not found, ensuring the test validates the intended deletion outcome rather than just any API failure.
591-591:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAssert
NotFoundexplicitly instead of only "not succeed".These checks can pass on unrelated API failures; assert
apierrors.IsNotFoundto verify the intended deletion outcome.🔍 Proposed fix
+ apierrors "k8s.io/apimachinery/pkg/api/errors" ... - Eventually(Get(server)).ShouldNot(Succeed()) + Eventually(Get(server)).Should(Satisfy(apierrors.IsNotFound)) ... - Eventually(Get(serverMaintenance)).ShouldNot(Succeed()) + Eventually(Get(serverMaintenance)).Should(Satisfy(apierrors.IsNotFound))Also applies to: 597-597
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/controller/servermaintenance_controller_test.go` at line 591, The test assertions at line 591 and line 597 in internal/controller/servermaintenance_controller_test.go currently only verify that the Get operation fails, but do not confirm the failure is specifically a NotFound error. Replace the ShouldNot(Succeed()) assertions with explicit checks using apierrors.IsNotFound to verify that the server object is actually deleted and returns a 404 NotFound error, rather than failing due to other unrelated API failures.
637-637:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAssert
NotFoundexplicitly instead of only "not succeed".The check can pass on unrelated API failures; assert
apierrors.IsNotFoundto verify the intended deletion outcome.🔍 Proposed fix
+ apierrors "k8s.io/apimachinery/pkg/api/errors" ... - Eventually(Get(serverMaintenance)).ShouldNot(Succeed()) + Eventually(Get(serverMaintenance)).Should(Satisfy(apierrors.IsNotFound))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/controller/servermaintenance_controller_test.go` at line 637, The assertion at the Get(serverMaintenance) call only verifies that the operation did not succeed, but this can pass for any API failure, not specifically deletion. Replace the ShouldNot(Succeed()) check with an explicit assertion using apierrors.IsNotFound to verify that the serverMaintenance resource was actually deleted and not just failing for an unrelated reason. This ensures the test validates the intended deletion outcome rather than any arbitrary API failure.
🧹 Nitpick comments (1)
internal/controller/servermaintenance_controller_test.go (1)
566-566: 💤 Low valueRemove redundant
Equal()wrapper.
HaveFieldalready performs equality checking, so wrapping the expected value inEqual()is unnecessary.♻️ Simplification
- Eventually(Object(serverMaintenance)).Should(HaveField("Status.State", Equal(metalv1alpha1.ServerMaintenanceStatePending))) + Eventually(Object(serverMaintenance)).Should(HaveField("Status.State", metalv1alpha1.ServerMaintenanceStatePending))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/controller/servermaintenance_controller_test.go` at line 566, Remove the redundant Equal() wrapper from the HaveField assertion. In the Eventually(Object(serverMaintenance)).Should(HaveField(...)) call, HaveField already performs equality checking internally, so pass metalv1alpha1.ServerMaintenanceStatePending directly as the second argument to HaveField instead of wrapping it with Equal().
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@internal/controller/servermaintenance_controller_test.go`:
- Line 563: The assertion using `Eventually(Get(server)).ShouldNot(Succeed())`
is too broad and will pass when the Get operation fails for any reason (network
errors, permissions issues, etc.), not just when the resource is deleted.
Replace this generic success check with an explicit assertion using
`apierrors.IsNotFound` to verify that the Get operation fails specifically
because the resource was not found, ensuring the test validates the intended
deletion outcome rather than just any API failure.
- Line 591: The test assertions at line 591 and line 597 in
internal/controller/servermaintenance_controller_test.go currently only verify
that the Get operation fails, but do not confirm the failure is specifically a
NotFound error. Replace the ShouldNot(Succeed()) assertions with explicit checks
using apierrors.IsNotFound to verify that the server object is actually deleted
and returns a 404 NotFound error, rather than failing due to other unrelated API
failures.
- Line 637: The assertion at the Get(serverMaintenance) call only verifies that
the operation did not succeed, but this can pass for any API failure, not
specifically deletion. Replace the ShouldNot(Succeed()) check with an explicit
assertion using apierrors.IsNotFound to verify that the serverMaintenance
resource was actually deleted and not just failing for an unrelated reason. This
ensures the test validates the intended deletion outcome rather than any
arbitrary API failure.
---
Nitpick comments:
In `@internal/controller/servermaintenance_controller_test.go`:
- Line 566: Remove the redundant Equal() wrapper from the HaveField assertion.
In the Eventually(Object(serverMaintenance)).Should(HaveField(...)) call,
HaveField already performs equality checking internally, so pass
metalv1alpha1.ServerMaintenanceStatePending directly as the second argument to
HaveField instead of wrapping it with Equal().
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 870e4f3b-e13e-49a3-bb45-9b68247bb783
📒 Files selected for processing (2)
internal/controller/servermaintenance_controller.gointernal/controller/servermaintenance_controller_test.go
When the referenced Server is not found, transition ServerMaintenance to Pending instead of deleting it. This follows the Kubernetes PVC pattern where a resource waits in Pending when its dependency is missing, and keeps the creator responsible for cleanup. Signed-off-by: Stefan Hipfel <stefan.hipfel@sap.com>
50e70b0 to
f3ae5a8
Compare
Fixes #367
When a BMC is deleted, Kubernetes cascades deletion to owned
Serverresources.ServerMaintenanceobjects referencing those servers were left orphaned due to two bugs in theServerMaintenanceReconciler:reconcile()returned silently onNotFoundserver before adding the finalizer — so the resource was never garbage collecteddelete()propagatedNotFoundas an error, blocking finalizer removal permanentlyBoth paths now handle a missing
Servergracefully.Summary by CodeRabbit
Bug Fixes
Tests