DPTP-2938: pod-scaler add cpu/memory authoritative max reduction flags#5254

Open
deepsm007 wants to merge 1 commit into
openshift:mainfrom
deepsm007:feat/authoritative-max-reduction-percent

@deepsm007

deepsm007 commented Jun 16, 2026

Contributor

Add configurable per-admission CPU/memory reduction caps for authoritative pod-scaler mode, defaulting to no cap in code with 25% set via release deployment args.

/cc @smg247

openshift/release#80602

Pod-Scaler: Configurable Resource Reduction Caps for Authoritative Mode

This PR updates the pod-scaler admission webhook’s authoritative mode so CPU and memory reductions are no longer limited by a single hardcoded cap. Instead, operators can configure independent maximum reduction limits per resource.

What Changed

  • The pod-scaler admission webhook now supports two new per-resource flags to cap how much it can reduce CPU and memory in authoritative mode:
    • --authoritative-cpu-max-reduction-percent (a fraction despite the name; default: 1.0, i.e., no cap)
    • --authoritative-memory-max-reduction-percent (a fraction despite the name; default: 1.0, i.e., no cap)
  • Input validation ensures both fractions are within [0, 1].
  • The reduction decision/mutation logic threads the CPU and memory max-reduction values through the authoritative selection path (including dry-run behavior).
  • Dry-run logging/telemetry now reports, through a boolean reduction_capped field, whether a reduction was actually capped.

Practical Impact for CI Operators

  • You can roll out authoritative scaling more safely by constraining how aggressive it can be.
  • The code defaults to no cap; the intended production rollout is to set a ~25% cap via release/deployment arguments.

Tests

  • Updated admission and authoritative reduction unit tests to match the new parameterized mutation logic.
  • Added coverage for an authoritative uncapped memory case.

Issue Reference / Bot Note

  • The PR references DPTP-2938.
  • A CI bot comment notes the Jira issue lacks a configured target version, while the expected target version for the main branch is 5.0.0.

@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will use /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: automatic mode

openshift-ci-robot added the jira/valid-reference label ("Indicates that this PR references a valid Jira ticket of any type.") Jun 16, 2026
@openshift-ci-robot

openshift-ci-robot commented Jun 16, 2026

Contributor

@deepsm007: This pull request references DPTP-2938 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Add configurable per-admission CPU/memory reduction caps for authoritative pod-scaler mode, defaulting to no cap in code with 25% set via release deployment args.

/cc @smg247

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Jun 16, 2026

📝 Walkthrough


Replaces a single hardcoded authoritativeMaxReductionPercent constant with separate CLI-configurable CPU and memory reduction fraction parameters. These flow from new options struct fields and --authoritative-cpu/memory-max-reduction-percent flags through mainAdmission → admit → mutatePodResources → useOursIfLarger, where the capping condition now selects the per-resource limit and logs a boolean reduction_capped.
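
The capping condition described above can be sketched as follows. This is not the PR's `useOursIfLarger`; `capReduction` and `maxReductionFor` are hypothetical helpers, and the arithmetic (a floor of `current * (1 - maxReduction)`) is an assumption about how a "maximum reduction fraction" is applied. Quantities are plain `int64` values (millicores or bytes) rather than Kubernetes `resource.Quantity`, to keep the example dependency-free.

```go
package main

import "fmt"

// maxReductionFor picks the per-resource limit, mirroring the idea that the
// capping condition "selects the per-resource limit". Hypothetical helper.
func maxReductionFor(resource string, cpuMax, memoryMax float64) float64 {
	if resource == "cpu" {
		return cpuMax
	}
	return memoryMax
}

// capReduction returns the value to apply and whether the cap took effect.
// maxReduction is a fraction in [0, 1]; 1.0 means "no cap".
// Assumption: the cap is a floor relative to the current request.
func capReduction(current, recommended int64, maxReduction float64) (int64, bool) {
	if recommended >= current {
		return recommended, false // not a reduction, nothing to cap
	}
	// int64() truncates toward zero, so the floor can be up to one unit
	// lower than the exact product.
	floor := int64(float64(current) * (1 - maxReduction))
	if recommended < floor {
		return floor, true
	}
	return recommended, false
}

func main() {
	cpuMax, memoryMax := 0.25, 1.0

	// CPU: 4000m recommended down to 1000m, capped at a 25% reduction.
	cpu, cpuCapped := capReduction(4000, 1000, maxReductionFor("cpu", cpuMax, memoryMax))
	fmt.Printf("cpu=%dm reduction_capped=%t\n", cpu, cpuCapped)

	// Memory: 8Gi recommended down to 2Gi, uncapped because the fraction is 1.0.
	mem, memCapped := capReduction(8<<30, 2<<30, maxReductionFor("memory", cpuMax, memoryMax))
	fmt.Printf("memory=%d reduction_capped=%t\n", mem, memCapped)
}
```

Under these assumptions the CPU request is held at 3000m with `reduction_capped=true`, while the memory request drops all the way to 2Gi with `reduction_capped=false`, which is the asymmetric case the new uncapped-memory test is described as covering.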

Changes

Per-resource authoritative max reduction percent

| Layer / File(s) | Summary |
| --- | --- |
| CLI flags, options, and mainAdmission wiring (cmd/pod-scaler/main.go) | options struct adds authoritativeCPUMaxReductionPercent and authoritativeMemoryMaxReductionPercent; bindOptions registers both as CLI flags defaulting to 1.0; validate() enforces [0, 1] range; mainAdmission passes both to admit(). |
| Admission webhook function signatures and parameter threading (cmd/pod-scaler/admission.go) | admit() signature accepts new CPU and memory max reduction parameters; podMutator struct stores both and /pods handler receives them; mutatePodResources() signature expands to include both; Handle and mutatePodResources call sites updated to pass parameters through. |
| Per-resource capping condition and dry-run logging (cmd/pod-scaler/admission.go) | Removes single authoritativeMaxReductionPercent constant and introduces authoritativeMinCPURequest; useOursIfLarger gains separate CPU/memory max reduction params; selects correct limit per field, rewrites capping condition to compare against it, and switches dry-run logging to boolean reduction_capped. |
| Test call site updates and new uncapped memory test (cmd/pod-scaler/admission_test.go) | All existing useOursIfLarger and mutatePodResources test call sites updated to pass two additional float arguments; new TestUseOursIfLarger_authoritativeUncappedMemory verifies memory reduction is not capped when memory max reduction fraction is 1.0. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 16 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 13.33% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (16 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main change: adding configurable CPU and memory authoritative max reduction flags to pod-scaler, matching the substantial changes across admission.go, admission_test.go, and main.go. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Go Error Handling | ✅ Passed | The PR follows proper Go error handling: no ignored errors, errors are wrapped with %w where needed, no panic() usage, and nil pointers are checked before dereferencing. |
| Test Coverage For New Features | ✅ Passed | New feature includes test coverage for core capping logic: TestUseOursIfLarger_authoritativeUncappedMemory validates asymmetric per-resource reduction caps, and existing tests verify capped/uncappe... |
| Stable And Deterministic Test Names | ✅ Passed | The PR uses Go's standard testing framework (not Ginkgo). All test function names are static and descriptive: TestUseOursIfLarger_authoritativeUncappedMemory, TestMutatePodResources, etc. No dynami... |
| Test Structure And Quality | ✅ Passed | The admission_test.go file uses standard Go testing (_test.go with func Test(t *testing.T)), not Ginkgo framework. The custom check is inapplicable to this codebase. |
| Microshift Test Compatibility | ✅ Passed | PR adds no new Ginkgo e2e tests; only updates unit tests in admission_test.go using standard Go testing framework. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No Ginkgo e2e tests were added in this PR. Changes are limited to unit/integration tests in cmd/pod-scaler/, which use standard Go testing package, not Ginkgo. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR modifies only resource sizing/reduction logic via new CPU/memory max-reduction parameters. No scheduling constraints, affinity rules, deployment manifests, or topology-dependent logic were intro... |
| Ote Binary Stdout Contract | ✅ Passed | pod-scaler is not an OTE test binary; it's a Kubernetes admission webhook service. The OTE Binary Stdout Contract check does not apply to this PR. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No Ginkgo e2e tests were added. Changes are standard Go unit tests for pod-scaler admission webhook logic in cmd/pod-scaler/. Check is not applicable. |
| No-Weak-Crypto | ✅ Passed | Pull request modifies pod-scaler resource mutation logic with configurable CPU/memory reduction caps; no cryptographic operations, weak algorithms, custom crypto, or insecure comparisons present. |
| Container-Privileges | ✅ Passed | PR modifies only Go source code (admission.go, admission_test.go, main.go) for pod-scaler admission webhook logic. No Kubernetes manifests, Dockerfiles, or container security configurations are mod... |
| No-Sensitive-Data-In-Logs | ✅ Passed | The PR logs pod resource quantities, workload names, and a new reduction_capped boolean flag. No passwords, tokens, API keys, PII, session IDs, internal hostnames, or customer data are logged. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci

openshift-ci Bot commented Jun 16, 2026

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deepsm007

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci Bot added the approved label ("Indicates a PR has been approved by an approver from all required OWNERS files.") Jun 16, 2026

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
cmd/pod-scaler/admission.go (1)

40-40: 🏗️ Heavy lift

Reduce parameter plumbing by grouping authoritative settings into a config struct.

Line 40, Line 236, and Line 414 continue expanding already-large signatures. This is getting brittle for call-site ordering and future extension; bundle these related knobs into one typed config and pass that through.

As per coding guidelines, “Keep function signatures small — if a function takes more than 3-4 parameters, consider grouping them into an options struct.”

Also applies to: 236-236, 414-414

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/pod-scaler/admission.go` at line 40, The admit function signature on line
40 (and its call sites at lines 236 and 414) has excessive parameters,
particularly the authoritative-related settings (authoritativeCPU,
authoritativeMemory, authoritativeCPUDryRun, authoritativeMemoryDryRun,
authoritativeCPUMaxReductionPercent, authoritativeMemoryMaxReductionPercent).
Create a new config struct to bundle these related authoritative settings
together, then refactor the admit function to accept this struct instead of
individual parameters. Update all call sites at lines 236 and 414 to instantiate
and pass this config struct instead of passing the individual authoritative
parameters separately.

Source: Coding guidelines
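
One possible shape for the grouping suggested in this nitpick is sketched below. The type name `authoritativeConfig`, its field names, the `bool` types for the four mode and dry-run settings, and the helper functions are all assumptions for illustration; the review comment only lists the six parameter names and does not prescribe a design.

```go
package main

import "fmt"

// authoritativeConfig is a hypothetical bundle of the six authoritative
// settings named in the review comment. Field types are assumed.
type authoritativeConfig struct {
	CPU                       bool
	Memory                    bool
	CPUDryRun                 bool
	MemoryDryRun              bool
	CPUMaxReductionPercent    float64
	MemoryMaxReductionPercent float64
}

// maxReduction returns the limit for the given resource, so callers no
// longer need to carry two positional float arguments in a fixed order.
func (c authoritativeConfig) maxReduction(resource string) float64 {
	if resource == "cpu" {
		return c.CPUMaxReductionPercent
	}
	return c.MemoryMaxReductionPercent
}

// describe stands in for a function such as admit or mutatePodResources
// that would accept the struct instead of six separate parameters.
func describe(cfg authoritativeConfig) string {
	return fmt.Sprintf("cpu cap %.2f, memory cap %.2f",
		cfg.maxReduction("cpu"), cfg.maxReduction("memory"))
}

func main() {
	cfg := authoritativeConfig{
		CPU:                       true,
		Memory:                    true,
		CPUMaxReductionPercent:    0.25,
		MemoryMaxReductionPercent: 1.0,
	}
	fmt.Println(describe(cfg))
}
```

With named fields, adding a future setting changes the struct and the code that reads it, but not every intermediate signature or test call site.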

cmd/pod-scaler/main.go (1)

111-112: Ensure deployment manifests set the intended 25% cap.

Line 111 and Line 112 default both reductions to 1.0 (no cap). The linked openshift/release admission deployment currently does not set these new flags, so behavior stays uncapped until that follow-up manifest update lands.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/pod-scaler/main.go` around lines 111 - 112, The flags
authoritativeCPUMaxReductionPercent and authoritativeMemoryMaxReductionPercent
currently default to 1.0 (no cap) in the code. To enforce the intended 25%
reduction cap, update the deployment manifests in the openshift/release
repository to explicitly set the command-line flags
--authoritative-cpu-max-reduction-percent and
--authoritative-memory-max-reduction-percent to 0.25 when deploying this
pod-scaler service, rather than relying on the uncapped defaults.

Source: Linked repositories

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@cmd/pod-scaler/admission.go`:
- Line 40: The admit function signature on line 40 (and its call sites at lines
236 and 414) has excessive parameters, particularly the authoritative-related
settings (authoritativeCPU, authoritativeMemory, authoritativeCPUDryRun,
authoritativeMemoryDryRun, authoritativeCPUMaxReductionPercent,
authoritativeMemoryMaxReductionPercent). Create a new config struct to bundle
these related authoritative settings together, then refactor the admit function
to accept this struct instead of individual parameters. Update all call sites at
lines 236 and 414 to instantiate and pass this config struct instead of passing
the individual authoritative parameters separately.

In `@cmd/pod-scaler/main.go`:
- Around line 111-112: The flags authoritativeCPUMaxReductionPercent and
authoritativeMemoryMaxReductionPercent currently default to 1.0 (no cap) in the
code. To enforce the intended 25% reduction cap, update the deployment manifests
in the openshift/release repository to explicitly set the command-line flags
--authoritative-cpu-max-reduction-percent and
--authoritative-memory-max-reduction-percent to 0.25 when deploying this
pod-scaler service, rather than relying on the uncapped defaults.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 52112767-757e-4bce-910c-53e3960cd03f

📥 Commits

Reviewing files that changed from the base of the PR and between 98989ed and 2d00cf3.

📒 Files selected for processing (3)
  • cmd/pod-scaler/admission.go
  • cmd/pod-scaler/admission_test.go
  • cmd/pod-scaler/main.go
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • openshift/release (manual)
  • openshift/ci-docs (manual)
  • openshift/release-controller (manual)
  • openshift/ci-chat-bot (manual)

@deepsm007

Contributor Author

/test images e2e

deepsm007 force-pushed the feat/authoritative-max-reduction-percent branch from 2d00cf3 to 90280b3 on June 16, 2026 15:32
@deepsm007

Contributor Author

/retest

@openshift-merge-bot

Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e

@deepsm007

Contributor Author

/override ci/prow/e2e

@openshift-ci

openshift-ci Bot commented Jun 16, 2026

Contributor

@deepsm007: Overrode contexts on behalf of deepsm007: ci/prow/e2e

Details

In response to this:

/override ci/prow/e2e

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@openshift-ci

openshift-ci Bot commented Jun 16, 2026

Contributor

@deepsm007: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

2 participants