
Policies

ScaleOps comes with several out-of-the-box production-ready policies that cover the vast majority of use cases.

ScaleOps ensures stability by not evicting certain pods under the following conditions:

  • Pod Annotations: Pods annotated with cluster-autoscaler.kubernetes.io/safe-to-evict: "false", karpenter.sh/do-not-evict: "true", or karpenter.sh/do-not-disrupt: "true" are exempt from eviction
  • Pod Disruption Budgets (PDBs): Before attempting eviction, ScaleOps respects currently applied Pod Disruption Budgets, confirming that at least one pod can be safely removed as per PDB allowances
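The two conditions above can be sketched as a simple check. This is an illustrative sketch, not ScaleOps code: the function names are hypothetical, and `disruptions_allowed` is assumed to come from the PDB's `status.disruptionsAllowed` field.

```python
# Annotation keys and values copied from the list above.
NO_EVICT_ANNOTATIONS = {
    "cluster-autoscaler.kubernetes.io/safe-to-evict": "false",
    "karpenter.sh/do-not-evict": "true",
    "karpenter.sh/do-not-disrupt": "true",
}


def is_exempt_by_annotation(annotations: dict) -> bool:
    """Return True if any of the listed annotations marks the pod as unevictable."""
    return any(annotations.get(key) == value
               for key, value in NO_EVICT_ANNOTATIONS.items())


def may_evict(annotations: dict, disruptions_allowed: int) -> bool:
    """Hypothetical helper: a pod may be evicted only if it carries no
    exempting annotation and its PDB currently allows at least one disruption."""
    return not is_exempt_by_annotation(annotations) and disruptions_allowed >= 1
```

For example, `may_evict({"karpenter.sh/do-not-disrupt": "true"}, 3)` is `False` because the annotation takes effect regardless of the PDB allowance.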

Ready-to-Deploy Policies

ScaleOps learns and understands your workload’s behavior and automatically applies the best policy for your workload.

You can also manually select a policy for your workload. The following policies are created by default:

The following are only created if you have a corresponding workload type:

Note: The policies are created in the scaleops-system namespace.


Production

This policy allocates additional capacity to accommodate potential activity spikes and minimize disruptions to ensure consistent workload performance. The workload will be dynamically adjusted to handle unexpected increases in activity.

High-Availability

For workloads that require extra availability guarantees. This policy sizes workloads with additional resources based on a longer history window to account for unexpected spikes in activity.

Recommended for: Infrastructure components such as Kafka, RabbitMQ, databases, etc.

Cost

A cost-effective and dynamic policy suited for non-production environments and workloads.

Batch

For workloads that require a policy with a longer history window to gather detailed insights, including for short-lived and scheduled workloads. Recommendations are based on extended history windows, and optimizations are applied during pod creation.

Recommended for: Batch workloads such as Jobs, CronJobs, GitLab Runners, and other scheduled workloads.

System

For system workloads that require additional availability guarantees, similar to the high-availability policy. It’s important to avoid any workload disruptions. Recommendations are based on an extended history window and high-percentile data. These recommendations are applied only after sufficient data coverage is achieved, and optimizations are not applied during the initial automation phase.

Recommended for: kube-system workloads

Weekly-Optimization

For workloads that require optimization for two different time periods, weekdays and weekends. This policy optimizes them based on historical data from the past week, using separate samples for weekdays and weekends.

Recommended for: Workloads with different usage patterns during weekdays and weekends
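Keeping separate samples for weekdays and weekends can be illustrated as follows. This is a minimal sketch: the function name is hypothetical, and treating Saturday and Sunday as the weekend is an assumption, not something this page specifies.

```python
from datetime import datetime


def split_by_day_type(samples):
    """Split (timestamp, value) samples into weekday and weekend values.

    Assumption: the weekend is Saturday and Sunday.
    """
    weekdays, weekends = [], []
    for timestamp, value in samples:
        # datetime.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
        if timestamp.weekday() >= 5:
            weekends.append(value)
        else:
            weekdays.append(value)
    return weekdays, weekends
```

Each of the two lists could then be analyzed independently to produce a weekday recommendation and a weekend recommendation.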

Daemonsets

For DaemonSets that have a large number of replicas and vary in size.

Recommended for: DaemonSet workloads

Java

A policy optimized for Java workloads that accounts for JVM overhead and startup requirements. ScaleOps automatically detects Java workloads and assigns this policy.

Recommended for: Java applications, Spring Boot services, microservices running on JVM

Prometheus

A policy for Prometheus workloads that require stable performance. Recommendations are based on high-percentile usage (p97) over an extended history window (96h). The policy applies moderate headroom (20%) and defined minimum resource buffers (100m CPU, 500Mi memory) to maintain reliability while avoiding over-provisioning.

Recommended for: Prometheus, Thanos, and other monitoring stack workloads

Policy Deep-Dive

For more advanced use cases, you can customize the policies.

Request Configuration

Policy Request

History Window

This parameter defines the length of the historical window used for analyzing usage data.

Duration | Use Case
12 hours | For cost efficiency
1 day - 2 days | For stable recommendations (recommended)
4 days - 7 days | For very stable recommendations

Request Headroom

The headroom parameter represents the percentage of resources to keep in reserve for your workload.

Percentage | Use Case
0% - 5% | For cost efficiency
5% - 10% | For production workloads (recommended)
10%+ | For extra capacity

Histogram Request Percentile

The percentile setting determines how closely our platform should track the resource usage of the optimized workload.

Percentile | Use Case
80% - 90% | For cost efficiency
90% - 95% | For stable and cost-effective recommendations
95% - 100% | For critical and burstable workloads that need extra capacity
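To show how the history window, percentile, and headroom settings could combine, here is a rough sketch. It is not the ScaleOps algorithm: the function names are hypothetical, the nearest-rank percentile method is an assumption, and `samples` is assumed to already contain only usage values from within the history window.

```python
import math


def nearest_rank_percentile(samples, percentile):
    """Nearest-rank percentile of a non-empty list of usage samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Rank is 1-based; clamp to 1 so that percentile=0 returns the minimum.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


def recommend_request(samples, percentile=95, headroom_pct=10):
    """Hypothetical recommendation: percentile usage plus headroom.

    Assumption: headroom is applied as a multiplier on the percentile value.
    """
    base = nearest_rank_percentile(samples, percentile)
    return base * (1 + headroom_pct / 100)
```

With ten CPU samples from 100m to 1000m, a 90th percentile and 10% headroom give a base of 900m and a request of about 990m. A lower percentile or smaller headroom yields a smaller, more cost-efficient request.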

Burst Reaction

ScaleOps analyzes historical data and reacts immediately to rapid bursts that require a resource update.

Option | Recommendation
Enabled | (recommended)
Disabled | -

Auto Healing

ScaleOps will automatically heal and recover pods suffering from CPU stress, out-of-memory issues, and pods with liveness probe errors due to lack of resources.

Option | Recommendation
Enabled | (recommended)
Disabled | -

Minimum Resource Request Boundaries

Determine the minimum CPU and memory boundaries for requests.

Resource | Recommended Value
CPU | 0.01
Memory (GiB) | 0.02

Maximum Resource Request Boundaries

Determine the maximum CPU and memory boundaries for a workload.

Resource | Value
CPU | 0.0
Memory (GiB) | 0.0

Set maximum according to max node size: ScaleOps will automatically set the maximum resource boundaries while considering the node capacity.

Option | Recommendation
Enabled | (recommended)
Disabled | -
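The effect of minimum and maximum boundaries can be sketched as a clamp. This is an illustration only: the function and parameter names are hypothetical, a maximum of `None` is used here to mean "no maximum configured", and the order in which the bounds are applied is an assumption.

```python
from typing import Optional


def clamp_request(value: float,
                  minimum: float,
                  maximum: Optional[float] = None,
                  node_allocatable: Optional[float] = None) -> float:
    """Clamp a recommended request to the configured boundaries.

    Assumptions: the minimum is applied first, then the smallest of the
    configured maximum and the node capacity (if either is given) caps the result.
    """
    bounded = max(value, minimum)
    upper_bounds = [b for b in (maximum, node_allocatable) if b is not None]
    if upper_bounds:
        bounded = min(bounded, min(upper_bounds))
    return bounded
```

For example, `clamp_request(0.005, 0.01)` returns `0.01`, and `clamp_request(12.0, 0.01, maximum=16.0, node_allocatable=7.5)` returns `7.5` because the node capacity is the tighter bound.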

Keep Original Requests

Prevent ScaleOps from changing the original resource requests.

Option | Recommendation
Enabled | -
Disabled | (recommended)

Integer CPU

Force CPU recommendations to integer values by rounding up the value calculated from workload demand and trends.

Option | Recommendation
Enabled | -
Disabled | (recommended)
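Rounding up to a whole number of CPUs can be sketched in one line. The function name is hypothetical, and the input is assumed to be a recommendation expressed in cores.

```python
import math


def integer_cpu(recommended_cores: float) -> int:
    """Round a CPU recommendation (in cores) up to the next whole core."""
    return math.ceil(recommended_cores)
```

A recommendation of 1.2 cores becomes 2 cores, while an exact 2.0 stays at 2.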

Memory replicas percentile

Recommend memory requests based on a user-defined usage percentile aggregated from all replicas. 100% (default) will use all replicas to determine memory usage.

Init Containers Optimization

Enable or disable automatic optimization of init container resource requests. Learn more about Init Containers Optimization.

Option | Recommendation
Enabled | (recommended)
Disabled | -

Ephemeral Storage

The Ephemeral Storage sub-section appears under Request when ephemeral storage optimization is enabled. Learn more about Ephemeral Storage Optimization.

Ephemeral Storage Policy

Specific ephemeral storage configuration:

Setting | Description | Default
Optimization Toggle | Enable or disable ephemeral storage optimization for workloads in this policy | Enabled
Auto-healing | Automatically adjusts recommendations when pods are evicted due to disk pressure, preventing repeated evictions | Enabled

Note: The “Max by node” toggle in the General section also applies to ephemeral storage, capping recommendations to the node’s allocatable ephemeral storage.

Limit Configuration

Policy Limit

Limit Strategy

Strategy | Description
Keep original limit (Recommended) | Retains the workload’s initial CPU, memory, or ephemeral storage limit
Set no limit | Removes resource limit. Recommended if CFS quota is disabled cluster-wide
Set limit as request | Aligns the resource limit with the recommended resource request, ensuring that the pod’s limit and request are set to the same value
Set Limit | Manually define a specific CPU or memory limit, overriding the original limit
Keep limit-request ratio | Maintains the original ratio between limit and request, to preserve the proportional balance originally set
Set limit-request ratio | Allows setting a custom ratio between limit and request

When ephemeral storage optimization is enabled, the Limit section includes an ephemeral storage limit strategy (keep original limit, set a limit-to-request ratio, etc.), configurable like CPU and memory.

Limit-Request Ratio

When selecting Keep limit-request ratio or Set limit-request ratio strategies, the system will adjust the limit according to the recommended request. In both cases, the limit will not be set lower than the original user-defined limit, if one is specified.

  • Keep limit-request ratio: Preserves the original limit/request ratio defined before automation (e.g., for original request 1000m and limit 1500m, if the recommended request is 500m, the limit will be set to 750m). If no request or limit was initially specified, the system defaults to the “Keep Limit” policy, as the original ratio is undefined.

  • Set limit-request ratio: Sets the limit as a multiplier of the recommended request according to the specified ratio (e.g., for recommended request of 500m and ratio of 2, the limit will be set to 1000m).
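The ratio arithmetic in the two examples above can be reproduced with a short sketch. The function names are hypothetical, values are in millicores, and the sketch covers only the ratio calculation itself; it does not model the original-limit floor described earlier in this section.

```python
from typing import Optional


def keep_ratio_limit(original_request: Optional[float],
                     original_limit: Optional[float],
                     recommended_request: float) -> Optional[float]:
    """Keep limit-request ratio: scale the limit by the original limit/request ratio.

    If the original request or limit is missing, the ratio is undefined and
    the original limit (possibly None) is kept unchanged.
    """
    if not original_request or not original_limit:
        return original_limit
    return recommended_request * (original_limit / original_request)


def set_ratio_limit(recommended_request: float, ratio: float) -> float:
    """Set limit-request ratio: the limit is the recommended request times the ratio."""
    return recommended_request * ratio
```

`keep_ratio_limit(1000, 1500, 500)` gives `750.0` and `set_ratio_limit(500, 2)` gives `1000`, matching the figures in the bullets above.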

Dynamic Limit Caps (Optional Boundaries)

Available in v1.29.14+

When using Keep limit-request ratio or Set limit-request ratio, optional hard bounds can be configured to prevent unlimited or runaway limit changes over time. These settings are available per resource (CPU, memory, and ephemeral storage).

Setting | Description
Minimum limit | The limit will never be set below this value, regardless of rightsizing. The limit will also never decrease below the original limit.
Maximum limit | The limit will never exceed this absolute value, regardless of resource pressure.
Max Limit Increase Factor | Caps the limit relative to the original limit. For example, a factor of 3 with an original limit of 2 GiB means the limit will never exceed 6 GiB.

When both a Maximum limit and a Max Limit Increase Factor are configured, the smaller of the two resulting values becomes the effective upper bound.

Example:

Setting | Value
Original memory limit | 2 GiB
Set limit-to-request ratio |
Max Limit Increase Factor | 3
Maximum limit | 2.5 GiB
Upper bound from factor | 3 × 2 GiB = 6 GiB
Effective upper bound | 2.5 GiB (smaller of 6 GiB and 2.5 GiB)

Because the absolute maximum limit (2.5 GiB) is smaller than the factor-derived ceiling (6 GiB), the absolute maximum limit is the effective boundary. The limit will never exceed 2.5 GiB, regardless of continued resource pressure.
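The upper-bound rule from this example can be expressed as a small sketch. The function names are hypothetical and values are in GiB. Applying the lower bound before the upper bound in `cap_limit` is an assumption made for illustration.

```python
from typing import Optional


def effective_upper_bound(original_limit: float,
                          max_limit: Optional[float] = None,
                          max_increase_factor: Optional[float] = None) -> Optional[float]:
    """Smaller of the absolute maximum and the factor-derived ceiling, if any is set."""
    candidates = []
    if max_limit is not None:
        candidates.append(max_limit)
    if max_increase_factor is not None:
        candidates.append(original_limit * max_increase_factor)
    return min(candidates) if candidates else None


def cap_limit(proposed: float,
              original_limit: float,
              min_limit: Optional[float] = None,
              max_limit: Optional[float] = None,
              max_increase_factor: Optional[float] = None) -> float:
    """Bound a proposed limit (assumed order: lower bound first, then upper bound)."""
    # The limit never drops below the original limit or the configured minimum.
    lower = original_limit if min_limit is None else max(original_limit, min_limit)
    bounded = max(proposed, lower)
    upper = effective_upper_bound(original_limit, max_limit, max_increase_factor)
    if upper is not None:
        bounded = min(bounded, upper)
    return bounded
```

With an original limit of 2 GiB, a factor of 3, and a maximum limit of 2.5 GiB, `effective_upper_bound(2.0, 2.5, 3)` returns `2.5`, so a proposed limit of 5 GiB would be capped at 2.5 GiB.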

Automation Configuration

Policy Automation

Automation Optimization Strategy

Define your optimization strategy per workload type.

Strategy | Description
Ongoing | ScaleOps continuously updates resource requests to ensure the pods get the right amount of resources at all times (recommended)
Upon pod creation | ScaleOps updates container resources on new pod creation and never changes them later

Recommended strategies by workload type:

Workload Type | Recommended Strategy
Deployment | Ongoing
StatefulSet | Upon pod creation
DaemonSet | Upon pod creation
DeploymentConfig | Ongoing
Custom Workload | Upon pod creation

In-Place Optimization

Available in v1.17.0+

Optimize your workloads with no disruption to ensure high availability at all times. View in-place documentation

  • Enable for ongoing automation strategy
  • Enable for upon pod creation automation strategy

Note: ScaleOps always optimizes workloads based on the defined automation strategy and configuration. If in-place optimization is not feasible, optimization will fall back to the defined automation strategy for each workload type.

Java Memory Optimization (Legacy)

Note: Java optimization automation is no longer controlled via policy settings. Use the dedicated Java automation controls (UI toggle, UI actions, or GitOps) instead. See Java Optimization for details.

This legacy setting enables JVM-aware memory optimization for Java workloads. For new deployments, use the dedicated Java automation controls described in the Java Optimization documentation.

Option | Recommendation
Enabled | Legacy - use Java automation controls instead
Disabled | (default)

Zero Downtime Rollout Strategies

Optimize workloads by first creating new optimized pods, ensuring high availability with no downtime while respecting the workload’s availability.

Option | Recommendation
Workloads with a single replica | Enabled (recommended)
Workloads with multiple replicas | Disabled

Ensure High Availability

ScaleOps considers the workload’s rollout strategy, PDBs, and unevictable annotations, and ensures a minimum of 1 replica at all times.

Option | Recommendation
Ensure minimum of 1 replica | Enabled (recommended)
Respect unevictable pods by annotation | Enabled (recommended)

Readiness period buffer

Define a time buffer for pods to become ready, in addition to the existing workload readiness probe.

Option | Recommendation
5 seconds | (recommended)

Update HPA Resource-Based Triggers

Update HPA CPU and memory utilization-based triggers according to the new container request sizes.

Option | Recommendation
Enabled | (recommended)
Disabled | -

Actively enforce optimization according to context

Actively enforce optimization based on node stress, noisy neighbors, and application context.

Optimization Upon Automation

Allow ScaleOps to perform a rollout as soon as a workload is automated.

Option | Recommendation
Enabled | (recommended)
Disabled | -

Allowed Rollout Period

Determine when ScaleOps is allowed to optimize workloads.

Default: Days: S-M-T-W-T-F-S, from 00:00 to 23:59
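A rollout-period check of this kind can be sketched as follows. The function and parameter names are hypothetical, the defaults mirror the default shown above (every day, 00:00 to 23:59), and windows that cross midnight are not handled in this sketch.

```python
from datetime import datetime, time

ALL_DAYS = frozenset(range(7))  # datetime.weekday(): Monday is 0, Sunday is 6


def in_rollout_period(now: datetime,
                      allowed_days=ALL_DAYS,
                      start: time = time(0, 0),
                      end: time = time(23, 59)) -> bool:
    """Return True if `now` falls on an allowed day within the start-end window."""
    if now.weekday() not in allowed_days:
        return False
    # Compare at minute resolution so 23:59:30 still counts as 23:59.
    current = now.time().replace(second=0, microsecond=0)
    return start <= current <= end
```

Restricting `allowed_days` to `{0, 1, 2, 3, 4}` with a window of 01:00 to 05:00 would, for instance, limit rollouts to early weekday mornings.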

Required Window Coverage

Define the percentage of data points in the history window that is required before changes are applied.

Note: Low coverage will delay ScaleOps in applying optimizations.

Percentage | Recommendation
2% | (recommended)

Range: 0% - 100%
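A coverage check can be sketched like this. The function names are hypothetical, and the number of expected data points is assumed to be the history window divided by the sampling interval; how ScaleOps actually counts data points is not specified on this page.

```python
def window_coverage_pct(observed_points: int, expected_points: int) -> float:
    """Percentage of expected data points that were actually observed."""
    if expected_points <= 0:
        raise ValueError("expected_points must be positive")
    return 100 * observed_points / expected_points


def has_required_coverage(observed_points: int,
                          expected_points: int,
                          required_pct: float = 2.0) -> bool:
    """Return True once coverage reaches the required percentage."""
    return window_coverage_pct(observed_points, expected_points) >= required_pct
```

For a 24-hour window sampled once per minute (1440 expected points), 30 observed points give roughly 2.08% coverage and satisfy the 2% requirement, while 20 points (about 1.39%) do not.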

Scheduling Configuration

Bin-Pack Unevictable Pods

Automatically bin-pack unevictable pods to allow nodes to be more consolidated, further reducing costs.

Option | Recommendation
Enabled | (recommended)
Disabled | -