Policies

ScaleOps comes with several out-of-the-box production-ready policies that cover the vast majority of use cases.

ScaleOps ensures stability by not evicting certain pods under the following conditions:

Pod Annotations: Pods annotated with cluster-autoscaler.kubernetes.io/safe-to-evict: "false", karpenter.sh/do-not-evict: "true", or karpenter.sh/do-not-disrupt: "true" are exempt from eviction.
Pod Disruption Budgets (PDBs): Before attempting an eviction, ScaleOps checks the currently applied Pod Disruption Budgets and confirms that they allow at least one pod to be removed safely.
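
The annotation condition can be sketched as a simple lookup. This is a minimal illustration, not ScaleOps' implementation: the helper name `is_exempt_from_eviction` and the plain dictionary standing in for a pod's annotations are assumptions made for the example. Only the three annotation keys and values come from the list above.

```python
# Annotation key/value pairs that mark a pod as not evictable (from the list above).
NO_EVICT_ANNOTATIONS = {
    "cluster-autoscaler.kubernetes.io/safe-to-evict": "false",
    "karpenter.sh/do-not-evict": "true",
    "karpenter.sh/do-not-disrupt": "true",
}


def is_exempt_from_eviction(annotations: dict) -> bool:
    """Return True if any exempting annotation is set to its blocking value.

    Hypothetical helper: `annotations` stands in for a pod's metadata.annotations map.
    """
    return any(
        annotations.get(key) == value
        for key, value in NO_EVICT_ANNOTATIONS.items()
    )
```

Note that the value matters as well as the key: a pod annotated with safe-to-evict set to "true" is not exempt in this sketch.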

Ready-to-Deploy Policies

ScaleOps learns your workload’s behavior and automatically applies the best-fitting policy.

You can also manually select a policy for your workload. The following policies are created by default:

Production
High-Availability
Cost
Batch
Daemonsets
Weekly-Optimization

The following are only created if you have a corresponding workload type:

System
Java
Spark
Flink
High-Replica
Prometheus

Note: The policies are created in the scaleops-system namespace.

Production

This policy allocates additional capacity to accommodate potential activity spikes and minimize disruptions to ensure consistent workload performance. The workload will be dynamically adjusted to handle unexpected increases in activity.

High-Availability

For workloads that require extra availability guarantees. This policy sizes workloads with additional resources based on a longer history window to account for unexpected spikes in activity.

Recommended for: Infrastructure components such as Kafka, RabbitMQ, databases, etc.

Cost

A cost-effective and dynamic policy suited for non-production environments and workloads.

Batch

For workloads that require a policy with a longer history window to gather detailed insights, including for short-lived and scheduled workloads. Recommendations are based on extended history windows, and optimizations are applied during pod creation.

Recommended for: Batch workloads such as Jobs, CronJobs, GitLab Runners, and other scheduled workloads.

System

For system workloads that require additional availability guarantees, similar to the high-availability policy. It’s important to avoid any workload disruptions. Recommendations are based on an extended history window and high-percentile data. These recommendations are applied only after sufficient data coverage is achieved, and optimizations are not applied during the initial automation phase.

Recommended for: kube-system workloads

Weekly-Optimization

For workloads that require optimization for two different time periods, weekdays and weekends. This policy optimizes them based on historical data from the past week, using separate samples for weekdays and weekends.

Recommended for: Workloads with different usage patterns during weekdays and weekends
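
The idea of keeping separate samples for weekdays and weekends can be sketched as follows. This is an illustration only: the function name and the `(timestamp, value)` sample shape are hypothetical, and treating Saturday and Sunday as the weekend is an assumption, since the text above does not define which days count as the weekend.

```python
from datetime import datetime


def split_weekday_weekend(samples):
    """Split (timestamp, value) samples into weekday and weekend value lists.

    Assumes the weekend is Saturday and Sunday (an assumption of this sketch).
    """
    weekdays, weekends = [], []
    for timestamp, value in samples:
        # datetime.weekday(): Monday is 0 ... Sunday is 6
        if timestamp.weekday() >= 5:
            weekends.append(value)
        else:
            weekdays.append(value)
    return weekdays, weekends
```

Each of the two lists could then be analyzed on its own, so that quiet weekends do not pull down the recommendation used during busy weekdays.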

Daemonsets

For DaemonSets that have a large number of replicas that vary in size.

Recommended for: DaemonSet workloads

Java

A policy optimized for Java workloads that accounts for JVM overhead and startup requirements. ScaleOps automatically detects Java workloads and assigns this policy.

Recommended for: Java applications, Spring Boot services, microservices running on JVM

Prometheus

A policy for Prometheus workloads that require stable performance. Recommendations are based on high-percentile usage (p97) over an extended history window (96h). The policy applies moderate headroom (20%) and defined minimum resource buffers (100m CPU, 500Mi memory) to maintain reliability while avoiding over-provisioning.

Recommended for: Prometheus, Thanos, and other monitoring stack workloads

Policy Deep-Dive

For more advanced use cases, you can customize the policies.

Request Configuration

Policy Request

History Window

This parameter defines the length of the historical window used for analyzing usage data.

Duration	Use Case
`12 hours`	For cost efficiency
`1 day` - `2 days`	For stable recommendations (recommended)
`4 days` - `7 days`	For very stable recommendations

Request Headroom

The headroom parameter represents the percentage of resources to keep in reserve for your workload.

Percentage	Use Case
`0%` - `5%`	For cost efficiency
`5%` - `10%`	For production workloads (recommended)
`10%+`	For extra capacity

Histogram Request Percentile

The percentile setting determines how closely our platform should track the resource usage of the optimized workload.

Percentile	Use Case
`80%` - `90%`	For cost efficiency
`90%` - `95%`	For stable and cost-effective recommendations
`95%` - `100%`	For critical and burstable workloads that need extra capacity
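
The request parameters in this section (history window, headroom, percentile) can be read together as simple arithmetic. The sketch below is an illustration under assumptions, not ScaleOps' actual algorithm: it assumes the samples passed in already fall inside the history window, uses a nearest-rank percentile, and applies headroom as a multiplier on top. The names `percentile` and `recommend_request` are hypothetical.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples in the history window")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend_request(samples, pct, headroom_pct):
    """Take the chosen usage percentile and add the headroom percentage on top."""
    return percentile(samples, pct) * (1 + headroom_pct / 100)
```

For example, with usage samples of 100m, 200m, 300m, and 400m, a 90% percentile, and 10% headroom, this sketch picks 400m as the percentile value and recommends roughly 440m. A lower percentile or smaller headroom tracks usage more tightly and costs less; higher values leave more spare capacity.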

Burst Reaction

ScaleOps analyzes historical data and reacts immediately to rapid bursts that require a resource update.

Option	Recommendation
`Enabled`	(recommended)
`Disabled`

Auto Healing

ScaleOps will automatically heal and recover pods suffering from CPU stress, out-of-memory issues, and pods with liveness probe errors due to lack of resources.

Option	Recommendation
`Enabled`	(recommended)
`Disabled`

Minimum Resource Request Boundaries

Determine the minimum CPU and memory boundaries for requests.

Resource	Recommended Value
CPU	`0.01`
Memory (GiB)	`0.02`

Maximum Resource Request Boundaries

Determine the maximum CPU and memory boundaries for a workload.

Resource	Value
CPU	`0.0`
Memory (GiB)	`0.0`

Set maximum according to max node size: ScaleOps will automatically set the maximum resource boundaries while considering the node capacity.

Option	Recommendation
`Enabled`	(recommended)
`Disabled`
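
A minimal sketch of how minimum and maximum boundaries might clamp a recommendation. The function and parameter names are hypothetical, and using `None` to mean "not set" is an assumption for the example; the text above does not specify how an unset maximum is represented.

```python
from typing import Optional


def clamp_request(
    recommended: float,
    minimum: float,
    maximum: Optional[float] = None,
    node_capacity: Optional[float] = None,
) -> float:
    """Clamp a recommended request between the configured boundaries.

    `maximum` and `node_capacity` are optional upper bounds (None = not set,
    an assumption of this sketch). When both are given, the smaller one wins.
    """
    value = max(recommended, minimum)
    upper_bounds = [b for b in (maximum, node_capacity) if b is not None]
    if upper_bounds:
        value = min(value, min(upper_bounds))
    return value
```

With a minimum of 0.01 CPU, a recommendation of 0.005 is raised to 0.01, and a recommendation of 12 on a node with 8 allocatable CPUs is capped at 8.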

Keep Original Requests

Prevent ScaleOps from changing the original resource requests.

Option	Recommendation
`Enabled`
`Disabled`	(recommended)

Integer CPU

Force CPU recommendations to integer values by rounding up the value calculated from workload demand and trends.

Option	Recommendation
`Enabled`
`Disabled`	(recommended)
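
As a small illustration of the rounding described above, the sketch below rounds a CPU value expressed in millicores up to the next whole core. The function name is hypothetical, and working in millicores is an assumption of the example.

```python
def round_up_to_whole_cores(millicores: int) -> int:
    """Round a CPU value in millicores up to the next multiple of 1000m (one core)."""
    # Negated floor division is an integer-only way to compute a ceiling.
    return -(-millicores // 1000) * 1000
```

A recommendation of 1200m becomes 2000m, while a value that is already a whole number of cores, such as 2000m, is left unchanged.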

Memory replicas percentile

Recommend memory requests based on a user-defined usage percentile aggregated from all replicas. 100% (default) will use all replicas to determine memory usage.

Init Containers Optimization

Enable or disable automatic optimization of init container resource requests. Learn more about Init Containers Optimization.

Option	Recommendation
`Enabled`	(recommended)
`Disabled`

Ephemeral Storage

The Ephemeral Storage sub-section appears under Request when ephemeral storage optimization is enabled. Learn more about Ephemeral Storage Optimization.

Ephemeral Storage Policy

Specific ephemeral storage configuration:

Setting	Description	Default
Optimization Toggle	Enable or disable ephemeral storage optimization for workloads in this policy	Enabled
Auto-healing	Automatically adjusts recommendations when pods are evicted due to disk pressure, preventing repeated evictions	Enabled

Note: The “Max by node” toggle in the General section also applies to ephemeral storage, capping recommendations to the node’s allocatable ephemeral storage.

Limit Configuration

Policy Limit

Limit Strategy

Strategy	Description
Keep original limit (Recommended)	Retains the workload’s initial CPU, memory, or ephemeral storage limit
Set no limit	Removes the resource limit. Recommended if CFS quota is disabled cluster-wide
Set limit as request	Aligns the resource limit with the recommended resource request, ensuring that the pod’s limit and request are set to the same value
Set limit	Manually define a specific CPU or memory limit, overriding the original limit
Keep limit-request ratio	Maintains the original ratio between limit and request, to preserve the proportional balance originally set
Set limit-request ratio	Allows setting a custom ratio between limit and request

When ephemeral storage optimization is enabled, the Limit section includes an ephemeral storage limit strategy (keep original limit, set a limit-to-request ratio, etc.), configurable like CPU and memory.

Limit-Request Ratio

When the Keep limit-request ratio or Set limit-request ratio strategy is selected, the system adjusts the limit according to the recommended request. In both cases, the limit will not be set lower than the original user-defined limit, if one is specified.

Keep limit-request ratio: Preserves the original limit/request ratio defined before automation (e.g., for original request 1000m and limit 1500m, if the recommended request is 500m, the limit will be set to 750m). If no request or limit was initially specified, the system defaults to the “Keep original limit” strategy, as the original ratio is undefined.
Set limit-request ratio: Sets the limit as a multiplier of the recommended request according to the specified ratio (e.g., for recommended request of 500m and ratio of 2, the limit will be set to 1000m).
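
The ratio arithmetic in the two examples above can be written out as a short sketch. The function names are hypothetical, values are in millicores, and the code covers only the ratio calculation itself: it does not model any other boundary or fallback that ScaleOps may apply.

```python
def keep_limit_request_ratio(original_request, original_limit, recommended_request):
    """Scale the limit so that limit/request stays at the original ratio."""
    if not original_request or not original_limit:
        # Without both original values the ratio is undefined.
        raise ValueError("original ratio is undefined")
    ratio = original_limit / original_request
    return recommended_request * ratio


def set_limit_request_ratio(recommended_request, ratio):
    """Set the limit as a fixed multiple of the recommended request."""
    return recommended_request * ratio
```

With the numbers from the text, `keep_limit_request_ratio(1000, 1500, 500)` yields 750 (the original ratio of 1.5 applied to 500m), and `set_limit_request_ratio(500, 2)` yields 1000.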

Dynamic Limit Caps (Optional Boundaries)

Available in v1.29.14+

When using Keep limit-request ratio or Set limit-request ratio, optional hard bounds can be configured to prevent unlimited or runaway limit changes over time. These settings are available per resource (CPU, memory, and ephemeral storage).

Setting	Description
Minimum limit	The limit will never be set below this value, regardless of rightsizing. The limit will also never decrease below the original limit.
Maximum limit	The limit will never exceed this absolute value, regardless of resource pressure.
Max Limit Increase Factor	Caps the limit relative to the original limit. For example, a factor of 3 with an original limit of 2 GiB means the limit will never exceed 6 GiB.

When both a Maximum limit and a Max Limit Increase Factor are configured, the smaller of the two resulting values becomes the effective upper bound.

Example:

Parameter	Value
Original memory limit	2 GiB
Set limit-to-request ratio	2×
Max Limit Increase Factor	3
Maximum limit	2.5 GiB
Upper bound from factor	3 × 2 GiB = 6 GiB
Effective upper bound	2.5 GiB (smaller of 6 GiB and 2.5 GiB)

Because the absolute maximum limit (2.5 GiB) is smaller than the factor-derived ceiling (6 GiB), the absolute maximum limit is the effective boundary. The limit will never exceed 2.5 GiB, regardless of continued resource pressure.
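
The same calculation as a minimal sketch. The function names are hypothetical, `None` is used to mean "not configured" (an assumption), and the 4 GiB proposed limit in the usage lines is an illustrative value, not part of the example table.

```python
from typing import Optional


def effective_upper_bound(
    original_limit: float,
    max_limit: Optional[float] = None,
    max_increase_factor: Optional[float] = None,
) -> Optional[float]:
    """Return the smaller of the configured upper bounds, or None if neither is set."""
    bounds = []
    if max_limit is not None:
        bounds.append(max_limit)
    if max_increase_factor is not None:
        bounds.append(max_increase_factor * original_limit)
    return min(bounds) if bounds else None


def cap_limit(proposed_limit: float, upper_bound: Optional[float]) -> float:
    """Apply the upper bound to a proposed limit (no-op when no bound is set)."""
    return proposed_limit if upper_bound is None else min(proposed_limit, upper_bound)


# Values from the example above, in GiB.
bound = effective_upper_bound(original_limit=2, max_limit=2.5, max_increase_factor=3)
print(bound)                  # 2.5
print(cap_limit(4.0, bound))  # 2.5
```

If only the Max Limit Increase Factor were configured, the same call would return 6, the factor-derived ceiling.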

Automation Configuration

Policy Automation

Automation Optimization Strategy

Define your optimization strategy per workload type.

Strategy	Description
Ongoing	ScaleOps continuously updates resource requests to ensure the pods get the right amount of resources at all times (recommended)
Upon pod creation	ScaleOps updates container resources on new pod creation and never changes them later

Recommended strategies by workload type:

Workload Type	Recommended Strategy
Deployment	`Ongoing`
StatefulSet	`Upon pod creation`
DaemonSet	`Upon pod creation`
DeploymentConfig	`Ongoing`
Custom Workload	`Upon pod creation`

In-Place Optimization

Available in v1.17.0+

Optimize your workloads with no disruption to ensure high availability at all times. View in-place documentation

Enable for ongoing automation strategy
Enable for upon pod creation automation strategy

Note: ScaleOps always optimizes workloads based on the defined automation strategy and configuration. If in-place optimization is not feasible, optimization will fall back to the defined automation strategy for each workload type.

Java Memory Optimization (Legacy)

Note: Java optimization automation is no longer controlled via policy settings. Use the dedicated Java automation controls (UI toggle, UI actions, or GitOps) instead. See Java Optimization for details.

This legacy setting enables JVM-aware memory optimization for Java workloads. For new deployments, use the dedicated Java automation controls described in the Java Optimization documentation.

Option	Recommendation
`Enabled`	Legacy - use Java automation controls instead
`Disabled`	(default)

Zero Downtime Rollout Strategies

Optimize workloads by first creating new optimized pods, ensuring high availability with no downtime while respecting the workload’s availability.

Option	Recommendation
`Workloads with a single replica`	`Enabled` (recommended)
`Workloads with multiple replicas`	`Disabled`

Ensure High Availability

ScaleOps optimization considers the workload’s rollout strategy, PDBs, and unevictable annotations, and ensures a minimum of 1 replica at all times.

Option	Recommendation
`Ensure minimum of 1 replica`	`Enabled` (recommended)
`Respect unevictable pods by annotation`	`Enabled` (recommended)

Readiness period buffer

Define a time buffer for pods to become ready, in addition to the existing workload readiness probe.

Option	Recommendation
`5 seconds`	(recommended)

Update HPA Resource-Based Triggers

Update HPA CPU and memory utilization-based triggers according to the new container request sizes.

Option	Recommendation
`Enabled`	(recommended)
`Disabled`
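
An HPA utilization target is a percentage of the container request, so changing the request changes the absolute usage at which the HPA scales. The sketch below shows one way such a trigger could be rescaled so that the absolute threshold stays the same. Preserving the absolute threshold is an assumption made for illustration; the text above does not state the exact formula ScaleOps uses, and the function name is hypothetical.

```python
def rescale_utilization_target(target_pct, old_request, new_request):
    """Rescale a utilization target so the absolute usage threshold is unchanged.

    old threshold = target_pct% of old_request; the returned percentage of
    new_request equals that same absolute amount.
    """
    if new_request <= 0:
        raise ValueError("new_request must be positive")
    return target_pct * old_request / new_request
```

For example, an 80% target on a 1000m request scales at 800m of usage. If the request is reduced to 500m, the same 800m threshold corresponds to a 160% target in this sketch.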

Actively enforce optimization according to context

Actively enforce optimization according to the context of node stress, noisy neighbors, and the application.

Optimization Upon Automation

Allow ScaleOps to trigger a rollout when a workload is automated.

Option	Recommendation
`Enabled`	(recommended)
`Disabled`

Allowed Rollout Period

Determine when ScaleOps is allowed to optimize workloads.

Default: all days (S-M-T-W-T-F-S), from 00:00 to 23:59
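
A rollout period of this kind amounts to a day-of-week and time-of-day check. The sketch below is illustrative only: the function name and parameters are hypothetical, the defaults mirror the default period stated above, and windows that cross midnight are not handled.

```python
from datetime import datetime, time

ALL_DAYS = frozenset(range(7))  # datetime.weekday(): Monday=0 ... Sunday=6


def in_rollout_window(now, allowed_days=ALL_DAYS, start=time(0, 0), end=time(23, 59)):
    """Return True if `now` falls on an allowed day and within [start, end].

    The time is compared at minute resolution so that 23:59:30 still counts as
    inside a window ending at 23:59.
    """
    minute = now.time().replace(second=0, microsecond=0)
    return now.weekday() in allowed_days and start <= minute <= end
```

Restricting `allowed_days` to `{0, 1, 2, 3, 4}` and the times to 02:00 through 04:00 would, in this sketch, permit rollouts only during a weekday night window.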

Required Window Coverage

Define the percentage of data points in the history window that is required before changes are applied.

Note: ScaleOps delays applying optimizations until the required coverage is reached.

Percentage	Recommendation
`2%`	(recommended)

Range: 0% - 100%
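
Coverage can be thought of as the share of expected data points that were actually collected within the history window. A minimal sketch, assuming a fixed sampling interval; the function names are hypothetical, and the 1-minute interval in the usage note is an assumption rather than a documented ScaleOps value.

```python
def window_coverage_pct(observed_points: int, expected_points: int) -> float:
    """Percentage of the expected data points that were actually observed."""
    if expected_points <= 0:
        raise ValueError("expected_points must be positive")
    return 100 * observed_points / expected_points


def has_required_coverage(observed_points, expected_points, required_pct=2):
    """Return True once the observed coverage reaches the required percentage."""
    return window_coverage_pct(observed_points, expected_points) >= required_pct
```

With a 1-day history window sampled once per minute (1440 expected points), the default 2% requirement is met in this sketch once 29 data points have been collected; 28 points are not yet enough.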

Scheduling Configuration

Bin-Pack Unevictable Pods

Automatically bin-pack unevictable pods to allow nodes to be more consolidated, further reducing costs.

Option	Recommendation
`Enabled`	(recommended)
`Disabled`