Policies
ScaleOps comes with several out-of-the-box production-ready policies that cover the vast majority of use cases.
ScaleOps ensures stability by not evicting certain pods under the following conditions:
- Pod Annotations: Pods annotated with cluster-autoscaler.kubernetes.io/safe-to-evict: "false", karpenter.sh/do-not-evict: "true", or karpenter.sh/do-not-disrupt: "true" are exempt from eviction
- Pod Disruption Budgets (PDBs): Before attempting eviction, ScaleOps respects currently applied Pod Disruption Budgets, confirming that at least one pod can be safely removed as per PDB allowances
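The two conditions above can be sketched as a small check. This is an illustrative sketch only, not ScaleOps code: the function names are hypothetical, pods are represented by a plain annotations dictionary, and each matching PDB is represented by its status.disruptionsAllowed value.

```python
# Annotation keys and blocking values are taken from the list above.
PROTECTED_ANNOTATIONS = {
    "cluster-autoscaler.kubernetes.io/safe-to-evict": "false",
    "karpenter.sh/do-not-evict": "true",
    "karpenter.sh/do-not-disrupt": "true",
}


def is_exempt_by_annotation(annotations):
    """Return True if any protective annotation is set to its blocking value."""
    return any(
        annotations.get(key) == value
        for key, value in PROTECTED_ANNOTATIONS.items()
    )


def can_evict(annotations, pdb_disruptions_allowed):
    """Hypothetical eviction check.

    pdb_disruptions_allowed holds the status.disruptionsAllowed value of
    every PDB that selects the pod (an empty list means no PDB applies).
    """
    if is_exempt_by_annotation(annotations):
        return False
    # Every matching PDB must allow at least one disruption.
    return all(allowed >= 1 for allowed in pdb_disruptions_allowed)


print(can_evict({}, []))                                       # True
print(can_evict({"karpenter.sh/do-not-disrupt": "true"}, []))  # False
print(can_evict({}, [0]))                                      # False
```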
Ready-to-Deploy Policies
ScaleOps learns and understands your workload’s behavior and automatically applies the best policy for your workload.
You can also manually select a policy for your workload. The following policies are created by default:
The following are only created if you have a corresponding workload type:
- System
- Java
- Spark
- Flink
- High-Replica
- Prometheus
Note: The policies are created in the scaleops-system namespace.
Production
This policy allocates additional capacity to accommodate potential activity spikes and minimize disruptions to ensure consistent workload performance. The workload will be dynamically adjusted to handle unexpected increases in activity.
High-Availability
For workloads that require extra availability guarantees. This policy sizes workloads with additional resources based on a longer history window to account for unexpected spikes in activity.
Recommended for: Infrastructure components such as Kafka, RabbitMQ, databases, etc.
Cost
A cost-effective and dynamic policy suited for non-production environments and workloads.
Batch
For workloads that require a policy with a longer history window to gather detailed insights, including for short-lived and scheduled workloads. Recommendations are based on extended history windows, and optimizations are applied during pod creation.
Recommended for: Batch workloads like jobs, CronJobs, GitLab Runners and other scheduled workloads.
System
For system workloads that require additional availability guarantees, similar to the high-availability policy. It’s important to avoid any workload disruptions. Recommendations are based on an extended history window and high-percentile data. These recommendations are applied only after sufficient data coverage is achieved, and optimizations are not applied during the initial automation phase.
Recommended for: kube-system workloads
Weekly-Optimization
For workloads that require optimization for two different time periods, weekdays and weekends. This policy optimizes them based on historical data from the past week, using separate samples for weekdays and weekends.
Recommended for: Workloads with different usage patterns during weekdays and weekends
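To illustrate the idea of separate weekday and weekend samples, here is a minimal sketch. It assumes that the weekend means Saturday and Sunday and that samples are (timestamp, usage) pairs. Both are assumptions for illustration, and the function name is hypothetical.

```python
from datetime import datetime


def split_by_day_type(samples):
    """Split (timestamp, usage) samples into weekday and weekend groups."""
    weekday, weekend = [], []
    for ts, usage in samples:
        # datetime.weekday(): Monday is 0, Saturday is 5, Sunday is 6.
        (weekend if ts.weekday() >= 5 else weekday).append(usage)
    return weekday, weekend


samples = [
    (datetime(2024, 1, 5, 12), 800),  # Friday
    (datetime(2024, 1, 6, 12), 200),  # Saturday
    (datetime(2024, 1, 7, 12), 150),  # Sunday
    (datetime(2024, 1, 8, 12), 900),  # Monday
]
weekday_usage, weekend_usage = split_by_day_type(samples)
print(weekday_usage)  # [800, 900]
print(weekend_usage)  # [200, 150]
```

Each group could then be sized independently, so quiet weekends do not pull weekday recommendations down.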
Daemonsets
For DaemonSets that have a large number of replicas and vary in size.
Recommended for: DaemonSet workloads
Java
A policy optimized for Java workloads that accounts for JVM overhead and startup requirements. ScaleOps automatically detects Java workloads and assigns this policy.
Recommended for: Java applications, Spring Boot services, microservices running on JVM
Prometheus
A policy for Prometheus workloads that require stable performance. Recommendations are based on high-percentile usage (p97) over an extended history window (96h). The policy applies moderate headroom (20%) and defined minimum resource buffers (100m CPU, 500Mi memory) to maintain reliability while avoiding over-provisioning.
Recommended for: Prometheus, Thanos, and other monitoring stack workloads
Policy Deep-Dive
For more advanced use cases, you can customize the policies.
Request Configuration

History Window
This parameter defines the length of the historical window used for analyzing usage data.
| Duration | Use Case |
|---|---|
| 12 hours | For cost efficiency |
| 1 day - 2 days | For stable recommendations (recommended) |
| 4 days - 7 days | For very stable recommendations |
Request Headroom
The headroom parameter represents the percentage of resources to keep in reserve for your workload.
| Percentage | Use Case |
|---|---|
| 0% - 5% | For cost efficiency |
| 5% - 10% | For production workloads (recommended) |
| 10%+ | For extra capacity |
Histogram Request Percentile
The percentile setting determines how closely our platform should track the resource usage of the optimized workload.
| Percentile | Use Case |
|---|---|
| 80% - 90% | For cost efficiency |
| 90% - 95% | For stable and cost-effective recommendations |
| 95% - 100% | For critical and burstable workloads that need extra capacity |
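The following sketch shows one way the percentile and headroom settings could combine into a request value. It is not the ScaleOps algorithm: it assumes a nearest-rank percentile over the usage samples collected in the history window, with headroom added on top. The function names and sample data are hypothetical.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list (0 < pct <= 100)."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend_request(samples, pct, headroom_pct):
    """Hypothetical recommendation: usage percentile plus headroom."""
    return percentile(samples, pct) * (100 + headroom_pct) / 100


# CPU usage in millicores observed during the history window (made-up data).
cpu_millicores = [100, 120, 110, 130, 400, 125, 115, 105, 135, 140]

print(recommend_request(cpu_millicores, pct=90, headroom_pct=10))   # 154.0
print(recommend_request(cpu_millicores, pct=100, headroom_pct=10))  # 440.0
```

In this data the single 400m spike is ignored at the 90th percentile but drives the result at the 100th, which is the trade-off between cost efficiency and extra capacity that the tables describe.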
Burst Reaction
ScaleOps analyzes historical data and reacts immediately to rapid bursts that require a resource update.
| Option | Recommendation |
|---|---|
| Enabled | (recommended) |
| Disabled | |
Auto Healing
ScaleOps will automatically heal and recover pods suffering from CPU stress, out-of-memory issues, and pods with liveness probe errors due to lack of resources.
| Option | Recommendation |
|---|---|
| Enabled | (recommended) |
| Disabled | |
Minimum Resource Request Boundaries
Determine the minimum CPU and memory boundaries for requests.
| Resource | Recommended Value |
|---|---|
| CPU | 0.01 |
| Memory (GiB) | 0.02 |
Maximum Resource Request Boundaries
Determine the maximum CPU and memory boundaries for a workload.
| Resource | Value |
|---|---|
| CPU | 0.0 |
| Memory (GiB) | 0.0 |
Set maximum according to max node size: ScaleOps will automatically set the maximum resource boundaries while considering the node capacity.
| Option | Recommendation |
|---|---|
| Enabled | (recommended) |
| Disabled | |
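The minimum and maximum boundaries behave like a clamp on the recommended request. The sketch below is illustrative only: the function and parameter names are hypothetical, and treating a missing maximum as "no cap" is an assumption, not documented behavior.

```python
def clamp_request(recommended, minimum, maximum=None, node_allocatable=None):
    """Clamp a recommended request into [minimum, effective maximum].

    maximum and node_allocatable are optional; None means "no cap" in this
    sketch (an assumption, not documented ScaleOps behavior).
    """
    caps = [cap for cap in (maximum, node_allocatable) if cap is not None]
    value = max(recommended, minimum)
    if caps:
        # The tighter of the configured maximum and the node capacity wins.
        value = min(value, min(caps))
    return value


# CPU values in cores.
print(clamp_request(0.005, minimum=0.01))                      # 0.01
print(clamp_request(20, minimum=0.01, node_allocatable=15.5))  # 15.5
print(clamp_request(2, minimum=0.01, maximum=4))               # 2
```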
Keep Original Requests
Prevent ScaleOps from changing the original resource requests.
| Option | Recommendation |
|---|---|
| Enabled | |
| Disabled | (recommended) |
Integer CPU
Enforce CPU recommendations to be integer values by rounding up after calculating based on workload demand and trends.
| Option | Recommendation |
|---|---|
| Enabled | |
| Disabled | (recommended) |
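As a small illustration of the rounding step, assuming the recommendation is expressed in cores (the function name is hypothetical):

```python
import math


def to_integer_cpu(recommended_cores):
    """Round a fractional CPU recommendation up to the next whole core."""
    return math.ceil(recommended_cores)


print(to_integer_cpu(1.2))  # 2
print(to_integer_cpu(3.0))  # 3
```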
Memory replicas percentile
Recommend memory requests based on a user-defined usage percentile aggregated from all replicas. 100% (default) will use all replicas to determine memory usage.
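A minimal sketch of aggregating memory usage across replicas follows. It assumes a nearest-rank percentile, which is an illustrative choice rather than the documented method. The function name and data are hypothetical.

```python
import math


def memory_across_replicas(replica_usage_mib, pct=100):
    """Nearest-rank percentile of per-replica memory usage (0 < pct <= 100)."""
    ordered = sorted(replica_usage_mib)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Memory usage in MiB for five replicas; one replica is an outlier.
replicas = [510, 495, 530, 2048, 505]

print(memory_across_replicas(replicas))          # 2048
print(memory_across_replicas(replicas, pct=80))  # 530
```

With the default of 100% the outlier replica determines the result, while a lower percentile leaves it out.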
Init Containers Optimization
Enable or disable automatic optimization of init container resource requests. Learn more about Init Containers Optimization.
| Option | Recommendation |
|---|---|
| Enabled | (recommended) |
| Disabled | |
Ephemeral Storage
The Ephemeral Storage sub-section appears under Request when ephemeral storage optimization is enabled. Learn more about Ephemeral Storage Optimization.

Specific ephemeral storage configuration:
| Setting | Description | Default |
|---|---|---|
| Optimization Toggle | Enable or disable ephemeral storage optimization for workloads in this policy | Enabled |
| Auto-healing | Automatically adjusts recommendations when pods are evicted due to disk pressure, preventing repeated evictions | Enabled |
Note: The “Max by node” toggle in the General section also applies to ephemeral storage, capping recommendations to the node’s allocatable ephemeral storage.
Limit Configuration

Limit Strategy
| Strategy | Description |
|---|---|
| Keep original limit (Recommended) | Retains the workload’s initial CPU, memory, or ephemeral storage limit |
| Set no limit | Removes resource limit. Recommended if CFS quota is disabled cluster-wide |
| Set limit as request | Aligns the resource limit with the recommended resource request, ensuring that the pod’s limit and request are set to the same value |
| Set Limit | Manually define a specific CPU or memory limit, overriding the original limit |
| Keep limit-request ratio | Maintains the original ratio between limit and request, to preserve the proportional balance originally set |
| Set limit-request ratio | Allows setting a custom ratio between limit and request |
When ephemeral storage optimization is enabled, the Limit section includes an ephemeral storage limit strategy (keep original limit, set a limit-to-request ratio, etc.), configurable like CPU and memory.
Limit-Request Ratio
When selecting Keep limit-request ratio or Set limit-request ratio strategies, the system will adjust the limit according to the recommended request. In both cases, the limit will not be set lower than the original user-defined limit, if one is specified.
- Keep limit-request ratio: Preserves the original limit/request ratio defined before automation (e.g., for original request 1000m and limit 1500m, if the recommended request is 500m, the limit will be set to 750m). If no request or limit was initially specified, the system defaults to the “Keep Limit” policy, as the original ratio is undefined.
- Set limit-request ratio: Sets the limit as a multiplier of the recommended request according to the specified ratio (e.g., for recommended request of 500m and ratio of 2, the limit will be set to 1000m).
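The ratio arithmetic in the two examples above can be reproduced with the sketch below. The function names are hypothetical, values are in millicores, and the sketch covers only the ratio calculation: the floor at the original limit described above is deliberately left out so that the numbers match the examples.

```python
def keep_ratio_limit(orig_request, orig_limit, new_request):
    """Scale the limit so that limit/request stays at its original value."""
    if not orig_request or not orig_limit:
        # Ratio undefined: keep whatever limit (if any) was set originally.
        return orig_limit
    return new_request * orig_limit / orig_request


def set_ratio_limit(new_request, ratio):
    """Set the limit to a fixed multiple of the recommended request."""
    return new_request * ratio


print(keep_ratio_limit(1000, 1500, 500))  # 750.0
print(set_ratio_limit(500, 2))            # 1000
```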
Dynamic Limit Caps (Optional Boundaries) Available in v1.29.14+
When using Keep limit-request ratio or Set limit-request ratio, optional hard bounds can be configured to prevent unlimited or runaway limit changes over time. These settings are available per resource (CPU, memory, and ephemeral storage).
| Setting | Description |
|---|---|
| Minimum limit | The limit will never be set below this value, regardless of rightsizing. The limit will also never decrease below the original limit. |
| Maximum limit | The limit will never exceed this absolute value, regardless of resource pressure. |
| Max Limit Increase Factor | Caps the limit relative to the original limit. For example, a factor of 3 with an original limit of 2 GiB means the limit will never exceed 6 GiB. |
When both a Maximum limit and a Max Limit Increase Factor are configured, the smaller of the two resulting values becomes the effective upper bound.
Example:
| Setting | Value |
|---|---|
| Original memory limit | 2 GiB |
| Set limit-to-request ratio | 2× |
| Max Limit Increase Factor | 3 |
| Maximum limit | 2.5 GiB |
| Upper bound from factor | 3 × 2 GiB = 6 GiB |
| Effective upper bound | 2.5 GiB (smaller of 6 GiB and 2.5 GiB) |
Because the absolute maximum limit (2.5 GiB) is smaller than the factor-derived ceiling (6 GiB), the absolute maximum limit is the effective boundary. The limit will never exceed 2.5 GiB, regardless of continued resource pressure.
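The same calculation can be written as a short sketch. The function and parameter names are hypothetical, and values are in GiB.

```python
def effective_upper_bound(original_limit, max_limit=None, max_increase_factor=None):
    """Return the smaller of the configured caps, or None if neither is set."""
    candidates = []
    if max_limit is not None:
        candidates.append(max_limit)
    if max_increase_factor is not None:
        candidates.append(max_increase_factor * original_limit)
    return min(candidates) if candidates else None


def cap_limit(proposed, original_limit, max_limit=None, max_increase_factor=None):
    """Apply the effective upper bound to a proposed limit."""
    upper = effective_upper_bound(original_limit, max_limit, max_increase_factor)
    return proposed if upper is None else min(proposed, upper)


# Values from the example: original limit 2 GiB, factor 3, maximum 2.5 GiB.
print(effective_upper_bound(2.0, max_limit=2.5, max_increase_factor=3))  # 2.5
print(cap_limit(4.0, 2.0, max_limit=2.5, max_increase_factor=3))         # 2.5
print(cap_limit(2.2, 2.0, max_limit=2.5, max_increase_factor=3))         # 2.2
```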
Automation Configuration

Automation Optimization Strategy
Define your optimization strategy per workload type.
| Strategy | Description |
|---|---|
| Ongoing | ScaleOps continuously updates resource requests to ensure the pods get the right amount of resources at all times (recommended) |
| Upon pod creation | ScaleOps updates container resources on new pod creation and never changes them later |
Recommended strategies by workload type:
| Workload Type | Recommended Strategy |
|---|---|
| Deployment | Ongoing |
| StatefulSet | Upon pod creation |
| DaemonSet | Upon pod creation |
| DeploymentConfig | Ongoing |
| Custom Workload | Upon pod creation |
In-Place Optimization Available in v1.17.0+
Optimize your workloads with no disruption to ensure high availability at all times. View in-place documentation
- Enable for ongoing automation strategy
- Enable for upon pod creation automation strategy
Note: ScaleOps always optimizes workloads based on the defined automation strategy and configuration. If in-place optimization is not feasible, optimization will fall back to the defined automation strategy for each workload type.
Java Memory Optimization (Legacy)
Java optimization automation is no longer controlled via policy settings. Use the dedicated Java automation controls (UI toggle, UI actions, or GitOps) instead. See Java Optimization for details.
This legacy setting enables JVM-aware memory optimization for Java workloads. For new deployments, use the dedicated Java automation controls described in the Java Optimization documentation.
| Option | Recommendation |
|---|---|
| Enabled | Legacy - use Java automation controls instead |
| Disabled | (default) |
Zero Downtime Rollout Strategies
Optimize workloads by creating the new optimized pods first, ensuring high availability with no downtime while respecting the workload’s availability.
| Option | Recommendation |
|---|---|
| Workloads with a single replica | Enabled (recommended) |
| Workloads with multiple replicas | Disabled |
Ensure High Availability
ScaleOps optimization takes the workload’s rollout strategy, PDBs, and unevictable annotations into account, and ensures a minimum of 1 replica at all times.
| Option | Recommendation |
|---|---|
| Ensure minimum of 1 replica | Enabled (recommended) |
| Respect unevictable pods by annotation | Enabled (recommended) |
Readiness period buffer
Define a time buffer for pods to become ready, in addition to the existing workload readiness probe.
| Option | Recommendation |
|---|---|
| 5 seconds | (recommended) |
Update HPA Resource-Based Triggers
Update HPA CPU and memory utilization-based triggers to match the new container request sizes.
| Option | Recommendation |
|---|---|
| Enabled | (recommended) |
| Disabled | |
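HPA resource utilization targets are expressed as a percentage of the container request, so changing the request shifts the absolute usage at which scaling triggers. The sketch below shows one plausible adjustment that keeps that absolute threshold unchanged. Whether ScaleOps uses exactly this formula is an assumption, and the function name is hypothetical.

```python
def adjusted_hpa_target(old_target_pct, old_request, new_request):
    """Rescale a utilization target so the absolute trigger point is unchanged.

    Assumed formula (not confirmed by the documentation):
    old_target_pct * old_request == new_target_pct * new_request
    """
    return old_target_pct * old_request / new_request


# A 70% target on a 1000m request triggers at 700m of usage.
# After rightsizing to 500m, the same 700m corresponds to 140%.
print(adjusted_hpa_target(70, old_request=1000, new_request=500))  # 140.0
```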
Actively enforce optimization according to context
Actively enforce optimization based on node stress, noisy neighbors, and application context.
Optimization Upon Automation
Allow ScaleOps to roll out optimizations when a workload is automated.
| Option | Recommendation |
|---|---|
| Enabled | (recommended) |
| Disabled | |
Allowed Rollout Period
Determine when ScaleOps is allowed to optimize workloads.
Default: Days: S-M-T-W-T-F-S From: 00:00 to 23:59
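A rollout window of this shape can be checked as in the sketch below. The function and parameter names are hypothetical, and comparing at minute granularity (so that 23:59:30 still falls inside a window ending at 23:59) is an assumption.

```python
from datetime import datetime, time

ALL_DAYS = frozenset(range(7))  # Monday is 0, Sunday is 6


def in_rollout_window(now, days=ALL_DAYS, start=time(0, 0), end=time(23, 59)):
    """Return True if `now` falls on an allowed day and inside the time range."""
    if now.weekday() not in days:
        return False
    # Compare at minute granularity so the final minute is fully included.
    current = now.time().replace(second=0, microsecond=0)
    return start <= current <= end


# Default window: every day, 00:00 to 23:59.
print(in_rollout_window(datetime(2024, 1, 6, 23, 59, 30)))  # True

# Example restricted window: weekdays only, 22:00 to 23:59.
weeknights = dict(days={0, 1, 2, 3, 4}, start=time(22, 0), end=time(23, 59))
print(in_rollout_window(datetime(2024, 1, 8, 22, 30), **weeknights))  # True
print(in_rollout_window(datetime(2024, 1, 6, 22, 30), **weeknights))  # False
```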
Required Window Coverage
Define the percentage of data points in the history window that is required before changes are applied.
Note: Low coverage will delay ScaleOps from applying optimizations.
| Percentage | Recommendation |
|---|---|
| 2% | (recommended) |
Range: 0% - 100%
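As a rough illustration of the coverage check, the sketch below compares the number of observed data points with the number expected in the history window. The sampling interval and the function names are hypothetical and are not taken from the product.

```python
def window_coverage_pct(observed_points, window_hours, interval_minutes):
    """Percentage of expected data points actually present in the window."""
    expected_points = window_hours * 60 // interval_minutes
    return 100 * observed_points / expected_points


def has_required_coverage(observed_points, window_hours, interval_minutes,
                          required_pct=2):
    """Return True once coverage reaches the required percentage."""
    coverage = window_coverage_pct(observed_points, window_hours, interval_minutes)
    return coverage >= required_pct


# A 24-hour window sampled every 5 minutes expects 288 points,
# so a 2% requirement is met from the 6th data point onward.
print(has_required_coverage(5, window_hours=24, interval_minutes=5))  # False
print(has_required_coverage(6, window_hours=24, interval_minutes=5))  # True
```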
Scheduling Configuration
Bin-Pack Unevictable Pods
Automatically bin-pack unevictable pods to allow nodes to be more consolidated, further reducing costs.
| Option | Recommendation |
|---|---|
| Enabled | (recommended) |
| Disabled | |