
by Sai Siddharth
Running applications on Kubernetes often means dealing with tasks triggered by events or batch jobs — like processing queues, scheduled workloads, or interacting frequently with databases and APIs. Traditional Kubernetes scaling, using the Horizontal Pod Autoscaler (HPA), primarily relies on CPU and memory usage metrics. While this works great for compute-intensive workloads, it doesn’t align well with scenarios where your applications spend most of their time waiting on external responses or handling event-driven tasks.
In other words, your pods might stay idle or underused even as queues grow or external tasks pile up, forcing you to manually manage scaling during sudden spikes or increased workloads. You end up either provisioning more resources than necessary to handle unexpected spikes or suffering from poor responsiveness during peak loads.
This blog dives into how KEDA (Kubernetes Event-driven Autoscaling) paired with custom metrics can help you scale your Kubernetes workloads more accurately, efficiently, and cost-effectively based on actual workload demands.
KEDA is a lightweight component built specifically for Kubernetes to enhance its autoscaling capabilities. Unlike traditional HPA, KEDA scales your applications based on real-time events and custom-defined metrics, giving you the flexibility to scale precisely according to your workload needs.
KEDA has two main components:
- The KEDA operator (agent), which watches your event sources and activates or deactivates workloads, including scaling them to and from zero.
- The metrics adapter, which exposes event data (such as queue length or a custom count) to the Horizontal Pod Autoscaler so it can drive scaling decisions.
Together, these components empower your Kubernetes clusters to scale dynamically and precisely based on the exact needs of your workloads.
KEDA comes with several built-in scalers for popular platforms such as AWS SQS, PostgreSQL, Elasticsearch, and more. While these built-in options simplify scaling, a custom metrics scaler becomes a huge advantage when the signal you actually want to scale on, such as pending tasks in your own database or a business-specific counter behind your own API, isn't covered by any built-in scaler.
Consider implementing a custom metrics server, for instance using Python. This server securely interacts with your databases (through secure intermediaries like RDS Proxy) to gather essential metrics like pending tasks or queue lengths.
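To make the metric-gathering side concrete, here is a minimal sketch of such a pending-task query. The `tasks` table, its `status` column, and the service name are all assumptions for illustration; sqlite3 stands in for the real database, which in production you would reach through a secure intermediary like RDS Proxy.

```python
import sqlite3

# Hypothetical schema: a `tasks` table with `service` and `status` columns.
# In production this query would run against your real database via RDS Proxy,
# not an in-memory SQLite instance.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tasks (id INTEGER PRIMARY KEY, service TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO tasks (service, status) VALUES (?, ?)",
    [("email-notification", "pending")] * 3 + [("email-notification", "done")],
)

def pending_tasks(service: str) -> int:
    # Count only tasks still waiting to be processed for the given service.
    row = conn.execute(
        "SELECT COUNT(*) FROM tasks WHERE service = ? AND status = 'pending'",
        (service,),
    ).fetchone()
    return row[0]

print(pending_tasks("email-notification"))  # 3 pending tasks in the sample data
```

The returned count is exactly the kind of value the metrics server can hand back to KEDA.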
Here’s how the scaling process looks in action:
1. The custom metrics server queries your database (through RDS Proxy) for the current count of pending tasks.
2. KEDA polls the metrics server’s endpoint at a regular interval.
3. KEDA feeds the returned value to the Horizontal Pod Autoscaler, which scales the target deployment up or down between the configured minimum and maximum replica counts.
Getting started with KEDA is simple. Here’s how you can install it and define a scaling rule with a ScaledObject.
helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda --namespace keda --create-namespace
This installs KEDA in the keda namespace. The chart deploys the KEDA operator and the metrics adapter as separate deployments (recent chart versions also add an admission webhooks deployment).
Below is a sample ScaledObject for an email-notification deployment. This uses a custom metrics API to scale based on a count value.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: email-notification # Name of the ScaledObject (acts as the identifier for this scaling rule)
  namespace: default
  annotations:
    scaledobject.keda.sh/transfer-hpa-ownership: "true" # Ensures KEDA manages HPA ownership
spec:
  scaleTargetRef:
    name: email-notification # Name of the Kubernetes Deployment to scale
  minReplicaCount: 1 # Minimum number of pods
  maxReplicaCount: 5 # Maximum number of pods
  advanced:
    horizontalPodAutoscalerConfig:
      name: email-notification # Optional: custom HPA name
  triggers:
    - type: metrics-api
      metadata:
        targetValue: "10" # When 'count' exceeds this, scale up
        activationTargetValue: "0" # Don't scale from 0 until value > 0
        format: "json" # Expected format from metrics API
        url: "http://custom-metrics-server.default.svc.cluster.local/count?service=email-notification"
        valueLocation: "count" # JSON field name containing the count
        useCachedMetrics: "true" # Enables caching to reduce metric server load
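To make the targetValue semantics concrete, here is a rough sketch of the arithmetic the Horizontal Pod Autoscaler applies for an average-value target like this one. It is a simplification: the real controller also applies tolerances and stabilization windows.

```python
import math

def desired_replicas(metric_value: float, target_value: float,
                     min_replicas: int, max_replicas: int) -> int:
    # Simplified HPA arithmetic for an average-value target:
    # aim for roughly target_value worth of 'count' per pod,
    # clamped to [minReplicaCount, maxReplicaCount].
    raw = math.ceil(metric_value / target_value)
    return max(min_replicas, min(max_replicas, raw))

# With targetValue=10 and the bounds from the ScaledObject above:
print(desired_replicas(27, 10, 1, 5))   # a count of 27 yields 3 replicas
print(desired_replicas(100, 10, 1, 5))  # clamped at maxReplicaCount = 5
print(desired_replicas(0, 10, 1, 5))    # clamped at minReplicaCount = 1
```

This is why a queue of 27 pending emails with a targetValue of 10 would run three pods rather than one.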
You can implement a simple Python-based metrics server using Flask or FastAPI. This server just needs to expose an endpoint that returns a JSON response like:
{ "count": 27 }
KEDA will regularly poll this endpoint to determine whether to scale your pods.
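A minimal sketch of such a server using Flask follows. The endpoint path and `service` query parameter match the URL in the ScaledObject above; the `pending_task_count` stub is an assumption standing in for your real database query.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def pending_task_count(service: str) -> int:
    # Placeholder: a real implementation would query your database
    # (e.g. through RDS Proxy) for the number of pending tasks.
    return 27

@app.route("/count")
def count():
    # KEDA polls this endpoint; the 'service' query parameter selects
    # which workload's backlog to report.
    service = request.args.get("service", "unknown")
    return jsonify({"count": pending_task_count(service)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=80)
```

Deploy this behind a Service named to match the URL in the ScaledObject (here, custom-metrics-server in the default namespace), and KEDA's metrics-api trigger will read the count field from each response.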
By implementing KEDA with custom metrics, you’ll notice several immediate improvements: pods scale in step with actual demand rather than CPU noise, idle over-provisioning shrinks, and sudden spikes are absorbed without manual intervention.
KEDA’s ability to leverage custom metrics for scaling gives your Kubernetes applications the responsiveness, security, and efficiency they need, especially when traditional CPU or memory-based scaling falls short. For anyone running event-driven or batch-processing workloads in Kubernetes, adopting KEDA with custom metrics can deliver substantial improvements in resource management, cost efficiency, and operational flexibility.
You can also check this post on Medium.