Kubernetes Resource Management
Understanding Kubernetes resource requests and limits, and how to configure them.
What’s the difference between “Requests” and “Limits”?
Kubernetes provides controls for both soft and hard limits on resource consumption. The “request” is how much of a resource a pod is expected to consume. Consider this the “soft limit”; the scheduler uses it to work out which node has room for the pod. The “limit” is the actual “hard limit”: the maximum amount of a resource that the pod can consume. Both are set per container, and a pod’s totals are the sum across its containers.
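As a rough illustration, requests and limits live under a "resources" block on each container in the pod spec. The sketch below builds such a manifest in Python and prints it as JSON, which kubectl accepts as well as YAML. The pod name, image, and all quantities are made-up placeholders, not values from a real cluster.

```python
import json


def build_pod_manifest(name, image, requests, limits):
    """Return a minimal Pod manifest with resource requests and limits.

    Every value passed in below is a hypothetical placeholder.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "resources": {
                        # Used by the scheduler when choosing a node.
                        "requests": requests,
                        # Ceiling enforced while the container runs.
                        "limits": limits,
                    },
                }
            ]
        },
    }


manifest = build_pod_manifest(
    name="some-pod",
    image="example/app:latest",  # hypothetical image
    requests={"cpu": "250m", "memory": "64Mi"},
    limits={"cpu": "500m", "memory": "128Mi"},
)
print(json.dumps(manifest, indent=2))
```

Saving the printed JSON to a file and passing it to kubectl apply -f would be one way to try it out; in practice the same block is usually written directly in YAML.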
If a process consumes more than the “limit” of CPU, it will be throttled. If a process attempts to consume more memory than the “limit”, it will be killed by the kernel’s out-of-memory (OOM) killer, which is reported with an event like this:
Warning OOMKilling Memory cgroup out of memory: Kill process 4481 (stress) score 1994 or sacrifice child
This will show up in the pod status reported by kubectl get pods, like this:
NAME       READY   STATUS      RESTARTS   AGE
some-pod   0/1     OOMKilled   1          24s
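To spot such pods without reading the listing by eye, something like the following sketch could filter it. It assumes the whitespace-separated column layout of the listing above; the second pod in the sample is a hypothetical addition for contrast.

```python
def find_oom_killed(listing):
    """Return the names of pods whose STATUS column reads OOMKilled."""
    rows = [line.split() for line in listing.strip().splitlines() if line.strip()]
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    name_col = header.index("NAME")
    status_col = header.index("STATUS")
    return [
        row[name_col]
        for row in body
        if len(row) > status_col and row[status_col] == "OOMKilled"
    ]


# "other-pod" is a made-up row, added only to show a non-matching case.
sample = """
NAME        READY   STATUS      RESTARTS   AGE
some-pod    0/1     OOMKilled   1          24s
other-pod   1/1     Running     0          3m
"""
print(find_oom_killed(sample))  # ['some-pod']
```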
How can we tell if we’re oversubscribed on CPU/Memory?
By inspecting a “Node” in the Kubernetes dashboard, it’s easy to tell if a cluster is oversubscribed. In the example below, we can see that pods have requested 54% of available CPU, but their CPU limits add up to 100% of available CPU. The scheduler only counts requests, so it would still place more pods here. Even so, nothing else should be scheduled to this node, as 100% of its total CPU capacity has already been allocated as limits.
In terms of memory, we see that the pods have a total memory limit of 4.3GB, which is 113% of available resources. This means the cluster is overcommitted in terms of the maximum permitted amount of memory. In terms of actually requested memory, we’re still under the threshold.
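The percentages in the dashboard are just the summed requests or limits divided by what the node can allocate. A minimal sketch of that arithmetic follows; the 4.3GB limit total comes from the text, while the 3.8GB allocatable figure is an assumption inferred from the 113% reading, not a number taken from the dashboard.

```python
def commitment_percent(total, allocatable):
    """Return summed requests or limits as a percentage of allocatable capacity."""
    if allocatable <= 0:
        raise ValueError("allocatable must be positive")
    return 100.0 * total / allocatable


def describe(label, total, allocatable):
    """Format one dashboard-style line and flag anything above 100%."""
    pct = commitment_percent(total, allocatable)
    state = "overcommitted" if pct > 100 else "within capacity"
    return f"{label}: {pct:.0f}% ({state})"


# 4.3 GB is from the text; 3.8 GB allocatable is an inferred assumption.
print(describe("memory limits", 4.3, 3.8))  # memory limits: 113% (overcommitted)
```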
How can we tell if we have enough CPU/Memory Allocated?
Navigate to the Kubernetes dashboard. Click the “Cluster” menu option on the left (it’s also the default view). At the top, it will show how much CPU and memory are being consumed versus the total available.
In the cluster below, you can see CPU is not constrained, but available memory is very low.
What’s the best way to view resources consumed by a namespace?
Navigate to the Kubernetes dashboard. Click the “Overview” menu option on the left and then filter by the namespace. By default, the “default” namespace is selected. Generally, we don’t use this namespace.
If everything is running smoothly, you should see “Deployments”, “Pods” and “ReplicaSets” all at 100%, which means there are no failures.
Note, you can also select “All Namespaces” to view aggregate information for the cluster.
How can we set resource requests & limits?
It’s possible to restrict the total resources available to a namespace.
- Who is responsible for this? DevOps Admin
- Who sets these limits?
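One common Kubernetes mechanism for capping a namespace is a ResourceQuota object. Whether that is the mechanism intended here is not stated, so treat the sketch below as an assumption. The namespace name and all quantities are hypothetical.

```python
import json


def build_resource_quota(namespace, hard):
    """Return a ResourceQuota manifest capping total requests and limits."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{namespace}-quota", "namespace": namespace},
        # "hard" maps each tracked resource to the namespace-wide maximum.
        "spec": {"hard": hard},
    }


quota = build_resource_quota(
    namespace="team-a",  # hypothetical namespace
    hard={
        "requests.cpu": "2",
        "requests.memory": "4Gi",
        "limits.cpu": "4",
        "limits.memory": "8Gi",
    },
)
print(json.dumps(quota, indent=2))
```

With a quota like this in place, pods in the namespace need to declare requests and limits for the tracked resources, otherwise the API server rejects them.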
Deployment / Pod Level
- Who is responsible for this?
- How do you do it? Edit the helmfile, which is capable of overriding the limits. Redeploy? Relationship to Codefresh?
- We need docs on process for using