Why is Kubernetes so hard?

In the spirit of https://jvns.ca/blog/2023/10/06/new-talk--making-hard-things-easy/ and Matt Surabian’s DevOps Days Boston talk “Teaching and Learning When Words No Good Sense Make”: why is Kubernetes so hard?

In a word: complexity. But let’s break that complexity out into a few dimensions.

Overloading of terms

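Overloading is when a single term can have multiple meanings. For example, “What The Heck Is Ingress?” is a whole episode of the Kubernetes Unpacked podcast devoted to that question. Depending on context, “ingress” can mean the Ingress API resource, the ingress controller that implements it, or simply inbound network traffic.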

Very long, very nested resources defined in YAML

We have a single Kubernetes Composition in our codebase that is 529 lines long and has 5 levels of nesting. That much structure drives Cognitive Complexity up and puts a strain on working memory.
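
To make “levels of nesting” concrete, here is a minimal sketch that measures how deeply a manifest is nested. The manifest is a hypothetical, heavily trimmed Deployment written as the Python dict a YAML parser would produce, not the Composition mentioned above. The counting rule (every mapping and every list adds one level) is also an assumption; counting indentation levels in the YAML file would give a different number.

```python
from typing import Any


def max_depth(node: Any) -> int:
    """Return how many levels of mappings/lists are nested inside `node`."""
    if isinstance(node, dict):
        return 1 + max((max_depth(v) for v in node.values()), default=0)
    if isinstance(node, list):
        return 1 + max((max_depth(v) for v in node), default=0)
    return 0  # scalars add no nesting


# Hypothetical, trimmed Deployment (illustration only).
manifest = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": {"app": "web"}},
        "template": {
            "metadata": {"labels": {"app": "web"}},
            "spec": {
                "containers": [
                    {
                        "name": "web",
                        "image": "nginx:1.25",
                        "ports": [{"containerPort": 80}],
                    }
                ]
            },
        },
    },
}

print(max_depth(manifest))
```

Even this small manifest is several levels deep before any real configuration has been added, which hints at how a 529-line resource ends up hard to hold in your head.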

Distributed system

Kubernetes is a distributed system because it spreads workloads, storage, and operations across multiple machines, coordinating them to appear as one cohesive system from the user’s perspective. This distributed nature allows Kubernetes to provide its core benefits of scalability, resilience, and flexibility. However, that same distributed nature is a major source of difficulty. As the Amazon Builders’ Library article “Challenges with distributed systems” puts it:

“Developing distributed utility computing services, such as reliable long-distance telephone networks, or Amazon Web Services (AWS) services, is hard. Distributed computing is also weirder and less intuitive than other forms of computing because of two interrelated problems. Independent failures and nondeterminism cause the most impactful issues in distributed systems. In addition to the typical computing failures most engineers are used to, failures in distributed systems can occur in many other ways. What’s worse, it’s impossible always to know whether something failed.”
https://aws.amazon.com/builders-library/challenges-with-distributed-systems/
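
The last point, that you can’t always know whether something failed, can be illustrated with a toy simulation. This is only a sketch: `Server`, `call`, and the two fault names are hypothetical and simulate a network in-process. It shows that a lost request and a lost reply look identical to the caller, even though the remote side ends up in different states.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Server:
    """Toy remote service that records every write it actually applied."""
    applied: list = field(default_factory=list)

    def handle(self, value: str) -> str:
        self.applied.append(value)
        return "ok"


def call(server: Server, value: str, fault: Optional[str] = None) -> Optional[str]:
    """Simulate one request. Returns the reply, or None if the caller timed out.

    fault="request_lost": the request never reaches the server.
    fault="reply_lost":   the server applies the write, but the reply is dropped.
    """
    if fault == "request_lost":
        return None
    reply = server.handle(value)
    if fault == "reply_lost":
        return None
    return reply


a, b = Server(), Server()
seen_a = call(a, "x", fault="request_lost")
seen_b = call(b, "x", fault="reply_lost")

# The caller observes the same thing in both cases...
print(seen_a, seen_b)        # None None
# ...but the two servers ended up in different states.
print(a.applied, b.applied)  # [] ['x']
```

From the caller’s side there is nothing to distinguish the two runs, so “retry” and “don’t retry” can both be the wrong answer.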

Declarative syntax

Indirectness of Action:
- Unlike imperative languages where code flows sequentially, in Kubernetes, you specify a desired state. The system’s controllers then work to realize it. This indirect approach can make it difficult to pinpoint problems since you’re not commanding each step but relying on Kubernetes’ logic to interpret and act on your intent.
Asynchronous Behavior:
- While imperative code executes in a predictable sequence, Kubernetes’ actions often run asynchronously. After declaring a desired state, the system might not immediately reflect that outcome. And it can be hard to tell if something has failed or whether it’s still processing.
Error Reporting:
- In imperative programming, errors are usually tied to a specific action or code line. In Kubernetes, however, errors may be symptomatic of deeper issues. For example, a pod in a “CrashLoopBackOff” state might be due to various reasons, demanding a more in-depth examination of logs, events, and configurations to trace the root cause.
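
The “declare a desired state, let a controller converge on it” idea can be sketched as a toy reconcile loop. This is not how any real Kubernetes controller is implemented; `Cluster` and `reconcile` are hypothetical names, and the only point is to show why the observed state lags the declared one.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Toy stand-in for observed cluster state."""
    running: int = 0


def reconcile(desired_replicas: int, cluster: Cluster) -> str:
    """One controller pass: compare desired vs observed and take a single step."""
    if cluster.running < desired_replicas:
        cluster.running += 1
        return "started one replica"
    if cluster.running > desired_replicas:
        cluster.running -= 1
        return "stopped one replica"
    return "in sync"


cluster = Cluster()
desired = 3  # what the user declared

# Right after "applying" the desired state, nothing has changed yet.
print(cluster.running)

# The controller converges over repeated passes, not in one synchronous call.
passes = 0
while reconcile(desired, cluster) != "in sync":
    passes += 1

print(cluster.running, passes)
```

Between the first and last pass the system is neither failed nor finished, and that in-between state is what makes “is it broken or just still working?” a hard question to answer.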

How do we make it easier?

There’s no one simple answer to this. But to start, I recommend:

- COMMENT your Kubernetes!
- Learn the fundamentals: https://jvns.ca/blog/2017/06/04/learning-about-kubernetes/
- When you get discouraged, take a break but don’t give up. You’re not alone in your struggles!

Fault Tolerant

Faults happen – build resiliently
