Architecture
Kubernetes architectures have many variations. For example, many
clusters run kube-apiserver, kube-scheduler, and kube-controller-manager
as containers. This means the control plane nodes may also run
a container runtime, kubelet, and kube-proxy.
Rather than being concentrated in a single operations team that services other teams,
ops expertise will become distributed among many teams.
Each development team will need at least one ops specialist, responsible for the health
of the systems or services the team provides. They will be a developer as well, but they
will also be the domain expert on networking, Kubernetes, performance, resilience,
and the tools and systems that enable the other developers to deliver their code to the
cloud.
Developer Productivity Engineering
The point is that self-service has its limits, and the aim of DevOps is to speed up
development teams, not slow them down with unnecessary and redundant work.
Yes, a large part of traditional operations can and should be devolved to other teams,
primarily those that deploy code and respond to code-related incidents. But to enable
that to happen, there needs to be a strong central team building and supporting the
DevOps ecosystem in which all the other teams operate.
Instead of calling this team operations, we like the name developer productivity
engineering. Some organizations call this role platform engineer or maybe even DevOps
engineer. The point is that these teams do whatever is necessary to help other software
engineering teams do their work better and faster: operating infrastructure, building
tools, busting problems.
And while developer productivity engineering remains a specialist skill set, the
engineers themselves may move outward into the organization to bring that expertise
where it’s needed.
port forwarding
container registry
The kubectl tool is the primary way of interacting with a Kubernetes cluster.
The cluster’s brain is called the control plane, and it runs all the tasks required for
Kubernetes to do its job: scheduling containers, managing Services, serving API
requests, and so on.
The control plane is actually made up of several components: kube-apiserver, etcd,
kube-scheduler, and kube-controller-manager.
Node Components
Cluster members that run user workloads are called worker nodes.
Each worker node in a Kubernetes cluster runs these components: the kubelet, kube-proxy, and a container runtime.
A rare, but entirely possible, kind of failure is losing a whole cloud availability zone.
Cloud vendors like AWS and Google Cloud provide multiple availability zones in
each region, each corresponding roughly to a single datacenter. For this reason, rather
than having all your worker nodes in the same zone, it’s a good idea to distribute
them across two or even three zones.
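One way to express that zone spreading declaratively (a sketch; the Deployment name and replica count are illustrative) is a `topologySpreadConstraints` stanza keyed on the standard `topology.kubernetes.io/zone` node label:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo                 # hypothetical Deployment name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      # Ask the scheduler to keep the per-zone replica counts
      # within 1 of each other.
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: topology.kubernetes.io/zone
          whenUnsatisfiable: ScheduleAnyway
          labelSelector:
            matchLabels:
              app: demo
      containers:
        - name: demo
          image: cloudnatived/demo:hello
```

Note that this only controls where Pods land on whatever nodes exist; the worker nodes themselves also need to be provisioned across zones (most managed node groups can do this for you).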
Trust, but verify
Although high availability should enable your cluster to survive losing some nodes,
it’s always wise to actually test this. During a scheduled maintenance window, or
outside of peak hours, try rebooting a worker and see what happens. (Hopefully,
nothing, or nothing that’s visible to users of your applications.) Then, if you can,
try rebooting a control plane node. See if you are able to continue running kubectl
commands while the node is down.
Run less software and undifferentiated heavy lifting
It’s probably fair to say that Fargate makes sense for simple, self-contained,
long-running compute tasks or batch jobs (such as data crunching) that don’t require
much customization or integration with other services. It’s also ideal for build
containers, which tend to be short-lived, and for any situation where the overhead of
managing worker nodes isn’t justified.
Maintaining Desired State
Kubernetes controllers continually check the desired state specified by each resource
against the actual state of the cluster, and make any necessary adjustments to keep
them in sync. This process is called the reconciliation loop, because it loops forever,
trying to reconcile the actual state with the desired state.
scheduler
https://jvns.ca/blog/2017/07/27/how-does-the-kubernetes-scheduler-work/
principle - declarativity
Because Kubernetes is inherently a declarative system, continuously reconciling actual
state with desired state, all you need to do is change the desired state – the
Deployment spec – and Kubernetes will do the rest. How do you do that?
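As a minimal sketch of one answer (the Deployment name and image follow the demo example used elsewhere in these notes): edit the manifest and reapply it.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo
spec:
  replicas: 5                # changed desired state: was 3
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
        - name: demo
          # Changing the image tag here would trigger a rolling update.
          image: cloudnatived/demo:hello
```

Running `kubectl apply -f deployment.yaml` submits the new desired state, and the Deployment controller reconciles the cluster toward it.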
resources are data
helm and artifact hub
managing resources
That’s where Kubernetes resource requests and limits come in. Kubernetes
understands how to manage two kinds of resources: CPU and memory. There are other
important types of resources, too, such as network bandwidth, disk I/O operations
(IOPS), and disk space, and these may cause contention in the cluster, but Kubernetes
doesn’t yet have a way to describe Pods’ requirements for these.
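A sketch of how requests and limits appear in a container spec (the values are illustrative, not recommendations):

```yaml
spec:
  containers:
    - name: demo
      image: cloudnatived/demo:hello
      resources:
        requests:            # minimum guaranteed; used by the scheduler
          cpu: 100m          # 0.1 of a CPU core
          memory: 128Mi
        limits:              # hard cap; exceeding the memory limit
          cpu: 250m          # gets the container terminated
          memory: 256Mi
```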
overcommitting
Kubernetes allows resources to be overcommitted; that is, the sum of all the resource
limits of containers on a node can exceed the total resources of that node. This is a
kind of gamble: the scheduler is betting that, most of the time, most containers will
not need to hit their resource limits.
If this gamble fails, and total resource usage starts approaching the maximum
capacity of the node, Kubernetes will start being more aggressive in terminating
containers. Under conditions of resource pressure, containers that have exceeded
their requests, but not their limits, may still be terminated.
quality of service
Based on the requests and limits of a Pod, Kubernetes will classify it as one of the
following Quality of Service (QoS) classes: Guaranteed, Burstable, or BestEffort.
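Roughly: a Pod is Guaranteed when every container’s limits equal its requests for both CPU and memory, Burstable when at least one container sets a request but the Pod doesn’t qualify as Guaranteed, and BestEffort when no requests or limits are set at all. A minimal Guaranteed-class sketch (values illustrative):

```yaml
spec:
  containers:
    - name: demo
      image: cloudnatived/demo:hello
      resources:
        requests:
          cpu: 250m
          memory: 256Mi
        limits:              # equal to requests, so QoS class is Guaranteed
          cpu: 250m
          memory: 256Mi
```

Under resource pressure, BestEffort Pods are evicted first and Guaranteed Pods last.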
managing lifecycle
liveness probes
readiness probe
startup probe
practice?
This kind of readiness probe can be useful because if you want to take the container
temporarily out of service to debug a problem, you can attach to the container and
delete the /tmp/healthy file. The next readiness probe will fail, and Kubernetes will
remove the container from any matching Services. (A better way to do this, though,
is to adjust the container’s labels so that it no longer matches the service: see “Service
Resources” on page 61.)
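The probe described above might look like this in the Pod spec (the timing values are illustrative):

```yaml
spec:
  containers:
    - name: demo
      image: cloudnatived/demo:hello
      readinessProbe:
        exec:
          # Succeeds while /tmp/healthy exists; delete the file to
          # take the container out of Service rotation for debugging.
          command: ["cat", "/tmp/healthy"]
        initialDelaySeconds: 5
        periodSeconds: 5
```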
Using Namespaces
Another very useful way of managing resource usage across your cluster is to use
namespaces. A Kubernetes namespace is a way of partitioning your cluster into
separate subdivisions, for whatever purpose you like.
For example, you might have different namespaces for testing out different versions
of an application, or a separate namespace per team. As the term namespace suggests,
names in one namespace are not visible from a different namespace.
This means that you could have a service called demo in the prod namespace, and a
different service called demo in the test namespace, and there won’t be any conflict.
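Namespaces are themselves Kubernetes resources, so they can be created declaratively (a sketch, using the test namespace from the example above):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: test
```

After applying this, resources can target it with the `-n`/`--namespace` flag, e.g. `kubectl get pods -n test`. Cross-namespace DNS names stay unambiguous because the namespace is part of the name: demo.prod and demo.test resolve to different Services.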
Pods
A Pod is a collection of application containers and volumes running in the same execution environment. Pods, not containers, are the smallest deployable artifact in a Kubernetes cluster. This means all of the containers in a Pod always land on the same machine.
Each container within a Pod runs in its own cgroup, but they share a number of Linux namespaces.
distribution of containers
At first, it might seem tempting to wrap both the web server and the Git synchronizer
into a single container. After closer inspection, however, the reasons for the
separation become clear. First, the two containers have significantly different requirements
in terms of resource usage. Take, for example, memory: because the web server is
serving user requests, we want to ensure that it is always available and responsive. On
the other hand, the Git synchronizer isn’t really user-facing and has a “best effort”
quality of service.
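Keeping them as separate containers in one Pod lets each have its own resource requests while sharing a volume. A sketch under those assumptions (image names and mount paths are hypothetical):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web
spec:
  volumes:
    - name: site-data
      emptyDir: {}                     # shared between the two containers
  containers:
    - name: web-server
      image: nginx:1.25
      resources:
        requests:
          cpu: 500m                    # generous: user-facing
          memory: 256Mi
      volumeMounts:
        - name: site-data
          mountPath: /usr/share/nginx/html
          readOnly: true
    - name: git-sync
      image: example/git-sync:latest   # hypothetical synchronizer image
      resources:
        requests:
          cpu: 50m                     # much smaller: best-effort work
          memory: 64Mi
      volumeMounts:
        - name: site-data
          mountPath: /data
```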
wordpress and mysql example
scaling factor
different scaling strategies
resources
In general, the right question to ask yourself when designing Pods is “Will these containers work correctly if they land on different machines?” If the answer is no, a Pod is the correct grouping for the containers. If the answer is yes, using multiple Pods is probably the correct solution. Grouping multiple co-located and co-managed containers in a single Pod is a relatively advanced use case. You should use this pattern only in specific instances in which your containers are tightly coupled.
Generally speaking, copying files into a container is an antipattern. You really should treat the contents of a container as immutable.
Deployments
ReplicaSet
When we define a ReplicaSet,
we define a specification for the Pods we want to create (the “cookie cutter”) and a
desired number of replicas. Additionally, we need to define a way of finding Pods that the ReplicaSet should control. The actual act of managing the replicated Pods is an
example of a reconciliation loop. Such loops are fundamental to most of the design
and implementation of Kubernetes.
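Put together, a minimal ReplicaSet sketch with the three parts named above: the Pod template (the “cookie cutter”), the desired replica count, and the selector that finds the Pods it should control. Names are illustrative.

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: demo-rs
spec:
  replicas: 3                # desired number of copies
  selector:                  # how the ReplicaSet finds "its" Pods
    matchLabels:
      app: demo
  template:                  # the cookie cutter
    metadata:
      labels:
        app: demo            # must match the selector above
    spec:
      containers:
        - name: demo
          image: cloudnatived/demo:hello
```

In practice you would usually write a Deployment, which manages ReplicaSets on your behalf.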
Reconciliation Loops
The central concept behind a reconciliation loop is the notion of desired state versus
observed or current state.
Resource Management
Resources are requested per container, not per Pod, because the different containers
often have very different resource requirements. The total resources requested by the
Pod is the sum of the resources requested by all of its containers.
Requests are used when scheduling Pods to nodes. The Kubernetes scheduler will
ensure that the sum of all requests of all Pods on a node does not exceed the capacity
of the node. Therefore, a Pod is guaranteed to have at least the requested resources
when running on the node. Importantly, “request” specifies a minimum. It does not
specify a maximum cap on the resources a Pod may use. To explore what this means,
let’s look at an example.
Service Discovery
While the dynamic nature of Kubernetes makes it easy to run a lot of things, it creates
problems when it comes to finding those things. Most of the traditional network
infrastructure wasn’t built for the level of dynamism that Kubernetes presents.
selector app=alpaca ? how to use selector in services?
If we look at the SELECTOR column, we see that the alpaca-prod service simply gives
a name to a selector and specifies which ports to talk to for that service. The kubectl
expose command will conveniently pull both the label selector and the relevant ports
(8080, in this case) from the deployment definition.
Furthermore, that service is assigned a new type of virtual IP called a cluster IP. This
is a special IP address the system will load balance across all of the Pods that are
identified by the selector.
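A sketch of the kind of Service that kubectl expose generates in this case (reconstructed, not copied from the source):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: alpaca-prod
spec:
  selector:          # traffic to the Service is load balanced across
    app: alpaca      # all Pods carrying this label
  ports:
    - port: 8080
      targetPort: 8080
      protocol: TCP
```

Inside the cluster, the Service’s cluster IP is reachable by a DNS name of the form service.namespace.svc.cluster.local, e.g. alpaca-prod.default.svc.cluster.local.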
Kubernetes provides a DNS service exposed to Pods running in the cluster. This
Kubernetes DNS service was installed as a system component when the cluster was
first created. The DNS service is, itself, managed by Kubernetes and is a great example
of Kubernetes building on Kubernetes. The Kubernetes DNS service provides DNS
names for cluster IPs.
load balancers
The examples that we have seen so far use external load balancers; that is, load
balancers that are connected to the public internet. While this is great for exposing
services to the world, you’ll often want to expose your application within only your
private network. To achieve this, use an internal load balancer. Unfortunately, because
support for internal load balancers was added to Kubernetes more recently, it is
done in a somewhat ad hoc manner via object annotations.
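For example (annotation keys vary by provider and Kubernetes version, so check your cloud’s documentation), an internal load balancer on AWS has commonly been requested like this:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: demo-internal
  annotations:
    # AWS; GKE has used cloud.google.com/load-balancer-type: "Internal"
    service.beta.kubernetes.io/aws-load-balancer-internal: "true"
spec:
  type: LoadBalancer
  selector:
    app: demo
  ports:
    - port: 8080
```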
Other
kubectl
aliases https://ahmet.im/blog/kubectl-aliases/index.html
jq and playground with it
generating resource manifests
kubectl create deployment demo --image=cloudnatived/demo:hello --dry-run=client -o yaml >deployment.yaml

If you’ve been using imperative kubectl commands to manage your cluster up to
now, and you’d like to switch to the declarative style that we recommend in this book,
this is a great way to do it. Export all the resources in your cluster to manifest files
using kubectl with the -o flag, as shown in the example, and you’ll be all set.
diffing resources
contexts and namespaces
So what happens when you have more than one cluster? For example, maybe you
have a Kubernetes cluster on your machine for local testing and a production cluster
in the cloud, and perhaps another remote cluster for staging and development. How
does kubectl know which one you mean?
To solve this problem, kubectl has contexts. A context is a combination of a cluster, a
user, and a namespace
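Contexts live in your kubeconfig file (usually ~/.kube/config). A sketch of the relevant fragment, with hypothetical names:

```yaml
contexts:
  - name: staging
    context:
      cluster: staging-cluster   # hypothetical cluster entry
      user: staging-admin        # hypothetical user entry
      namespace: default
current-context: staging         # the context kubectl uses by default
```

Switching is `kubectl config use-context staging`; `kubectl config get-contexts` lists what’s available.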
kubectx, kubens, krew, kube-ps1, kube-shell, Click,
kube-sh, Stern
your tools
But running multiple processes in a single container isn’t the best way to use
containers, because then you don’t get the benefits of resource isolation.
If processes don’t need to know about each other, then they don’t need to run in the
same container. A good rule of thumb with a container is that it should do one thing.
versioning practice
You should avoid using the latest tag when deploying containers in production as it
is harder to track which version of the image is running, and more difficult to roll back
properly.
init containers
https://www.baeldung.com/linux/kubernetes-init-containers
Labels
You know that Pods (and other Kubernetes resources) can have labels attached to
them, and that these play an important role in connecting related resources (for
example, sending requests from a Service to the appropriate backends).
Labels aren’t just used for connecting Services and Pods; you can use them directly
when querying the cluster with kubectl get, using the --selector flag:
DaemonSet
StatefulSet
Job
CronJob
HorizontalPodAutoscaler
The HPA uses another popular Kubernetes project called the Metrics Server for
getting the data it needs for making autoscaling decisions. You can install it following
the instructions in the metrics-server repo.
Although CPU utilization is the most common scaling metric, you can use any
metrics available to Kubernetes, including both the built-in system metrics like CPU
and memory usage, and app-specific service metrics, which you define and export
from your application (see Chapter 16). For example, you could scale based on the
application error rate or number of incoming requests per second.
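A sketch of an HPA targeting average CPU utilization, using the autoscaling/v2 schema (the Deployment name follows the demo example; thresholds are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: demo-hpa
spec:
  scaleTargetRef:              # the workload to scale
    apiVersion: apps/v1
    kind: Deployment
    name: demo
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # scale out above 80% average CPU
```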
Operators and Custom Resource Definitions (CRDs)
For example, if you wanted to create a controller object that sets up replicated,
high-availability MySQL database clusters in Kubernetes, how would you go about it?
The first step would be to create a CRD for your custom controller object. In order
to make it do anything, you then need to write a program that communicates with
the Kubernetes API. This is easy to do, as we saw in “Building Your Own Kubernetes
Tools” on page 132. Such a program is called an Operator (perhaps because it
automates the kinds of actions that a human operator might perform).
You can see lots of examples of Operators built and maintained by the community in
the OperatorHub.io site. This is a repository of hundreds of Operators that you can
install on your clusters, or just browse their code to get ideas for building your
own Operators.
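A sketch of what the first step, the CRD, could look like for the MySQL example (the group and schema here are hypothetical):

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: mysqlclusters.example.com   # must be <plural>.<group>
spec:
  group: example.com
  scope: Namespaced
  names:
    kind: MySQLCluster
    singular: mysqlcluster
    plural: mysqlclusters
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                replicas:           # hypothetical field the Operator acts on
                  type: integer
```

With this applied, the API server understands `kubectl get mysqlclusters`; the Operator then watches those objects and does the real work of creating and repairing database clusters.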
Ingress
While Services (see “Service Resources” on page 61) are for routing internal traffic
in your cluster (for example, from one microservice to another), Ingress is used for
routing external traffic into your cluster and to the appropriate microservice.
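A minimal Ingress sketch routing one host to the demo Service (the hostname is hypothetical, and an ingress controller must be installed for the rule to take effect):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: demo-ingress
spec:
  rules:
    - host: demo.example.com       # hypothetical external hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: demo         # the internal Service to route to
                port:
                  number: 8080
```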
Service Mesh
Kubernetes Ingress and Services may be all that you need for routing requests from
clients to your applications, depending on the complexity of your organization. But
there is also a growing interest in a newer concept commonly referred to as a service
mesh. A service mesh is responsible for managing more complex network operations
such as rate-limiting and encrypting network traffic between microservices. Service
mesh tools can also add metrics and logging for requests flowing through the
network, keeping track of how long requests are taking, or tracing where a request
started and what path it took through the various microservices along the way. Some
service mesh tools can handle automatic retries of failed requests, and have the ability
to deny or block inbound or outbound requests as needed.
Universalization
one of the options (ways to think about ‘universalization’)
When we say “foundations,” what do we mean? The most important part to get
right is automation. Importantly, this includes not only automation to deploy your
application(s), but also automation to create and manage the clusters themselves.
When you have a single cluster, it is consistent with itself by definition. However,
when you add clusters, you add the possibility of version skew between all of the
pieces of your cluster. You could have clusters with different Kubernetes versions,
different versions of your monitoring and logging agents, or even something as
basic as the container runtime. All of this variance should be viewed as something
that makes your life harder. Differences in your infrastructure make your system
“weirder.” Knowledge gained in one cluster does not transfer over to other clusters, and problems sometimes occur seemingly at random in certain places because of this
variability. One of the most important parts of maintaining a stable foundation is
maintaining consistency across all of your clusters.
Configuration
ConfigMaps
The ConfigMap is the primary object for storing configuration data in Kubernetes.
You can think of it as being a named set of key-value pairs that stores configuration
data. Once you have a ConfigMap, you can supply that data to an application either
by creating a file in the Pod, or by injecting it into the Pod’s environment.
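A sketch showing both delivery paths (names are hypothetical): a ConfigMap, one key injected as an environment variable, and the whole map mounted as files:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: demo-config
data:
  GREETING: Hello
  config.yaml: |
    verbose: true
---
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  volumes:
    - name: config
      configMap:
        name: demo-config
  containers:
    - name: demo
      image: cloudnatived/demo:hello
      env:
        - name: GREETING           # becomes $GREETING in the container
          valueFrom:
            configMapKeyRef:
              name: demo-config
              key: GREETING
      volumeMounts:
        - name: config             # each key appears as a file,
          mountPath: /config       # e.g. /config/config.yaml
```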
phrases to sort and analyze
a file named kuard-pod.yaml and then using kubectl commands to load that manifest to Kubernetes.
extending
When you’re ready to move beyond scripting calls to the kubectl executable, the
Kubernetes client libraries provide a way to dive deep into the API to build an
operator, a monitoring agent, a new user interface, or whatever your imagination can
dream up.
resources
slack, forums, reddit?
projects
and hands-on experience
kube exercises
https://github.com/eficode-academy/kubernetes-katas
https://github.com/eficode-academy/kubernetes-katas/blob/master/cheatsheet.md
https://github.com/eficode-academy/helm-katas
kubekloud
https://github.com/kubernetes/community