Kubernetes fundamentals

To use Magnolia PaaS successfully, you need to understand the key concepts of Kubernetes and how it works.

Here, we highlight some important Kubernetes concepts, show how they work together, and provide other useful details along the way.

What is Kubernetes?

In short, Kubernetes is an open-source system that allows you to manage containerized workloads and services. But you don’t have to hear it from me; take it from the Kubernetes project itself:

Containers are a good way to bundle and run your applications. In a production environment, you need to manage the containers that run the applications and ensure that there is no downtime. For example, if a container goes down, another container needs to start. Wouldn’t it be easier if this behavior was handled by a system?

That’s how Kubernetes comes to the rescue! Kubernetes provides you with a framework to run distributed systems resiliently. It takes care of scaling and failover for your application, provides deployment patterns, and more. For example, Kubernetes can easily manage a canary deployment for your system.

— Kubernetes

Containers

Containers are the lowest level of the Kubernetes hierarchy. They are packages of applications or services bundled together with their execution environments. Containerized applications behave the same whether they run on a laptop or on a distributed server.

They are very useful for CI/CD because they can be created and modified programmatically. You can add programs or applications to a container to suit your needs.

Docker provides the ability to package and run an application in a loosely isolated environment called a container.

There are several containers used in Magnolia PaaS, including:

  • Magnolia CMS web application is run in a container built from a Tomcat server image.

  • Magnolia CMS database is run in a container built from a PostgreSQL database image.

Pods

Pods are the smallest deployable units of computing that you can create and manage in Kubernetes.

A pod consists of one or more application containers that share network and storage resources, so they are relatively tightly coupled. A pod can also contain init containers, which run during pod startup.

Pods can specify a set of shared storage volumes. All containers in the Pod are able to access the shared volumes, allowing those containers to share data. Volumes also allow persistent data in a Pod to survive in case one of the containers within needs to be restarted.
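To make the shared-volume idea concrete, here is a minimal sketch of a pod manifest written as a Python dictionary. The field names follow the Kubernetes Pod API, but the pod name, container names, images, and mount paths are illustrative assumptions and not the actual Magnolia PaaS configuration. The hypothetical helper `containers_mounting` lists which containers in the pod can see a given volume.

```python
import json

# Hypothetical pod manifest: the names, images, and mount paths below are
# illustrative assumptions, not the actual Magnolia PaaS configuration.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "example-pod"},
    "spec": {
        # Init containers run to completion before the app containers start.
        "initContainers": [
            {
                "name": "bootstrap",
                "image": "busybox",
                "command": ["sh", "-c", "echo ready > /shared/status"],
                "volumeMounts": [{"name": "shared-data", "mountPath": "/shared"}],
            }
        ],
        "containers": [
            {
                "name": "web",
                "image": "tomcat",
                "volumeMounts": [{"name": "shared-data", "mountPath": "/shared"}],
            }
        ],
        # An emptyDir volume lives as long as the pod and is visible to
        # every container that mounts it.
        "volumes": [{"name": "shared-data", "emptyDir": {}}],
    },
}


def containers_mounting(manifest, volume_name):
    """Return the names of all containers in the pod that mount the volume."""
    spec = manifest["spec"]
    all_containers = spec.get("initContainers", []) + spec.get("containers", [])
    return [
        c["name"]
        for c in all_containers
        if any(m["name"] == volume_name for m in c.get("volumeMounts", []))
    ]


if __name__ == "__main__":
    print(containers_mounting(pod_manifest, "shared-data"))
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(pod_manifest, indent=2))
```

Because both containers mount the same volume, a file written by the init container is still there when the web container starts.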

In Magnolia PaaS, one pod contains the Magnolia CMS web application container (along with a bootstrap init container) and another pod contains the Magnolia CMS database (along with a Magnolia Backup container).

Note that two pods run on nodes for each Magnolia public or author instance (alongside other platform-internal pods).

Nodes

A node is a machine that runs Docker containers in pods. In a hosted Kubernetes environment, a virtual machine such as an Amazon EC2 instance is used to run workloads.

Typically, a cluster has several nodes. Each node is managed by the control plane and contains the services necessary to run pods.

Specifically, each node runs the following:

  • Kubelet: An agent that monitors the state of the node, ensuring your containers are healthy.

  • Workloads: The containers and pods that hold your apps, as well as other types of deployments.

A node has a resource capacity for CPU, memory, and the number of pods that it can run.
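As a rough illustration of what node capacity means in practice, the sketch below checks whether one more pod fits on a node. It is a deliberately simplified model, not the Kubernetes scheduler: the `NodeCapacity` and `PodRequest` classes and the `fits` function are hypothetical names, and the units (millicores, MiB) are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class NodeCapacity:
    """Hypothetical capacity of a node; units are assumptions."""
    cpu_millicores: int
    memory_mib: int
    max_pods: int


@dataclass
class PodRequest:
    """Hypothetical resources requested by one pod."""
    cpu_millicores: int
    memory_mib: int


def fits(node: NodeCapacity, running: List[PodRequest], new_pod: PodRequest) -> bool:
    """Return True if new_pod fits next to the pods already running on the node."""
    used_cpu = sum(p.cpu_millicores for p in running)
    used_mem = sum(p.memory_mib for p in running)
    return (
        len(running) + 1 <= node.max_pods
        and used_cpu + new_pod.cpu_millicores <= node.cpu_millicores
        and used_mem + new_pod.memory_mib <= node.memory_mib
    )
```

A pod is rejected as soon as any one of the three limits (CPU, memory, pod count) would be exceeded.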

Clusters

Clusters use the Kubernetes container-orchestration system to deploy, maintain, and scale Docker containers.

A cluster is a group of computers that work together as a single system. A Kubernetes cluster therefore consists of the components that make up the control plane and a set of machines called nodes.

The core of Kubernetes' control plane is the API, which lets you query and manipulate the state of objects in Kubernetes. Kubernetes objects are persistent entities in the Kubernetes system. Kubernetes uses these entities to represent the state of your cluster.

In a hosted Kubernetes cluster, a geographical region and a virtual private network such as a VPC are used to isolate resources.

How does it all work together?

  1. Containers are ready-to-run software packages. They contain the code and requisite runtime content as well as all other essential settings and system libraries.

  2. Pods are collections of containers.

  3. Nodes are the resources that house the pods and execute workloads.

  4. Clusters contain multiple nodes.
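The containment hierarchy in the list above can be sketched as nested data classes. This is only a mental model: the class names mirror the concepts, but they are not Kubernetes API types, and the cluster, node, pod, and image names are made up for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Container:
    name: str
    image: str


@dataclass
class Pod:
    name: str
    containers: List[Container] = field(default_factory=list)


@dataclass
class Node:
    name: str
    pods: List[Pod] = field(default_factory=list)


@dataclass
class Cluster:
    name: str
    nodes: List[Node] = field(default_factory=list)

    def container_count(self) -> int:
        """Count every container in every pod on every node."""
        return sum(len(pod.containers) for node in self.nodes for pod in node.pods)


# Illustrative cluster: two nodes, three pods, one container per pod.
cluster = Cluster(
    name="example-cluster",
    nodes=[
        Node("node-1", pods=[
            Pod("web", [Container("app", "tomcat")]),
            Pod("db", [Container("database", "postgres")]),
        ]),
        Node("node-2", pods=[Pod("web-2", [Container("app", "tomcat")])]),
    ],
)
```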

Other important terms

This section contains further important terms related to Kubernetes and Magnolia PaaS.

Workloads

Workloads are applications running on Kubernetes. A workload consists of a set of deployment rules for application scheduling, scaling, and upgrades.

When a pod gets created, the new pod is scheduled to run on a node in your cluster. The pod remains on that node until the pod finishes execution, the pod object is deleted, the pod is evicted for lack of resources, or the node fails.

To manage those scenarios, Kubernetes provides workload resources such as deployments and stateful sets:

  • Deployments are used for Magnolia author web application pods, since they are stateless and can be replaced if needed.

  • Stateful sets are used for Magnolia public web application pods, and also for Magnolia database pods, since they both require a state to track.

Note that in both deployments and stateful sets, the number of replicas represents the number of pods running on nodes. For example, to add a new public instance, the workload’s replicas setting would need to be increased by one.
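The relationship between replicas and pods can be sketched with a workload manifest held as a Python dictionary. The field names (`spec.replicas`, `spec.selector`, `spec.template`) follow the Kubernetes apps/v1 API, but the resource name, labels, and image are assumptions, and `scale_by` is a hypothetical helper. How replicas are actually changed in Magnolia PaaS is not covered here.

```python
import copy

# Hypothetical StatefulSet manifest; names, labels, and image are assumptions.
statefulset = {
    "apiVersion": "apps/v1",
    "kind": "StatefulSet",
    "metadata": {"name": "example-public"},
    "spec": {
        "replicas": 2,
        "serviceName": "example-public",
        "selector": {"matchLabels": {"app": "example-public"}},
        "template": {
            "metadata": {"labels": {"app": "example-public"}},
            "spec": {"containers": [{"name": "web", "image": "tomcat"}]},
        },
    },
}


def scale_by(manifest, delta):
    """Return a copy of the workload manifest with replicas changed by delta."""
    scaled = copy.deepcopy(manifest)
    # Kubernetes defaults replicas to 1 when the field is omitted.
    replicas = scaled["spec"].get("replicas", 1) + delta
    if replicas < 0:
        raise ValueError("replicas cannot be negative")
    scaled["spec"]["replicas"] = replicas
    return scaled


# Adding one instance means asking for one more replica.
scaled = scale_by(statefulset, 1)
```

The helper returns a modified copy rather than editing in place, so the original manifest stays available for comparison.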

Services

Services are a way to expose an application running on a set of pods as a network service. Kubernetes gives pods their own IP addresses and a single DNS name for a set of pods, and can load-balance across them.

When a workload is used to deploy an application, it can create and destroy pods dynamically. Even though each pod gets its own IP address, the set of pods running at one moment could differ from the set of pods running that application a moment later. A service handles this situation by using selectors that match labels in the pod configuration instead of the IP addresses of deployed pods.
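A minimal sketch of label-based selection shows why a service keeps working when pod IP addresses change. The functions `matches` and `select_ips` are hypothetical helpers, the pod data is invented, and the sketch assumes a non-empty equality-based selector. It models the idea only, not the Kubernetes implementation.

```python
def matches(selector, labels):
    """True if every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select_ips(selector, pods):
    """Return the IPs of the pods whose labels match the selector, in order."""
    return [pod["ip"] for pod in pods if matches(selector, pod["labels"])]


# The selector stays the same while the pods behind it are replaced.
selector = {"app": "web"}

pods_before = [
    {"ip": "10.0.0.4", "labels": {"app": "web"}},
    {"ip": "10.0.0.7", "labels": {"app": "db"}},
]
pods_after = [
    {"ip": "10.0.1.9", "labels": {"app": "web"}},
    {"ip": "10.0.1.12", "labels": {"app": "web"}},
    {"ip": "10.0.0.7", "labels": {"app": "db"}},
]

if __name__ == "__main__":
    print(select_ips(selector, pods_before))
    print(select_ips(selector, pods_after))
```

Clients address the service, and the selector decides at any moment which pod IPs stand behind it.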

In Magnolia PaaS, there is a service for each Magnolia web application pod and database pod.

Ingress

Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster. Traffic routing is controlled by rules defined on the Ingress resource.

Services are assumed to have virtual IPs only routable within the cluster network.

An Ingress may be configured to give services externally reachable URLs, load-balance traffic, terminate SSL/TLS, and offer name-based virtual hosting. In Magnolia PaaS, the NGINX Ingress Controller is used.
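Rule-based routing and name-based virtual hosting can be sketched as a lookup from host and path to a service name. This is a simplified model under stated assumptions: exact host match, path-prefix match on whole path segments, and the longest matching path winning. The hostnames, service names, and the `route` helper are hypothetical, and a real Ingress controller supports more than this.

```python
def path_matches(prefix, path):
    """Prefix match on whole path segments: /api matches /api/v1, not /apiary."""
    if prefix == "/":
        return True
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def route(rules, host, path):
    """Return the service for the request, or None if no rule matches."""
    candidates = [
        rule for rule in rules
        if rule["host"] == host and path_matches(rule["path"], path)
    ]
    if not candidates:
        return None
    # Assumption: the most specific (longest) matching path wins.
    return max(candidates, key=lambda rule: len(rule["path"]))["service"]


# Hypothetical rules: two hosts share one entry point (name-based virtual hosting).
rules = [
    {"host": "author.example.com", "path": "/", "service": "example-author"},
    {"host": "www.example.com", "path": "/", "service": "example-public"},
    {"host": "www.example.com", "path": "/api", "service": "example-api"},
]
```

For instance, a request for `www.example.com/api/v1` resolves to `example-api`, while any other path on that host falls back to `example-public`.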

Volumes

A volume is a directory, possibly with some data in it, that is accessible to the containers in a pod.

On-disk files in a container are ephemeral, which causes problems: files are lost when a container crashes, and containers running together in a pod cannot easily share files.

A persistent volume (PV) is a piece of storage that has been provisioned in the cluster. When a pod ceases to exist, Kubernetes destroys ephemeral volumes; however, Kubernetes does not destroy persistent volumes. For any kind of volume in a given pod, data is preserved across container restarts.

A persistent volume claim (PVC) is required for pods to use any persistent storage. A workload mounts a PVC, which refers to a PV, which corresponds to existing storage infrastructure. A PVC specifies a capacity so that a suitable PV is used.
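The role of capacity in matching a claim to a volume can be sketched as follows. This is a simplified model that looks at capacity only and picks the smallest unbound volume that is large enough. Real binding also considers properties such as access modes and storage class, and storage is often provisioned dynamically. The `bind` helper and the volume names are hypothetical.

```python
def bind(claim_gib, volumes):
    """Bind a claim to the smallest unbound volume that is large enough.

    Returns the name of the chosen volume, or None if nothing fits.
    """
    candidates = [
        v for v in volumes
        if not v["bound"] and v["capacity_gib"] >= claim_gib
    ]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda v: v["capacity_gib"])
    chosen["bound"] = True  # a bound volume cannot satisfy another claim
    return chosen["name"]


# Hypothetical pre-provisioned volumes.
volumes = [
    {"name": "pv-small", "capacity_gib": 5, "bound": False},
    {"name": "pv-medium", "capacity_gib": 20, "bound": False},
    {"name": "pv-large", "capacity_gib": 100, "bound": False},
]

if __name__ == "__main__":
    print(bind(10, volumes))
```

A claim for 10 GiB skips the 5 GiB volume because it is too small and takes the 20 GiB one rather than the 100 GiB one.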

ConfigMaps and secrets can be used as configuration files in a volume (as well as environment variables).