Pods restarting

Magnolia Documentation Team

DX Cloud master

Symptom

The pods are unexpectedly and frequently restarting.

Observations

When a Magnolia pod is restarted, you may receive an alert notification like one of these:

Magnolia <pod namespace>/<pod> OOMKilled on <cluster>.
Magnolia instance <pod> on <cluster> is crashlooping.

Kubernetes can restart a Magnolia pod for a variety of reasons, but the most common reasons are:

The Magnolia pod has exceeded its memory limit (also known as an "Out of Memory Kill" or "oomkill" for short).
The Magnolia pod has not responded to its liveness or readiness checks.

You can determine how many times a Magnolia pod has been restarted by visiting the Rancher console and viewing your Magnolia pods or by using kubectl:

Rancher: the console displays the total number of restarts for each set of Magnolia pods. In this example, it shows 40 restarts for prod-magnolia-helm-public because there are two Magnolia public pods with 20 restarts each.

[Figure: Pod restarts]

kubectl: the output shows the number of times each individual pod has been restarted.

$ kubectl -n <namespace> get pods
NAME                             READY   STATUS    RESTARTS   AGE
prod-magnolia-helm-author-0      2/2     Running   0          21d
prod-magnolia-helm-author-db-0   2/2     Running   0          78d
prod-magnolia-helm-public-0      2/2     Running   20         21d
prod-magnolia-helm-public-1      2/2     Running   20         21d
prod-magnolia-helm-public-db-0   2/2     Running   0          78d
prod-magnolia-helm-public-db-1   2/2     Running   0          78d

You can also see why the pod was restarted with kubectl:

$ kubectl -n <namespace> describe pods <pod>
<lots of info>

Look for the Containers: section in the output:

Containers:
  prod:
    Container ID:   docker://98f35fdc444d5a440c6240fe4fdcdf36d2512f9eb900fb9664ce412a07ffcb6e74
    Image:          tomcat:9.0-jre11-temurin
    Image ID:       docker-pullable://tomcat@sha256:5561f39723432f5f09ce4a7d3428dd3f96179e68c2035ed07988e3965
    Port:           <none>
    Host Port:      <none>
    State:          Running
      Started:      Tue, 06 Sep 2022 22:04:02 +0200
    Last State:     Terminated (1)
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Tue, 06 Sep 2022 21:20:15 +0200
      Finished:     Tue, 06 Sep 2022 22:04:02 +0200
    Ready:          True
    Restart Count:  20
    Limits:
      memory:  12Gi
    Requests:
      memory:  12Gi

(1) The `Last State` field shows why the container was terminated (`OOMKilled`) and when (`Finished: Tue, 06 Sep 2022 22:04:02 +0200`). The replacement container started at the same time, as shown under `State`.

Next steps: Out of memory kills

If your Magnolia pod is being oomkilled, you can reduce the maximum heap memory for the JVM or limit the direct (off-heap) memory used by the JVM.

Oomkills are often caused by off-heap memory usage. Reducing the maximum heap memory for the JVM leaves more of the pod's memory available for off-heap use.

Increasing the memory available to the Magnolia instance and reducing the memory used by the JVM heap can prevent oomkills.

Specifying the maximum direct memory for the JVM limits the off-heap memory used by Magnolia. This helps keep the combined heap and off-heap memory below the pod's memory limit, so the pod is not oomkilled.

Setting pod memory limits

Setting a higher memory limit for the Magnolia pod can reduce oomkills.

You can specify the memory limit for a Magnolia author or public instance in the Helm chart values by setting the value for magnoliaAuthor.resources.limits.memory or magnoliaPublic.resources.limits.memory.

We recommend:

at least 6Gi for the memory limit for Magnolia public instances
at least 8Gi for the memory limit for the Magnolia author instance

For more on these Helm values, see the Helm values page.
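As a minimal sketch, the recommended minimum limits could be written in the Helm chart values like this. The nesting is inferred from the dotted key paths magnoliaAuthor.resources.limits.memory and magnoliaPublic.resources.limits.memory given above, so verify it against the values file of your chart version.

```yaml
# Sketch only: structure inferred from the dotted key paths in the text.
magnoliaAuthor:
  resources:
    limits:
      memory: 8Gi   # recommended minimum for the author instance
magnoliaPublic:
  resources:
    limits:
      memory: 6Gi   # recommended minimum for public instances
```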

Setting the JVM maximum heap

The max heap size of the Magnolia JVM is a percentage of the pod’s memory limit. Setting a lower max heap percentage can reduce or eliminate oomkills by allowing more off-heap memory for use by the Magnolia pod.

You can set the percentage in the Helm chart values with magnoliaAuthor.setenv.memory.maxPercentage or magnoliaPublic.setenv.memory.maxPercentage.

The default value is 80 percent. If your Magnolia instances are being oomkilled, we recommend reducing the max heap percentage to between 50 percent and 60 percent.

If you are not setting a direct memory limit for the JVM, we recommend choosing maxPercentage so that at least 2 GB of the pod's memory is left outside the JVM maximum heap.
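For illustration, lowering the max heap percentage from the default might look like the following in the Helm chart values. The value 60 is just one point in the recommended 50 to 60 percent range, and the nesting is an assumption derived from the dotted key paths above.

```yaml
# Sketch only: 60 is an example value, not a tuned recommendation.
magnoliaAuthor:
  setenv:
    memory:
      maxPercentage: 60   # default is 80
magnoliaPublic:
  setenv:
    memory:
      maxPercentage: 60   # with a 6Gi limit: about 3.6Gi heap, about 2.4Gi left for off-heap use
```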

Pod memory versus JVM maximum heap

Because the JVM maximum heap is a percentage of the pod memory limit, changing the limit also changes the heap size. Choose the pod memory limit and the JVM maximum heap percentage so that there is:

at least 6 GB of memory for a Magnolia public instance and at least 8 GB of memory for a Magnolia author instance
at least 2 GB of memory not used by the JVM heap

For example, for a Magnolia public instance, you could set:

pod memory limit: 6Gi
maxPercentage: 50

The Magnolia instance would have a maximum of 3 GB for the JVM heap and 3 GB of memory for off-heap use.

If 3 GB for the JVM heap is not enough, you could set:

pod memory limit: 8Gi
maxPercentage: 50

The Magnolia instance would have a maximum of 4 GB for the JVM heap and 4 GB of memory for off-heap use.
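Expressed as Helm values, the 8Gi example could look like the sketch below. The nesting is inferred from the key paths magnoliaPublic.resources.limits.memory and magnoliaPublic.setenv.memory.maxPercentage named earlier, so check it against your chart version before applying it.

```yaml
# Sketch only: combines the two settings for a Magnolia public instance.
magnoliaPublic:
  resources:
    limits:
      memory: 8Gi          # pod memory limit
  setenv:
    memory:
      maxPercentage: 50    # JVM max heap = 50% of 8Gi = 4Gi; the other 4Gi stays available for off-heap use
```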

If adjusting the memory limit and the max heap percentage does not stop oomkills of your Magnolia pods, we recommend setting a direct memory limit for the JVM.

Next steps: Readiness and liveness check failures

Kubernetes monitors Magnolia public pods and will restart a pod if it fails a readiness or liveness check.

The readiness and liveness checks are performed several times, and Kubernetes will only restart a Magnolia pod if it fails several consecutive checks.

Here’s a sample configuration for the liveness check for the Magnolia public instance:

magnoliaPublic:
  #...
  livenessProbe:
    port: 8765 # The default used by the bootstrapper.
    failureThreshold: 4
    initialDelaySeconds: 120
    timeoutSeconds: 10
    periodSeconds: 30

You can adjust these values in your Helm chart values to:

Allow more time for Magnolia to start up (liveness check only): initialDelaySeconds
Allow more failures of the liveness or readiness checks before restarting: failureThreshold
Allow more time for a liveness or readiness check to be performed: timeoutSeconds
Allow more time between liveness or readiness checks to be performed: periodSeconds
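As an illustration, a more lenient variant of the sample liveness check might look like this. The numbers are arbitrary examples rather than recommendations; only the keys are taken from the sample configuration above.

```yaml
# Sketch only: example values chosen to show the direction of each change.
magnoliaPublic:
  #...
  livenessProbe:
    port: 8765                # unchanged from the sample
    failureThreshold: 6       # was 4: tolerate more consecutive failures
    initialDelaySeconds: 240  # was 120: allow more time for startup
    timeoutSeconds: 20        # was 10: allow each check more time to respond
    periodSeconds: 30         # unchanged: roughly 6 x 30 s = 180 s of failed checks before a restart
```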

You can adjust these values to make liveness and readiness checks more lenient, but this will not necessarily stop Kubernetes from restarting a Magnolia pod.

Before changing the readiness and liveness checks, investigate why Magnolia is not starting up within the initialDelaySeconds threshold, is not responding to checks within the timeoutSeconds threshold, or is not responding over the period controlled by the failureThreshold and periodSeconds thresholds.
