Extended Health Check module

Edition: Incubator (services)

Git: Git

Latest: 1.0

Compatible with Magnolia 6.2, 5.7.

The Extended Health Check module provides an extensible, configurable endpoint for evaluating the "health" of a Magnolia instance. You can use the endpoint to monitor a Magnolia instance, either manually or automatically, for example for autoscaling.

You configure the HTTP status values returned by the health check and the conditions that are checked for each HTTP status.

The Extended Health Check module also provides a store for "health events", significant events that indicate something about the health of the Magnolia instance and can be checked in the extended health check.

You can collect health events from the Magnolia log through log4j configuration, and you can collect health events relating to Magnolia publication failures.

This module is at the INCUBATOR level.

Installing with Maven

Maven is the easiest way to install the module. Add the following to your bundle:

<dependency>
  <groupId>info.magnolia</groupId>
  <artifactId>healthcheck</artifactId>
  <version>1.0</version>
</dependency>

Usage

Health outcomes

Health outcomes define the conditions for which a specific HTTP status is returned by the extended health check.

A health outcome defines:

  • A voter set including one or more health voters or boolean voter sets checking Magnolia health conditions

  • Details returned if the conditions for the health outcome are met (HTTP status and description)

Health outcomes can be disabled (or enabled). A disabled health outcome won’t be examined when an extended health check is requested.

Health outcomes are defined through the module configuration at /modules/healthcheck/config/outcomes. You can add or modify the health outcomes defined there.

Node name Value

modules

     healthcheck
         config
             outcomes
                 <health outcome N>
                 <health outcome N>
                 <health outcome N>

Health outcomes are checked in the order they are defined. The first health outcome whose health voters return true is returned as the result of the health check, and any remaining health outcomes are ignored.
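The first-match rule can be sketched in a few lines of Java. This is only an illustration of the ordering described above, not the module's implementation: OutcomeSketch and firstMatch are hypothetical names, the voter set is reduced to a single BooleanSupplier, and returning an empty Optional when no outcome matches is an assumption, since this page does not say what the endpoint returns in that case.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.BooleanSupplier;

/** Hypothetical model of a health outcome; not the module's actual class. */
final class OutcomeSketch {
    final String name;
    final int httpStatus;
    final boolean enabled;
    final BooleanSupplier voters; // stands in for the outcome's whole voter set

    OutcomeSketch(String name, int httpStatus, boolean enabled, BooleanSupplier voters) {
        this.name = name;
        this.httpStatus = httpStatus;
        this.enabled = enabled;
        this.voters = voters;
    }

    /** Returns the first enabled outcome, in definition order, whose voters return true. */
    static Optional<OutcomeSketch> firstMatch(List<OutcomeSketch> outcomes) {
        for (OutcomeSketch outcome : outcomes) {
            if (!outcome.enabled) {
                continue; // disabled outcomes are not examined
            }
            if (outcome.voters.getAsBoolean()) {
                return Optional.of(outcome); // remaining outcomes are ignored
            }
        }
        return Optional.empty(); // assumption: "no outcome matched" is left to the caller
    }
}
```

Because evaluation stops at the first match, the order of the nodes under outcomes acts as a priority list: place the most severe conditions first.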

Here are the configurable properties of a health outcome:

[Image: health outcomes]

Health voters

Health voters check a single, specific condition about the health of a Magnolia instance. They can be combined with other health voters and boolean voter sets to form complicated logical expressions for a particular health outcome.

The Extended Health Check module includes several health voters:

ContextAvailableVoter

Checks if a Magnolia context is available

The Magnolia context is fundamental to Magnolia operation, so the lack of an available context indicates a serious problem with Magnolia.

ContextAvailableVoter has the following configuration:

Node name Value

<voter name>

The name of the voter.

     class

Should be info.magnolia.health.voters.ContextAvailableVoter

     enabled

true (the default if not specified) or false

If false, the voter will not be evaluated.

     not

true or false (the default if not specified)

If true, the result of the voter will be negated (e.g. !result).

HealthEventPropertyVoter

Checks for specified health events.

The HealthEventPropertyVoter checks whether specific health events exist meeting the configured criteria. You can also specify a threshold for the number of health events found, as well as the expected value of a health event property.

HealthEventPropertyVoter has the following configuration:

Node name Value

<voter name>

The name of the voter.

     class

Should be info.magnolia.health.voters.HealthEventPropertyVoter

     enabled

true (the default if not specified) or false

If false, the voter will not be evaluated.

     not

true or false (the default if not specified)

If true, the result of the voter will be negated (e.g. !result).

     identifier

The identifier of the health event.

Health events have the following identifiers:

  • loggedMessage - the health event was created from a log message

  • publicationError - the health event was created from a publication error

NOTE: If not specified, the identifier will be loggedMessage.

     propertyName

(required) The name of the health event property whose value will be checked.

     propertyValue

(required) The expected value of the health event property.

     predicate

Specifies how the value of propertyName will be compared to the expected propertyValue.

The following comparisons are available:

  • isDefined: property propertyName is defined in the health event

  • equals: propertyValue equals the actual property value

  • notEquals: propertyName is defined in the health event and propertyValue does not equal the actual property value

  • matches: propertyValue is a regular expression that matches the actual property value

  • doesNotMatch: propertyName is defined in the health event and the regular expression propertyValue does not match the actual property value

     threshold

The number of health events matching the identifier, propertyName, propertyValue and predicate that must be exceeded. If more health events than the threshold are found, the voter will return true, otherwise false.

If not specified, threshold will be 0.

     interval

Defines an interval, in milliseconds, counting back from the time the health voter is evaluated.

Health events outside of the interval will not be checked.

Use interval to limit the health events considered (e.g. publication errors within the last 30 minutes).

If interval is less than 0, all health events will be checked (the default if not specified).
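To make the interplay of identifier, predicate, interval and threshold concrete, here is a minimal Java sketch of the settings described above. It is not the module's code: EventSketch and PropertyVoterSketch are hypothetical classes, and several details are assumptions that this page does not settle, namely that an undefined property never satisfies equals or matches, that regular expressions must match the whole value, that the interval boundary is inclusive, and that the predicate defaults to equals.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a stored health event; not the module's actual class. */
final class EventSketch {
    final String identifier;              // "loggedMessage" or "publicationError"
    final long timestamp;                 // epoch milliseconds
    final Map<String, String> properties; // name / value properties of the event

    EventSketch(String identifier, long timestamp, Map<String, String> properties) {
        this.identifier = identifier;
        this.timestamp = timestamp;
        this.properties = properties;
    }
}

/** Illustrative re-implementation of the documented voter settings. */
final class PropertyVoterSketch {
    String identifier = "loggedMessage";  // documented default
    String propertyName;                  // required
    String propertyValue;                 // required
    String predicate = "equals";          // assumption: the default is not documented
    int threshold = 0;                    // documented default
    long interval = -1;                   // negative: consider all health events

    /** Applies the configured predicate to a single event. */
    boolean matches(EventSketch event) {
        String actual = event.properties.get(propertyName);
        if (actual == null) {
            return false;                 // assumption: an undefined property never matches
        }
        switch (predicate) {
            case "isDefined":    return true;
            case "equals":       return propertyValue.equals(actual);
            case "notEquals":    return !propertyValue.equals(actual);
            case "matches":      return Pattern.matches(propertyValue, actual);
            case "doesNotMatch": return !Pattern.matches(propertyValue, actual);
            default: throw new IllegalArgumentException("Unknown predicate: " + predicate);
        }
    }

    /** Returns true when more than threshold matching events fall inside the interval. */
    boolean vote(List<EventSketch> healthLog, long now) {
        int count = 0;
        for (EventSketch event : healthLog) {
            boolean inInterval = interval < 0 || event.timestamp >= now - interval;
            if (inInterval && identifier.equals(event.identifier) && matches(event)) {
                count++;
            }
        }
        return count > threshold;
    }
}
```

Note that the comparison is strictly greater than: with the default threshold of 0, a single matching event is enough for the voter to return true.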

MagnoliaUpdatedNeededVoter

Checks whether Magnolia modules need updating.

The MagnoliaUpdatedNeededVoter checks whether one or more Magnolia modules need updating.

MagnoliaUpdatedNeededVoter has the following configuration:

Node name Value

<voter name>

The name of the voter.

     class

Should be info.magnolia.health.voters.MagnoliaUpdatedNeededVoter

     enabled

true (the default if not specified) or false

If false, the voter will not be evaluated.

     not

true or false (the default if not specified)

If true, the result of the voter will be negated (e.g. !result).

PublicationFailureVoter

Checks for Magnolia publication failures.

The PublicationFailureVoter checks whether a publication failure has occurred.

PublicationFailureVoter has the following configuration:

Node name Value

<voter name>

The name of the voter.

     class

Should be info.magnolia.health.voters.PublicationFailureVoter

     enabled

true (the default if not specified) or false

If false, the voter will not be evaluated.

     not

true or false (the default if not specified)

If true, the result of the voter will be negated (e.g. !result).

     interval

Defines an interval, in milliseconds, counting back from the time the health voter is evaluated.

Publication failures outside of the interval will not be counted.

Use the interval to limit the publication errors considered (e.g. publication errors within the last 30 minutes).

If interval is less than 0, all publication failures will be checked (the default if not specified).

     threshold

The number of publication failures within the specified interval that must be exceeded. If more publication failures than the threshold are found, the voter will return true, otherwise false.

If not specified, threshold will be 0.

QueryVoter

Checks for nodes defined in the JCR repository.

The QueryVoter checks whether nodes in the JCR repository are defined. This voter is useful for checking the messages workspace for system errors like the expiration of the Magnolia license.

QueryVoter has the following configuration:

Node name Value

<voter name>

The name of the voter.

     class

Should be info.magnolia.health.voters.QueryVoter

     enabled

true (the default if not specified) or false

If false, the voter will not be evaluated.

     not

true or false (the default if not specified)

If true, the result of the voter will be negated (e.g. !result).

     workspace

(required) The workspace that will be searched.

     query

A valid JCR-SQL2 query that will be evaluated in the workspace.

     threshold

The number of nodes returned by the query that must be exceeded. If the query returns more nodes than the threshold, the voter will return true, otherwise false.

If not specified, threshold will be 0.

Health events

Health events are collected while Magnolia is running and provide a record that can be checked by health voters. There are two health voters - PublicationFailureVoter and HealthEventPropertyVoter - that use health events; the other voters - ContextAvailableVoter, MagnoliaUpdatedNeededVoter and QueryVoter - all check the state of Magnolia at the time of execution.

Health events are collected from two sources:

  • The Magnolia log

  • The results of Magnolia publications

Both sources can provide valuable insight into what has happened in a Magnolia instance outside of the time Magnolia’s health is being checked.

Health events have:

  • an identifier to indicate where the health event came from: "loggedMessage" for health events from Magnolia logging and "publicationError" from errors occurring during a Magnolia publication

  • name / value properties that depend on where the health event was collected

Health Log

Health events are stored in a health log, and health voters can check the health log for events matching their configuration to assess Magnolia’s health.

The health log can store a limited number of health events:

  • up to 10,000 total health events

  • health events older than 6 hours are discarded

Your health voters should not use intervals longer than 6 hours.
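The two retention limits can be pictured as a bounded queue that is pruned by both size and age. The Java sketch below is an illustration only: HealthLogSketch is a hypothetical class that stores bare timestamps, and it assumes that events arrive in chronological order and that the oldest event is dropped first when the size limit is reached, neither of which is stated on this page.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical bounded store mirroring the documented limits; not the module's class. */
final class HealthLogSketch {
    static final int MAX_EVENTS = 10_000;                   // documented size limit
    static final long MAX_AGE_MILLIS = 6L * 60 * 60 * 1000; // documented age limit (6 hours)

    // Event timestamps in epoch milliseconds, oldest first.
    private final Deque<Long> timestamps = new ArrayDeque<>();

    /** Stores an event and applies both limits, using the event time as "now". */
    void add(long eventTime) {
        timestamps.addLast(eventTime);
        prune(eventTime);
    }

    /** Drops events from the old end while the log is too large or the oldest event is too old. */
    void prune(long now) {
        while (!timestamps.isEmpty()
                && (timestamps.size() > MAX_EVENTS
                    || timestamps.peekFirst() < now - MAX_AGE_MILLIS)) {
            timestamps.removeFirst();
        }
    }

    int size() {
        return timestamps.size();
    }
}
```

Under these limits, a voter whose interval exceeds 6 hours can never see more events than one configured with exactly 6 hours, which is why longer intervals are not useful.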

Collecting health events from Magnolia logs

You can collect health events from Magnolia logs and save them in the health log through Magnolia’s log4j configuration.

You will need to set up two log4j elements:

  • A health log "Appender" to store any matching messages into the health log

  • One or more "Loggers" to select log messages to be saved by the health log appender

You can filter events by both the health log appender (using the "Filters" attribute) and the loggers (using the "level" attribute).

The health log appender is declared in the Extended Health Check module, so you can use it in your log4j configuration without further declarations.

Here’s a sample health log appender:

<HealthMonitor name="license-monitor" messagePattern=".+">
  <PatternLayout pattern="%-5p %c %d{dd.MM.yyyy HH:mm:ss} -- %m%n"/>
</HealthMonitor>

This HealthMonitor appender will save any log message directed toward it (messagePattern will match any non-empty message) with the specified layout pattern.

HealthMonitor will save any matching log message to the health log with the following name / value properties:

  • logLevel: the log level of the message

  • logMessage: the log message

  • logThread: the thread where the message was logged

  • logName: the name of the Logger

  • logCallerFQCN: the fully qualified class name where the message was logged

Here are some sample loggers that select log messages and send them to the HealthMonitor appender above:

<Logger name="info.magnolia.multisite.sites.MultiSiteManager" level="WARN">
  <AppenderRef ref="license-monitor"/>
</Logger>
<Logger name="info.magnolia.sitemesh.config.MagnoliaConfigurableSiteMeshFilter" level="WARN">
  <AppenderRef ref="license-monitor"/>
</Logger>

These loggers select WARN level messages from the Magnolia Multi-Site module (specifically info.magnolia.multisite.sites.MultiSiteManager) and the Magnolia SiteMesh caching module (specifically info.magnolia.sitemesh.config.MagnoliaConfigurableSiteMeshFilter) and send them to the HealthMonitor appender named "license-monitor". MultiSiteManager and MagnoliaConfigurableSiteMeshFilter both report expired licenses at WARN level.

Collecting health events from publications

Errors during a Magnolia publication are not completely captured in the Magnolia logs; the specific error message returned by a Magnolia public instance to the Magnolia author is not recorded in the log of the public instance. Knowing why a publication failed is an important indication of the health of a Magnolia public instance: if the publication failed because of some failure of the JCR repository, the JCR repository of the Magnolia public instance may be corrupted and the instance should be replaced or repaired. On the other hand, some publication errors may be recoverable. For example, publishing a child node whose parent has not been published will cause a publication error that can be remedied by publishing the parent node and republishing the child node.

Publication errors can be collected by a filter. The filter detects publication requests and saves the results of the publication into the health log.

The Extended Health Check module installs a filter "publishingMonitor" before the publication filter "publishing" to collect the result of publications.

If you change either the publishingMonitor filter or the publishing filter, please note:

  • the publishingMonitor filter must be located before the publishing filter in the filter chain to collect publication results

  • the publishingMonitor filter should have the same bypasses configuration as the publishing filter to identify publication requests

If you don’t want to collect publication results in the health log, you can disable the publishingMonitor filter (set its enabled property to false) or delete the publishingMonitor filter.

Health outcomes provided

The Extended Health Check module includes a number of predefined health outcomes:

Name HTTP status Description

error500

500

Magnolia has internal errors.

Couldn’t get a Magnolia context.

error501

501

Magnolia has internal errors.

One or more Magnolia modules need to be updated.

error503

503

Magnolia public instance has publishing failures.

One or more publication errors were found in the health log.

error402

402

Magnolia license has expired!

One or more license expired messages were found in the messages workspace, or one or more license expired log messages were found in the health log.

errorTest

502

Test health error (Magnolia is really OK).

A test outcome (will always be returned) for testing the health check endpoint.

This outcome is disabled on installation of the Extended Health Check module.

Changelog

Version Notes

1.0

Initial release of the module.
