Monitoring and logging

Magnolia Cloud integrates with Datadog for logging and monitoring services to help you troubleshoot your instances and ensure the quality of your website.

Datadog accounts

The URL to your Datadog account is found in the Additional Information > Log views URL section of the Cockpit.

To access Datadog, log in via the link in the Cockpit using the credentials given to you by Magnolia.

We provide access to preconfigured dashboards, metrics, and monitors that show how your setup is performing, covering response time, requests, error pages, CPU, and memory.

Each subscription package can have up to 5 Datadog accounts. To request additional accounts (up to that maximum of 5) or changes to your account administration, contact our Magnolia Cloud Helpdesk.

Dashboards

Dashboards are available for your integration, UAT, and live environments. Within each dashboard, you will find a visualisation of certain preconfigured metrics, which are activated by default for your live environment.

Activating metrics for the integration and UAT environments incurs additional costs.

Some examples of the preconfigured metrics are:

  • Last Event - displays information regarding the last events such as a Creation or Deletion Event.

  • Traffic - displays site traffic information such as new and active connections.

  • Load Balancers - displays load balancing information such as response time, HTTP codes, and request counts.

  • Instances - displays information such as CPU usage and heap size per Author and Public instance.

  • Debugging dash - displays information including but not limited to RDS database connections, change in CPUs, and read/write information.

Preconfigured dashboards

The following preconfigured dashboards are provided for you.

To access the dashboards, go to your Datadog dashboard list.
See Datadog’s documentation for help on navigating the dashboards.
Dashboard Notes

Overview Dashboard


The overview dashboard provides a general overview of your site. It is a good starting point for developers who are beginning to troubleshoot a potential issue.

What’s in the dashboard?
  • Event notifications (such as backup events)

  • Site traffic

  • CPU usage

  • Heap size information

  • Load balancing information on both the public and author instances

  • Delta information (such as change in CPU from the previous week)

  • Debugging information (such as log count, read latency, and database connections)

Magnolia Stats


The stats dashboard contains Java Virtual Machine (JVM) and Magnolia Java Management Extensions (JMX) metrics. This dashboard lets you dig deeper into potential issues when troubleshooting.

What’s in the dashboard?
JVM
  • Code Heap

  • Metaspace

  • Compressed Class Space

Magnolia JMX
  • Cache stats

  • Cache flushes

  • JCR Session Count

  • Publishing Statistics

SLO Dashboard


The Service Level Objective (SLO) dashboard provides an overview of cumulative traffic, memory, and storage. The metrics shown here are specific to your contract and allow you to review and evaluate the status and health of your subscription.

What’s in the dashboard?
Traffic Summary (author and public)
  • Accumulated traffic

  • Traffic by timeline

  • Accumulated requests

  • Requests by timeline

  • HTTP 4xx errors (total and on timeline)

  • HTTP 5xx errors (total and on timeline)

Uptime
  • Public availability (30 day, 90 day, 1 year)

  • Author availability (30 day, 90 day, 1 year)

Resource usage
  • CPU usage author

  • CPU usage public

  • Current available storage space (both public and author)

Metrics

Metrics are data that help you measure certain goals and objectives. Magnolia provides preconfigured metrics that are specific to your deployment, and Datadog renders those metrics on the Metrics Explorer page. The Metrics Summary page lists the metrics reported to Datadog from Magnolia within a specified time frame: the past hour, day, or week. You can also search your metrics by name or tag using the search boxes above the Metrics Summary list.

By default, Datadog retains metrics for 15 months. See Datadog Data Collection, Resolution, and Retention for more details.

Generate custom metric reports

The example scenario below shows you how to filter the report to see accumulated traffic over a chosen time period.

You can only generate reports based on the metrics you find in the Metrics Summary list. For more detailed instructions on how to use the Metrics Explorer, check out Datadog’s Metrics Explorer documentation.

  1. To generate the custom metrics report, first navigate to Metrics > Explorer.

  2. From Graph, select the aws.applicationelb.processed_bytes option.

    If you do not see the aws.applicationelb.processed_bytes option, check your Metrics Summary list. If the metric is not listed there either, it is not a preconfigured metric for your deployment.

  3. From Over, choose your organization and your live environment. This is typically labelled organization:<name> and customerenvironment:live respectively.

  4. From One graph per, be sure to choose the activeresource:true option.

  5. On each graph, aggregate with the Sum of reported values.

  6. Choose your time period from the time selection option at the top of the page. For example, the last month.

  7. View your results. You may need to resize your generated graph to see the sum of the traffic. To do this, choose L or XL from the Graph Size tool.
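
The aggregation in steps 5 to 7 amounts to summing the reported byte counts over the chosen period. The following is a minimal Python sketch of that arithmetic only. It assumes you already have the data points as (timestamp, bytes) pairs; the function name and the sample values are hypothetical, and the sketch does not call Datadog.

```python
from typing import Iterable, Tuple


def accumulated_traffic_gb(points: Iterable[Tuple[float, float]]) -> float:
    """Sum the reported byte values and convert to gigabytes (10**9 bytes)."""
    total_bytes = sum(value for _timestamp, value in points)
    return total_bytes / 1e9


# Hypothetical sample: (epoch seconds, processed bytes) for three intervals.
sample = [(1700000000, 2.5e9), (1700003600, 1.5e9), (1700007200, 1.0e9)]
print(accumulated_traffic_gb(sample))  # 5.0
```

The result should match the sum shown on the graph when the same time period and filters are applied.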

Investigate HTTP 500 errors

If you see a high number of HTTP 500 errors (Internal Server Error) in your Datadog dashboards, you should investigate the specific URLs that are causing the errors.

This brief tutorial guides you through investigating HTTP 500 errors to find those specific URLs.

Prerequisites

  • You must be a Magnolia Cloud customer.

If you are a Magnolia Cloud customer, you automatically have access to Datadog. See Datadog accounts for more information.

Checking the dashboard

  1. Sign in to your Datadog account.

  2. Go to your Overview dashboard.

  3. Under Public instances, expand the HTTP 500 section.

  4. Select the timeframe in which there is a spike in the HTTP 500 errors.

    Make a note of (or copy) this timeframe. You’ll need it when checking the logs. See Datadog’s custom time frames for help if needed.

Checking the logs

  1. Go to your Logs in Datadog.

  2. Paste (or enter) the same timeframe from when you were checking the dashboard.

    This is typically located in the top right of the Datadog Logs page.
  3. Type source:elb, status:error, and @http.status_code:500 in the search/filter bar.

  4. To see the specific URL, click on the row you want to investigate and check http > ssl > url or http > http_details.

    For a cleaner view, add @http-url as a column. Go to Options at the top right of the table and ADD A COLUMN.

  5. Under Export, download the log file as a CSV file to help with further analysis.

A log export is limited to 5000 lines. See Datadog’s Export page for more details.
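
For the further analysis mentioned in step 5, one option is to count how often each URL appears in the exported file. The following Python sketch illustrates the idea under stated assumptions: the column name `url` and the sample rows are hypothetical, so check the header row of your own export and adjust the `url_column` argument accordingly.

```python
import csv
import io
from collections import Counter


def count_urls(csv_text: str, url_column: str = "url") -> Counter:
    """Count occurrences of each URL in CSV text that has a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Skip rows where the URL column is missing or empty.
    return Counter(row[url_column] for row in reader if row.get(url_column))


# Hypothetical export content; real exports may use different column names.
sample_csv = """date,status_code,url
2024-01-01T10:00:00Z,500,/checkout
2024-01-01T10:00:05Z,500,/search
2024-01-01T10:00:09Z,500,/checkout
"""

for url, hits in count_urls(sample_csv).most_common():
    print(url, hits)
```

With a real export, read the file contents (for example with `open(path, newline="").read()`) and pass them to `count_urls`. The most frequent URLs are the first candidates to investigate.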

Monitors

Magnolia preconfigures certain monitors which trigger alerts to help keep your system healthy and running well. The monitors are configured using AWS and the Datadog Agent. Currently, these monitors cannot be edited and are primarily intended for Magnolia Support usage. However, you can view the monitors on your Datadog account under Monitors > Manage Monitors.

The following monitors are currently preconfigured:

Monitor Description

Availability

Determines the availability of your site using the health check endpoint from within Magnolia.

CPU

Monitors high CPU usage.

This monitor triggers an alert when the average CPU over the last 10 minutes exceeds the predefined threshold.

Memory

Monitors memory usage.

This monitor triggers an alert when the average percentage of usable memory over the last 10 minutes is lower than the predefined threshold.

High Latency

Monitors latency.

This monitor triggers an alert when the response time over the last 10 minutes is higher than the predefined threshold. This is typically around 10 seconds.

RDS disk space (bytes)

Monitors database storage (in bytes).

This monitor triggers an alert when the free storage space over the last 15 minutes is lower than the predefined threshold.

RDS disk space (%)

Monitors database storage (percentage).

This monitor triggers an alert when the percentage of free storage space over the last 15 minutes is lower than the predefined threshold.

Tomcat Thread pool

Monitors the Tomcat thread pool.

This monitor triggers an alert when 25% or more of the maximum threads in Tomcat were busy over the last 5 minutes.
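
Several of the monitors above follow the same pattern: average a metric over a recent window and compare it with a threshold. The following Python sketch illustrates that pattern for the CPU case only. It is not how Magnolia or Datadog implement the monitors; the function name, the sample values, and the threshold of 80 are hypothetical.

```python
from statistics import mean
from typing import Sequence, Tuple


def average_exceeds(
    samples: Sequence[Tuple[float, float]],
    now: float,
    window_seconds: float,
    threshold: float,
) -> bool:
    """Return True if the mean of samples within the window is above threshold."""
    recent = [value for ts, value in samples if now - window_seconds <= ts <= now]
    # With no samples in the window there is nothing to alert on.
    return bool(recent) and mean(recent) > threshold


# Hypothetical CPU percentages as (seconds, value) pairs.
cpu = [(0, 50.0), (300, 90.0), (600, 95.0), (900, 85.0)]

# Last 10 minutes (600 s) ending at t=900: values 90, 95, 85, mean 90.
print(average_exceeds(cpu, now=900, window_seconds=600, threshold=80.0))  # True
```

For the memory and disk space monitors the comparison is reversed, because they alert when the value falls below the threshold.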

Logs

Preconfigured logs are activated by default for your live, UAT and integration environments. By default, logs are stored for 15 days. If you require a longer storage period, contact our Helpdesk to discuss cold storage. Cold storage is not provided by default and comes at an extra charge. Datadog provides a feature for archiving the logs to S3 and the indexes can be rehydrated from there when needed.

Querying logs

You can query logs using the Log Explorer from within your Datadog account.

  1. Navigate to Logs > Search in Datadog.

  2. Set the timeframe from which you would like to filter the logs. This is found in the top ribbon on your Datadog page.

  3. Set the log filters you wish to apply. For example, Host and Status.

  4. The Datadog log explorer then displays your filtered results.

For full details on querying logs from within Datadog, see Datadog’s Logs Documentation.
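
Filters set in step 3 correspond to `key:value` terms in the search bar, and terms separated by spaces are combined. The following Python sketch composes such a search string from a set of filters. The helper name and the host value are hypothetical, and the sketch only builds text to paste into the search bar.

```python
from typing import Mapping


def build_log_query(filters: Mapping[str, str]) -> str:
    """Join key:value filters with spaces, in the order they were given."""
    return " ".join(f"{key}:{value}" for key, value in filters.items())


# "my-public-host" is a placeholder; use a host name from your own account.
print(build_log_query({"host": "my-public-host", "status": "error"}))
# host:my-public-host status:error
```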

Audit logs

In addition to Datadog logging, you can enable audit trail logs in Magnolia.

An audit trail allows an administrator to record user activity in the system. The audit trail typically captures the who, what, when and where. The default implementation is based on Log4j 2.

The audit trail logs can be filtered using the magnolia.audit source in the Datadog log stream.

By default, logs are stored for 15 days.
See Audit for more information.