Synchronization module

Collaboration Bundled: DX Core

Edition: DX Core
License: MLA
Latest: 2.0.1

The Synchronization module synchronizes a target Magnolia instance with a source instance. The module allows you to publish a large amount of content selectively. Only previously published content is transferred to the target instance. You can use the module to add content to new public instances without shutting down existing instances and impacting their ability to serve content.

The module traverses the node tree and publishes only previously published content. Content that was never published isn’t published during synchronization either. If content was versioned when it was published (Magnolia default behavior), the module publishes the last known version, making it possible to recover modified content.

The Synchronization module installs the Sync Instance app, which lets you manually synchronize content between the author instance and one public instance.

Content publishing and content synchronization

Wait until all content synchronization tasks have finished before you start any content publishing action.

If you attempt to publish a page while the Synchronization module is mid-sync and hasn’t yet synchronized the page’s parent, the publishing process will fail.

Installing with Maven

Bundled modules are automatically installed for you.

If the module is unbundled, add the following dependency to your bundle, in both your project’s <dependencyManagement> section and your webapp’s <dependencies> section. If the module is unbundled but the parent POM manages the version, add it only to your webapp’s <dependencies> section.

<dependency>
  <groupId>info.magnolia.synchronization</groupId>
  <artifactId>magnolia-synchronization</artifactId>
  <!-- Should you need to specify the module version, do it using <version>. -->
  <version>2.0.1</version>
</dependency>

To use the Sync Instance app, also add the following module:

<dependency>
  <groupId>info.magnolia.synchronization</groupId>
  <artifactId>magnolia-synchronization-app</artifactId>
  <!-- Should you need to specify the module version, do it using <version>. -->
  <version>2.0.1</version>
</dependency>
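
The placement rules described in this section can be sketched as follows. This is a minimal illustration that assumes a hypothetical project with a parent POM and a separate webapp POM. The two fragments belong to different files, and your project layout may differ.

```xml
<!-- Parent POM (hypothetical layout): declares the version once. -->
<dependencyManagement>
  <dependencies>
    <dependency>
      <groupId>info.magnolia.synchronization</groupId>
      <artifactId>magnolia-synchronization</artifactId>
      <version>2.0.1</version>
    </dependency>
  </dependencies>
</dependencyManagement>

<!-- Webapp POM: no <version> element, because the managed version from the parent applies. -->
<dependencies>
  <dependency>
    <groupId>info.magnolia.synchronization</groupId>
    <artifactId>magnolia-synchronization</artifactId>
  </dependency>
</dependencies>
```

The same pattern would apply to magnolia-synchronization-app if you also need the Sync Instance app.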

Use cases

Public instance failure

Imagine that you have two public Magnolia instances. Due to hardware failure, one of them is out of operation. As you try to publish content during the outage, transactional publishing tells you that the content can’t be published because one of the public instances is down.

Since you really need to publish new content to the remaining public instance, you make a conscious decision to switch off the receiver that suffered the hardware failure. Now you can publish the content while waiting for the failed hardware to be replaced.

A few days later, the hardware on the failed public instance has been replaced and the server is up again. You re-enable the receiver so that all new content is published to both public instances. But you still need to decide what to do with all the content that was published while the instance was down. Your options are:

  • Republish everything to both instances. This is a time-consuming process, generates high load, and slows down your site during publishing.

  • Republish only the pages that were published during the outage, provided you kept a list of them. This works well for small deployments with infrequent updates and a single editor.

  • Use the Synchronization module. Set the previously broken public instance as a synchronization receiver and Magnolia will take care of syncing the missing content.

Public instance blackout

All public instances are corrupted, broken or compromised. No instances exist to serve content. A small site can deal with this by creating a new public instance and publishing all content to it. This is difficult in large deployments that have many editors, where content has already been modified since the blackout took effect, and where some pages are not yet ready to be published across the site. Use the Synchronization module to publish any previously published versions of content, even if the content was modified further, and skip any pages that weren’t previously published.

High load

You have a sudden high load on your site. You need to add a new public instance to deal with the load.

  • You can’t shut down any of the existing public instances because you need them to deal with the load. This prevents taking a snapshot for cloning.

  • You can’t publish all content from the author instance to public instances since this would unnecessarily flush the cache on them and increase load when the servers are already busy.

The solution is to create a new empty public instance and use the Synchronization module to publish content only to that instance while the existing public instances keep serving content.

Configuration

The Synchronization module is configured at Configuration > /modules/synchronization-core.

Note that the mgnlSystem and mgnlVersion workspaces can’t be synchronized using the Synchronization module or Sync Instance tool.

Synchronization command

Synchronization is controlled by info.magnolia.synchronization.commands.SynchronizationCommand, which extends BaseActivationCommand and is registered in /modules/synchronization-core/commands/synchronization.

Property        Required                     Description
syndicator      optional                     Registers the syndicator class.
  class         required                     SilentXASyndicator performs synchronization of content without updating metadata.
  recursive     optional, default is false   Executes recursively when set to true.
receivers       required                     Synchronization receiver configurations (see below).

Synchronization receivers

Configure the target instance as a receiver under /modules/synchronization-core/commands/synchronization/synchronize/receivers.

[Screenshot: Configuring the receiver]

The configuration under /modules/publishing-core/config/receivers looks like this:

[Screenshot: Receiver configuration]

Manual synchronization

The Sync Instance tool allows you to manually synchronize content between an author instance and a single public instance. The operation is performed asynchronously.

To perform a manual synchronization:

  1. Select the workspace.

  2. Type the path to be published.

  3. Optionally select:

    1. A date (the content publishing date) from which you want to synchronize.

    2. The Recursively synchronize subnodes checkbox, to publish all child nodes recursively.

  4. Click Start. The synchronization job starts within one minute.

    You must wait for the job to start before starting another one. Otherwise you will overwrite the previous job.

  5. Click Refresh to see the current status of the synchronization.

Path examples:

Path                     Repository   Recursive   Synchronizes
/                        website      Yes         All website pages.
/travel/about/company    website      No          The company page only.
/travel/about            website      Yes         All pages under the about page.
/admin/jsmith            users        No          User jsmith only.
/admin                   users        Yes         All admin level users.

Scheduling synchronization

You can schedule synchronization jobs using the Scheduler module. The purpose of scheduling isn’t to synchronize an instance repeatedly, because this leads to unnecessary flushing of the cache and increases load. The aim is to schedule the sync to occur at a convenient later time such as during low traffic volume.

To configure a scheduled synchronization job, copy the /modules/scheduler/config/jobs/demo node and edit its properties.

[Screenshot: job configuration]

Property          Required                     Description
<job name>        required                     Name of the job, demo in our example.
  params          optional                     Parameters passed to the command. info.magnolia.synchronization.commands.SynchronizationCommand takes the following parameters:
    path          required                     Path to the content, for example /demo-project/about.
    recursive     optional, default is false   Set to true to synchronize the node and its subnodes.
    repository    required                     Workspace where the content resides, for example website.
  catalog         required                     Name of the catalog where the command resides. SynchronizationCommand resides in the synchronization folder.
  command         required                     Name of the command definition node, synchronize.
  cron            optional                     Schedule that indicates the execution time, written as a CRON expression. In our example 0 0 * * * * will run the job every hour.
  description     optional                     Job description.
  enabled         optional, default is true    Enables and disables the job.

Test the synchronize command unscheduled first. If it runs correctly, schedule it to publish to a new public instance within a minute (CRON expression: 0 * * * * ?). If this works correctly too, point the receiver configuration to the out-of-sync target instance and modify the CRON schedule as required.
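
As an illustration of how the job properties fit together, here is a sketch of a job node in JCR system-view XML. Several details are assumptions rather than values from this page: the job name synchronizeAbout is hypothetical, the mgnl:contentNode node type and the String and Boolean property types are assumed, and the comment on the cron value assumes Quartz-style syntax. The catalog, command, path, repository and cron values come from the text above.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical job node, meant to sit under /modules/scheduler/config/jobs. -->
<sv:node sv:name="synchronizeAbout" xmlns:sv="http://www.jcp.org/jcr/sv/1.0">
  <sv:property sv:name="jcr:primaryType" sv:type="Name">
    <sv:value>mgnl:contentNode</sv:value>
  </sv:property>
  <!-- Catalog and command definition node as named in the property table. -->
  <sv:property sv:name="catalog" sv:type="String">
    <sv:value>synchronization</sv:value>
  </sv:property>
  <sv:property sv:name="command" sv:type="String">
    <sv:value>synchronize</sv:value>
  </sv:property>
  <!-- Test schedule from the text; assumed to fire at second 0 of every minute. -->
  <sv:property sv:name="cron" sv:type="String">
    <sv:value>0 * * * * ?</sv:value>
  </sv:property>
  <sv:property sv:name="description" sv:type="String">
    <sv:value>Synchronizes /demo-project/about and its subnodes</sv:value>
  </sv:property>
  <sv:property sv:name="enabled" sv:type="Boolean">
    <sv:value>true</sv:value>
  </sv:property>
  <!-- Parameters passed to SynchronizationCommand. -->
  <sv:node sv:name="params">
    <sv:property sv:name="jcr:primaryType" sv:type="Name">
      <sv:value>mgnl:contentNode</sv:value>
    </sv:property>
    <sv:property sv:name="path" sv:type="String">
      <sv:value>/demo-project/about</sv:value>
    </sv:property>
    <sv:property sv:name="recursive" sv:type="Boolean">
      <sv:value>true</sv:value>
    </sv:property>
    <sv:property sv:name="repository" sv:type="String">
      <sv:value>website</sv:value>
    </sv:property>
  </sv:node>
</sv:node>
```

Once the test run succeeds, you would replace the cron value with a schedule that fits a low-traffic window, or set enabled to false, so that the job does not keep synchronizing every minute.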
