Algolia Search Index Feeder

Edition

Incubator (services)

Issues

Git

Git

Latest

1.0.5

Compatible with Magnolia 6.2.x.

This is the Magnolia Algolia search indexing module. It lets you feed Magnolia-managed content into an Algolia index. The module is fully event driven, so incremental updates are sent to the index on events such as publication or unpublication.

This module is at the INCUBATOR level.

Installing with Maven

Maven is the easiest way to install the module. Add the following to your bundle:

<dependency>
  <groupId>info.magnolia.algolia</groupId>
  <artifactId>algolia-search-index-feeder</artifactId>
  <version>1.0.5</version>
</dependency>

Configuration

The Algolia indexer is based on Magnolia's commands API and uses the webhooks module. Like the webhooks module, it lets you register one workspace per configuration. If multiple workspaces have to be indexed, one dedicated YAML file per workspace is required.

  1. To do so, provide a light module like this in your magnolia.resources.dir:

    .
    ├── addAlgoliaFeeders
    │   └── webhooks
    │       ├── feedDamWorkspaceData.yaml
    │       └── feedWebsiteWorkspaceData.yaml
  2. Configure each webhook as follows:

    class: info.magnolia.algolia.SearchIndexFeederCommandDefinition (1)
    description: Feeds the specified data below into the algolia search index.
    catalog: algolia (2)
    commandName: feedIndex (3)
    asynchronous: false (4)
    applicationID: "XXXXXXXXXX" (5)
    apiKey: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" (6)
    ...
    trigger:
      workspace: website (7)
      actions: (8)
        - PUBLISH
        - UNPUBLISH
    ...
    indexName: example.com-company (9)
    useOneIndexPerWorkspace: true (10)
    1 The heart of this module: the SearchIndexFeederCommand.
    2 The catalog name, as specified in the bootstrap file.
    3 The command name, as specified in the bootstrap file.
    4 If this command needs to run asynchronously, the command above must be adapted to run its JCR operations via MgnlContext.doInSystemContext().
    5 Customize: your applicationID, as shown in the URL after you log in to Algolia.
    6 Customize: your Algolia apiKey. Get this value from Algolia. We recommend that you use a custom API key rather than the default Admin API Key to perform the indexing process.
    7 Customize: the name of the Magnolia workspace that contains the data to be fed.
    8 The events on which the command is called.
    9 Customize: the index name in Algolia. The index is created automatically if it does not exist.
    10 Customize: if true, the indexName setting is suffixed with the workspace name using this naming scheme: indexName + "_" + workspace.
    Example

    Applied to the example above, the index example.com-company_website would be created in Algolia.

    If you choose false, the indexName setting remains untouched, but the workspace name is used to prefix entries in Algolia to prevent name clashes.

    Algolia uses a special property, objectID, to identify index items.

    This property is populated with the JCR node path (prefixed with the workspace name when useOneIndexPerWorkspace is set to false), which makes the data easier to read when debugging.
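
The naming rules above can be illustrated with a short Python sketch. It is not the module's code (the module itself is a Magnolia/Java component). The function names are hypothetical, and the separator between workspace and node path in the objectID is an assumption, since the text only says the path is "prefixed with the workspace".

```python
def resolve_index_name(index_name: str, workspace: str,
                       use_one_index_per_workspace: bool) -> str:
    """Return the name of the Algolia index that records would be written to."""
    if use_one_index_per_workspace:
        # Naming scheme stated in the documentation: indexName + "_" + workspace
        return index_name + "_" + workspace
    # With false, the configured index name is used unchanged.
    return index_name


def build_object_id(node_path: str, workspace: str,
                    use_one_index_per_workspace: bool,
                    separator: str = ":") -> str:
    """Return an objectID for a JCR node.

    The separator is an assumption made for this sketch; the module's
    actual prefix format is not documented here.
    """
    if use_one_index_per_workspace:
        # One index per workspace: the node path alone is unambiguous.
        return node_path
    # Shared index: prefix with the workspace to prevent name clashes.
    return workspace + separator + node_path


if __name__ == "__main__":
    print(resolve_index_name("example.com-company", "website", True))
    print(build_object_id("/home/about", "website", False))
```

With useOneIndexPerWorkspace set to true, the first call yields example.com-company_website, matching the example above.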

Specify the content to be fed

  • Wildcard approach

    Example config:

    contentSelectionAttribute: "*"
    contentAttributesToPush:
      "*":
        - title
        - description

    With this configuration, all content published in the specified workspace is pushed to Algolia.

    Define properties that all content has in common. If a content item lacks a specified property, that property is omitted from the payload of the push.

  • Selective approach

    With this approach you can filter the specific data you want to push to Algolia. With the example configuration below, only pages of the website workspace that are based on the "mtk2:pages/basic" template are considered for Algolia feeding.

    Example config:

    contentSelectionAttribute: "mgnl:template"
    contentAttributesToPush:
      "mtk2:pages/basic":
        - title
        - description
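
The following Python sketch shows one way to read the two approaches above. It is an illustration only, not the module's implementation: the function name build_payload and the dictionary representation of a node's properties are assumptions, and it interprets "omitted in the payload" as dropping the missing property rather than the whole record.

```python
from typing import Any, Dict, List, Optional

WILDCARD = "*"


def build_payload(properties: Dict[str, Any],
                  content_selection_attribute: str,
                  content_attributes_to_push: Dict[str, List[str]]
                  ) -> Optional[Dict[str, Any]]:
    """Pick the attributes to push for one published node.

    Returns None when the node is not selected by the configuration.
    """
    if content_selection_attribute == WILDCARD:
        # Wildcard approach: every node in the workspace is selected.
        wanted = content_attributes_to_push.get(WILDCARD)
    else:
        # Selective approach: the value of the selection attribute
        # (for example mgnl:template) decides which attribute list applies.
        selector_value = properties.get(content_selection_attribute)
        wanted = content_attributes_to_push.get(selector_value)

    if wanted is None:
        return None

    # Properties the node does not have are left out of the payload.
    return {name: properties[name] for name in wanted if name in properties}


if __name__ == "__main__":
    page = {"mgnl:template": "mtk2:pages/basic", "title": "Home"}
    print(build_payload(page, "*", {"*": ["title", "description"]}))
    print(build_payload(page, "mgnl:template",
                        {"mtk2:pages/basic": ["title", "description"]}))
```

In both calls the sample page has no description property, so only title ends up in the payload.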

Limitations

Keep in mind that records cannot exceed a certain size limit in Algolia. This limit might depend on your plan. See the Algolia pricing page for more details. If you try to index a record that exceeds the limit, Algolia returns a "Record is too big" error.

See the Algolia documentation for more details on limitations. The Algolia Search Index feeder module uses the algoliasearch-core 3.16.5 library.
Depending on the limits of your current plan, you can restrict the size of rich text properties in Magnolia with a maxlength setting.
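
To get a feeling for whether a record might hit the size limit, a rough estimate can be made before content is published. The Python sketch below is an approximation under stated assumptions: the helper names are hypothetical, the byte limit is a placeholder you would replace with the value for your plan, and Algolia's own size calculation may differ from the JSON length used here.

```python
import json
from typing import Any, Dict


def record_size_bytes(record: Dict[str, Any]) -> int:
    """Approximate a record's size as the UTF-8 length of its JSON form."""
    return len(json.dumps(record, ensure_ascii=False).encode("utf-8"))


def fits_record_limit(record: Dict[str, Any], max_bytes: int) -> bool:
    """Check the record against a plan-specific limit (placeholder value)."""
    return record_size_bytes(record) <= max_bytes


def truncate_attribute(record: Dict[str, Any], attribute: str,
                       max_length: int) -> Dict[str, Any]:
    """Return a copy with one text attribute cut to max_length characters.

    This mirrors the effect a maxlength setting on a rich text property
    would have; the original record is not modified.
    """
    value = record.get(attribute)
    if isinstance(value, str) and len(value) > max_length:
        return {**record, attribute: value[:max_length]}
    return record


if __name__ == "__main__":
    record = {"title": "Home", "description": "x" * 500}
    limit = 100  # placeholder, not an actual Algolia limit
    print(fits_record_limit(record, limit))
    print(fits_record_limit(truncate_attribute(record, "description", 50), limit))
```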

Changelog

Version  Notes

1.0.5

1.0.4

1.0.2    Algolia review comments adapted.

1.0.1

1.0.0    Initial release.
