Algolia Search Index feeder
This is the Magnolia Algolia search indexing module. It lets you feed Magnolia-managed content into an Algolia index. The module is fully event-driven: incremental updates are sent to the index on events such as publication or depublication.
Configuration
The Algolia indexer is based on Magnolia's Commands API and uses the Hooks API module.
Like the Hooks module, it allows one workspace to be registered per configuration.
If multiple workspaces have to be indexed, one dedicated YAML file per workspace is required.
To do so, provide a light module like this in your magnolia.resources.dir:
.
├── addAlgoliaFeeders (1)
│   └── hooks (2)
│       ├── feedDamWorkspaceData.yaml
│       └── feedWebsiteWorkspaceData.yaml
1 Root light module level.
2 Note that the hooks directory sits at the second level within the light module.
Configure each hook as follows:
class: info.magnolia.indexfeeder.command.AlgoliaSearchIndexFeederCommandDefinition # (1)
description: Feeds the specified data below into the algolia search index.
catalog: algolia # (2)
commandName: feedIndex # (3)
asynchronous: false # (4)
applicationID: "XXXXXXXXXX" # (5)
apiKey: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" # (6)
...
trigger:
  workspace: website # (7)
  actions: # (8)
    - PUBLISH
    - UNPUBLISH
...
indexName: example.com-company # (9)
useOneIndexPerWorkspace: true # (10)
1 The heart of this module: the SearchIndexFeederCommand.
2 The same catalog name as specified in the bootstrap file.
3 The same command name as specified in the bootstrap file.
4 If this command needs to run asynchronously, the command above must be adapted to run its JCR operations via MgnlContext.doInSystemContext().
5 Customize: your applicationID, as shown in the URL after you log in to Algolia.
6 Customize: your Algolia apiKey. Get this value from Algolia. We recommend using a custom API key rather than the default Admin API Key for the indexing process.
7 Customize: the name of the Magnolia workspace that contains the data to be fed.
8 The events on which the command is called.
9 Customize: the index name in Algolia. The index is created automatically if absent.
10 Customize: if true, the indexName setting is suffixed with the workspace name using this naming scheme: indexName + "_" + workspace.
Example
For the example above, the index example.com-company_website would be created in Algolia.
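The naming scheme can be sketched in a few lines of Java. This is only an illustration of the rule described above, not the module's actual code; the class and method names (`IndexNames`, `resolve`) are hypothetical.

```java
/** Hypothetical helper illustrating the index naming scheme; not part of the module's API. */
final class IndexNames {
    private IndexNames() {}

    /**
     * Returns indexName + "_" + workspace when one index per workspace is used,
     * and the unchanged indexName otherwise.
     */
    static String resolve(String indexName, String workspace, boolean useOneIndexPerWorkspace) {
        return useOneIndexPerWorkspace ? indexName + "_" + workspace : indexName;
    }

    public static void main(String[] args) {
        // prints "example.com-company_website"
        System.out.println(resolve("example.com-company", "website", true));
        // prints "example.com-company"
        System.out.println(resolve("example.com-company", "website", false));
    }
}
```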
If you choose false, the indexName setting remains untouched, but the workspace name is used to prefix entries in Algolia to prevent name clashes.
Algolia uses a special property, objectID, to identify index items. The module populates it with the JCR node path (prefixed with the workspace name when useOneIndexPerWorkspace is set to `false`), which makes the data easier to read when debugging.
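As a rough sketch, the objectID rule could look like the following. The text does not say which separator the module places between the workspace name and the node path, so the `":"` used here is purely an assumption, as are the names `ObjectIds` and `build`.

```java
/** Hypothetical helper illustrating how an objectID could be derived; not the module's code. */
final class ObjectIds {
    // Assumption: the real separator is not documented above; ":" is a placeholder.
    static final String SEPARATOR = ":";

    private ObjectIds() {}

    /**
     * Uses the JCR node path as objectID. When all workspaces share one index
     * (useOneIndexPerWorkspace == false), the workspace name is prepended to avoid clashes.
     */
    static String build(String workspace, String jcrPath, boolean useOneIndexPerWorkspace) {
        return useOneIndexPerWorkspace ? jcrPath : workspace + SEPARATOR + jcrPath;
    }
}
```

With a shared index, two nodes with the same path in different workspaces (for example `/about` in website and in dam) would still receive distinct objectIDs.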
Specify the content to be fed
contentSelectionAttribute: "*"
contentAttributesToPush:
  "*":
    - title
    - description
Using this configuration approach, all content published in the specified workspace is pushed to Algolia.
Try to define properties that all content has in common. If a content item lacks a specified property, that property is omitted from the payload of the push.
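The omission rule can be illustrated with a small Java sketch. It assumes node properties are available as a plain map; `PayloadBuilder` and `build` are hypothetical names, and the module's real implementation may differ.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper illustrating how a push payload could be assembled; not the module's code. */
final class PayloadBuilder {
    private PayloadBuilder() {}

    /**
     * Copies only the configured attributes into the payload.
     * Attributes the node does not have are skipped rather than sent as empty values.
     */
    static Map<String, Object> build(Map<String, Object> nodeProperties, List<String> attributesToPush) {
        Map<String, Object> payload = new LinkedHashMap<>();
        for (String name : attributesToPush) {
            Object value = nodeProperties.get(name);
            if (value != null) {
                payload.put(name, value);
            }
        }
        return payload;
    }
}
```

For a node that has a title but no description, the resulting payload contains only the title.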
Here you can filter the specific data you wish to push to Algolia. With the example configuration below, only pages of the website workspace that are based on the "mtk2:pages/basic" template are considered for feeding to Algolia:
contentSelectionAttribute: "mgnl:template"
contentAttributesToPush:
  "mtk2:pages/basic":
    - title
    - description
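One possible reading of the selection logic is sketched below in Java: the value of the contentSelectionAttribute property on a node is used as the key into contentAttributesToPush, and `"*"` matches every node. This is an interpretation of the two configurations shown, not the module's documented behavior; `ContentSelector` and `attributesFor` are hypothetical names.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper illustrating content selection; an interpretation, not the module's code. */
final class ContentSelector {
    static final String WILDCARD = "*";

    private ContentSelector() {}

    /**
     * Returns the attributes to push for a node, or an empty Optional
     * when the node does not match the configured selection.
     */
    static Optional<List<String>> attributesFor(String contentSelectionAttribute,
                                                Map<String, List<String>> contentAttributesToPush,
                                                Map<String, Object> nodeProperties) {
        if (WILDCARD.equals(contentSelectionAttribute)) {
            // Wildcard: every node in the workspace is selected.
            return Optional.ofNullable(contentAttributesToPush.get(WILDCARD));
        }
        Object selector = nodeProperties.get(contentSelectionAttribute);
        if (selector == null) {
            return Optional.empty();
        }
        // For example, the node's mgnl:template value selects the attribute list.
        return Optional.ofNullable(contentAttributesToPush.get(selector.toString()));
    }
}
```

Under this reading, a page whose mgnl:template is "mtk2:pages/basic" yields the list title, description, while a page based on any other template yields nothing and is skipped.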
Limitations
Please keep in mind that records can't exceed a certain size limit in Algolia. This limit might depend on your plan. See the Algolia pricing page for more details. If you try to index a record that exceeds the limit, Algolia returns the "Record is too big" error.
See also here for more details on limitations.
The Algolia Search Index feeder module utilizes the algoliasearch-core 3.16.5 library.
Depending on the limits of your current plan, you can restrict the size of rich text properties in Magnolia with a maxlength setting.
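If you want to spot oversized records before they are pushed, a pre-flight check along these lines may help. It is a sketch under two assumptions: that the limit is measured against the serialized JSON of the record, and that you supply the byte limit of your own plan. The text above does not say the module performs such a check, and `RecordSizeCheck` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical pre-flight check; the byte limit must come from your own Algolia plan. */
final class RecordSizeCheck {
    private RecordSizeCheck() {}

    /** Returns true when the UTF-8 encoded JSON of a record is at most maxBytes long. */
    static boolean fitsLimit(String recordJson, int maxBytes) {
        return recordJson.getBytes(StandardCharsets.UTF_8).length <= maxBytes;
    }
}
```

Measuring bytes rather than characters matters for content with non-ASCII text, where a single character can take several bytes.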