HtmlCrossReferenceAuditor
Added in v5.6.6.
The HtmlCrossReferenceAuditor compares HTML elements within the same page. For example, you can use it to compare the meta keywords with the page title or the meta description.
The auditor uses jsoup queries to parse and find HTML elements. These queries have a jQuery- or CSS-like syntax.
HtmlCrossReferenceAuditor uses two jsoup queries:
- A source query to retrieve some text
- A reference query to retrieve one or more HTML elements to be checked against the source text
Each query is paired with a regular expression. The source pattern extracts the source text from the result of the source query. The reference pattern checks the elements matched by the reference query; if they match the reference pattern, the audit passes.
Here’s a quick example:
source query (sourceQuery): meta[name="keywords"]
A jsoup query to find the meta keywords element in a page.
source pattern (sourcePattern): .+content="([^,]+),.+
A regular expression to search the meta keywords element found by the source query and match the first keyword as match group 1.
The source query and the source pattern will get the first keyword defined in the meta keywords element.
reference query (referenceQuery): meta[name="description"]
A jsoup query to find the meta description element in a page.
reference pattern (referencePattern):
A regular expression to check that the keyword found by the source query and pattern appears in the content attribute of the meta description.
Suppose the meta keywords element of the page is:
<meta name="keywords" content="beach,resort,island" />
The result of the source query and pattern is "beach". This result is substituted into the reference pattern, which is then applied to the meta description content. If "beach" is found, the audit passes; if not, the audit fails.
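The source step of this example can be sketched with plain java.util.regex. This is only an illustration, not the auditor's actual implementation: the class and method names are hypothetical, the meta element is passed in as a string instead of being located with jsoup, and the sketch assumes the pattern must match the whole element.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; names are hypothetical, not part of the SEO module. */
final class SourceTextSketch {

    // sourcePattern from the example above; match group 1 captures the first keyword.
    static final Pattern SOURCE_PATTERN = Pattern.compile(".+content=\"([^,]+),.+");

    /** Returns the requested match group if the whole element matches the pattern. */
    static Optional<String> extractSourceText(String elementHtml, int group) {
        Matcher matcher = SOURCE_PATTERN.matcher(elementHtml);
        // Assumption: a full match is required (matches() rather than find()).
        return matcher.matches() ? Optional.of(matcher.group(group)) : Optional.empty();
    }

    public static void main(String[] args) {
        String metaKeywords = "<meta name=\"keywords\" content=\"beach,resort,island\" />";
        // Prints: beach
        System.out.println(extractSourceText(metaKeywords, 1).orElse("(no match)"));
    }
}
```

Note that this source pattern requires a comma after the first keyword, so an element with a single keyword and no comma yields no source text in this sketch.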
Class: info.magnolia.services.seo.audit.impl.HtmlCrossReferenceAuditor
Properties
In addition to the common auditor properties, this auditor can be configured with the following properties:
Property | Description
---|---
level | Required. Determines how a failed audit is counted, for example auditWarnings.
auditProperty | Required. Defines the property name for storing failed audit results.
auditValue | Required. Defines a message or explanation for a failed audit. The message can contain placeholders that are replaced with information about the node and auditor.
sourceQuery | Required. A valid jsoup query. See the jsoup cookbook for more on jsoup queries.
sourceText | Optional. Controls whether the source pattern is applied to the HTML element found by sourceQuery (when set to false) or to the text of the HTML element (when set to true).
sourcePattern | Required. A valid Java regular expression.
sourceFlags | Optional. Match flags for the source pattern, as defined by java.util.regex.Pattern. The value must be a bit mask that may include Pattern.CASE_INSENSITIVE, Pattern.MULTILINE, Pattern.DOTALL, Pattern.UNICODE_CASE, Pattern.CANON_EQ, Pattern.UNIX_LINES, Pattern.LITERAL, Pattern.UNICODE_CHARACTER_CLASS and Pattern.COMMENTS.
sourceGroup | Optional. The index of the match group to use as source text.
referenceQuery | Required. A valid jsoup query. See the jsoup cookbook for more on jsoup queries. The result of the reference query is checked against the source text.
referenceText | Optional. Controls whether the reference pattern is applied to the HTML element found by the reference query or to the text of that element; compare sourceText.
referencePattern | Required. A valid Java regular expression. The source text is substituted into this pattern before it is applied to the result of the reference query.
referenceFlags | Optional. Match flags for the reference pattern. The value must be a bit mask that may include Pattern.CASE_INSENSITIVE, Pattern.MULTILINE, Pattern.DOTALL, Pattern.UNICODE_CASE, Pattern.CANON_EQ, Pattern.UNIX_LINES, Pattern.LITERAL, Pattern.UNICODE_CHARACTER_CLASS and Pattern.COMMENTS.
fetcher | Required. Defines the content fetcher for the selected node. The query is then applied to the fetched content. There are two types of content fetchers available.
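Because sourceFlags and referenceFlags take a numeric bit mask, it can help to compute the number from the java.util.regex.Pattern constants. The following is a minimal sketch; the class name and the mask helper are hypothetical and only illustrate how the integer is formed.

```java
import java.util.regex.Pattern;

/** Illustrative sketch only; the helper is hypothetical, not part of the SEO module. */
final class PatternFlagsSketch {

    /** ORs the given java.util.regex.Pattern flag constants into one bit mask. */
    static int mask(int... flags) {
        int mask = 0;
        for (int flag : flags) {
            mask |= flag;
        }
        return mask;
    }

    public static void main(String[] args) {
        // Pattern.CASE_INSENSITIVE alone is 2.
        System.out.println(mask(Pattern.CASE_INSENSITIVE));
        // CASE_INSENSITIVE (2) combined with DOTALL (32) is 34.
        System.out.println(mask(Pattern.CASE_INSENSITIVE, Pattern.DOTALL));
        // A pattern compiled with flag value 2 ignores case: prints true.
        System.out.println(Pattern.compile("BEACH", 2).matcher("beach").matches());
    }
}
```

A configured value of 2 therefore corresponds to a case-insensitive match.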
Example
Here is an example from the SEO module. You can find this configuration here: /modules/seo/config/auditManager/auditors/titleRendered
checkMetadescriptionKeyword:
  auditProperty: checkMetadescriptionKeyword
  auditValue: The main keyword {0} is not used in the page metadescription
  class: info.magnolia.services.seo.audit.impl.HtmlCrossReferenceAuditor
  description: Check if the main keyword is used in the metadescription (pre-prod)
  level: auditWarnings
  referenceFlags: 2
  referencePattern: <meta name="description" content=".'{'0,'}'{0}.'{'0,'}'">
  referenceQuery: meta[name="description"]
  sourceGroup: 1
  sourcePattern: .+content="([^,]+),.+
  sourceQuery: meta[name="keywords"]
  fetcher:
    class: info.magnolia.services.seo.audit.impl.RequestFetcher
    targets:
      localhost:
        class: info.magnolia.services.seo.audit.impl.HostTarget
        host: localhost
        password: superuser
        port: 8080
        scheme: http
        user: superuser
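The quoted braces in referencePattern ('{' and '}') look like java.text.MessageFormat escaping, with {0} as the placeholder for the source text. Assuming that is how the placeholder is expanded (this page does not confirm it), the reference check could be sketched as follows. The class and method names are hypothetical, and the sample meta description is invented for illustration.

```java
import java.text.MessageFormat;
import java.util.regex.Pattern;

/** Illustrative sketch only; assumes MessageFormat-style expansion of {0}. */
final class ReferenceCheckSketch {

    // referencePattern from the configuration above.
    static final String REFERENCE_PATTERN =
            "<meta name=\"description\" content=\".'{'0,'}'{0}.'{'0,'}'\">";

    /** Substitutes the source text for {0}; '{' and '}' become literal braces. */
    static String expand(String sourceText) {
        return MessageFormat.format(REFERENCE_PATTERN, sourceText);
    }

    /** Returns true if the reference element matches the expanded pattern. */
    static boolean passes(String referenceHtml, String sourceText, int flags) {
        return Pattern.compile(expand(sourceText), flags).matcher(referenceHtml).matches();
    }

    public static void main(String[] args) {
        // Hypothetical meta description element, invented for this sketch.
        String description =
                "<meta name=\"description\" content=\"A Beach resort on a quiet island\">";
        // Prints: <meta name="description" content=".{0,}beach.{0,}">
        System.out.println(expand("beach"));
        // referenceFlags: 2 (case-insensitive) lets "beach" match "Beach": prints true.
        System.out.println(passes(description, "beach", 2));
        // Without the flag the check is case-sensitive: prints false.
        System.out.println(passes(description, "beach", 0));
    }
}
```

In this sketch the source text is inserted into the regular expression unescaped, so a keyword containing regex metacharacters would change the meaning of the pattern.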