Bot Protection module

Edition: Incubator (services)
Git: Git
Latest: 1.0.1

Compatible with Magnolia 6.2+.

The Bot Protection module helps prevent malicious bot attacks on your server. The module allows you to configure various protection mechanisms. This document provides an overview of how to configure and use the module effectively.

This module is at the INCUBATOR level.

Installing with Maven

Maven is the easiest way to install the module. Add the following to your bundle:

<dependency>
  <groupId>info.magnolia.botprotection</groupId>
  <artifactId>bot-protection</artifactId>
  <version>1.0.1</version>
</dependency>

Configuration

To configure the Bot Protection module, adjust the following settings, which are held in the BotProtectionModule class. To activate the filter, set the enabled property of the /server/filters/botProtection node to true.

Item Description

PATH PROTECTION

enabledPathRegexes

Enable or disable protection by path regex.

Default is false.

enabledPaths

Enable or disable protection by specific paths.

Default is false.

IP PROTECTION

enabledIps

Enable or disable protection by specific IP addresses.

Default is false.

enabledIpRegexes

Enable or disable protection by IP address regex patterns.

Default is false.

HEADER PROTECTION

enabledHeaders

Enable or disable protection based on HTTP headers.

Default is false.

REQUEST PARAMETER PROTECTION

enabledRequestParams

Enable or disable protection based on request parameters.

Default is false.

RATE LIMITING

enabledRateLimit

Enable or disable rate limiting for incoming requests.

Default is false.

cacheExpiredSeconds

Cache expiration time in seconds.

Default: 60 seconds (1 minute).

maxRequestPerTimeFrame

Maximum number of requests allowed per timeframe.

Default: 100.

timeFrameInSeconds

Timeframe in seconds.

Default: 60 seconds (1 minute).

maximumBuckets

Maximum number of rate limit buckets.

Default: 1000 buckets.

rateLimitByPath

Define rate limit conditions for specific paths.

'rateLimitByPath':
  '0':
    'maxRequest': 100
    'path': '/example'

ignoreRateLimit

When enabled, rate limits are not enforced.

Requests are still counted per bucket, but the module does not return a 429 response.
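
The interaction between maxRequestPerTimeFrame, timeFrameInSeconds, and ignoreRateLimit can be illustrated with a small fixed-window counter. This is only a sketch under stated assumptions: the class name FixedWindowRateLimiterSketch, the per-client key, and the fixed-window strategy are hypothetical and are not taken from the module's source. The module may count and evict buckets differently (for example with respect to maximumBuckets and cacheExpiredSeconds, which are not modeled here).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a fixed-window counter, not the module's real code.
final class FixedWindowRateLimiterSketch {
    private final int maxRequestPerTimeFrame;
    private final long timeFrameMillis;
    private final boolean ignoreRateLimit;
    // One bucket per client key: [0] = window start (ms), [1] = request count.
    private final Map<String, long[]> buckets = new HashMap<>();

    FixedWindowRateLimiterSketch(int maxRequestPerTimeFrame,
                                 int timeFrameInSeconds,
                                 boolean ignoreRateLimit) {
        this.maxRequestPerTimeFrame = maxRequestPerTimeFrame;
        this.timeFrameMillis = timeFrameInSeconds * 1000L;
        this.ignoreRateLimit = ignoreRateLimit;
    }

    /** Counts the request and returns the status the sketch would send: 200 or 429. */
    synchronized int check(String clientKey, long nowMillis) {
        long[] bucket = buckets.get(clientKey);
        // Start a new window when none exists or the current one has elapsed.
        if (bucket == null || nowMillis - bucket[0] >= timeFrameMillis) {
            bucket = new long[] {nowMillis, 0L};
            buckets.put(clientKey, bucket);
        }
        bucket[1]++; // counting happens even when ignoreRateLimit is true
        boolean overLimit = bucket[1] > maxRequestPerTimeFrame;
        return (overLimit && !ignoreRateLimit) ? 429 : 200;
    }

    /** Number of requests counted in the current window for this key. */
    synchronized long count(String clientKey) {
        long[] bucket = buckets.get(clientKey);
        return bucket == null ? 0L : bucket[1];
    }
}
```

With a limit of 2 requests per 60 seconds, the third request inside the window gets a 429 in this sketch, and the counter starts over once the window has elapsed. With ignoreRateLimit set to true, the same requests are still counted but always answered with 200.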

CUSTOM CONFIGURATION

You can customize protection rules by specifying which paths, IP addresses, headers, and request parameters to block. Use the following maps to define your custom rules.

blockedPathRegexes

Define regex patterns for blocking specific paths.

Default values
'blockedPathRegexes':
  '1': '.*\/wp-(.+)\/*'
  '10': '.*\/autodiscover(.+)\/*'
  '12': '.*\/nmaplowercheck(.+)\/*'
  '14': '.*\/xmlrpc*'
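
To get a feel for what the default path patterns cover, they can be tried out with java.util.regex. The following is a minimal sketch, assuming the module applies each pattern to the whole request path (full-match semantics, as with Matcher.matches()); that is an assumption, and the class name BlockedPathSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for experimenting with the default patterns.
final class BlockedPathSketch {
    // Patterns copied from the default blockedPathRegexes values.
    static final List<Pattern> BLOCKED_PATH_REGEXES = Arrays.asList(
            Pattern.compile(".*\\/wp-(.+)\\/*"),
            Pattern.compile(".*\\/autodiscover(.+)\\/*"),
            Pattern.compile(".*\\/nmaplowercheck(.+)\\/*"),
            Pattern.compile(".*\\/xmlrpc*"));

    /** Returns true if any pattern matches the entire path (assumed semantics). */
    static boolean isBlocked(String path) {
        for (Pattern pattern : BLOCKED_PATH_REGEXES) {
            if (pattern.matcher(path).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

Under this assumption, /wp-login.php, /autodiscover/autodiscover.xml, and /xmlrpc are blocked, while /about passes. Note that /xmlrpc.php is not matched by '.*\/xmlrpc*' under full-match semantics, because the trailing c* only repeats the letter c. If the module searches for a match anywhere in the path instead, that request would be caught.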

blockedPaths

Define specific paths to block.

blockedIpRegexes

Define regex patterns for blocking specific IP addresses.

blockedIps

Define specific IP addresses to block.

blockedHeaders

Define HTTP headers to block.

Default values
'blockedHeaders':
  'Referer': '(?i)(.*getNewsListCrawlData.*)|(.*192\.168.*)'
  'User-Agent': '(?i)(.*mj12bot.*)|(.*semrush.*)|(.*Sistrix.*)|(.*SEOkicks.*)|(.*jobs\.de-Robot.*)|(.*AhrefsBot.*)|(.*UnisterBot.*)|(.*DotBot.*)|(.*SearchmetricsBot.*)|(.*SurveyBot.*)|(.*SEOdiver.*)|(.*spbot.*)|(.*wotbox.*)|(.*meanpathbot.*)|(.*BacklinkCrawler.*)|(.*magpie-crawler.*)|(.*oBot.*)|(.*fr-crawler.*)|(.*BLEXBot.*)|(.*MegaIndex.*)|(.*CloudServerMarketSpider.*)|(.*trendictionbot.*)|(.*Exabot.*)|(.*careerbot.*)|(.*Lipperhey-Kaus-Australis.*)|(.*seoscanners.*)|(.*MetaJobBot.*)|(.*Spiderbot.*)|(.*LinkStats.*)|(.*JobboerseBot.*)|(.*ICCrawler.*)|(.*Plista.*)|(.*Domain
    Re-Animator Bot.*)|(.*turnitinbot.*)|(.*coccoc.*)|(.*um-IC.*)|(.*mindUpBot.*)|(.*sg-Orbiter.*)|(.*CCBot.*)|(.*Qwantify.*)|(.*Kraken.*)|(.*plukkie.*)|(.*SafeDNSBot.*)|(.*360Spider.*)|(.*HaosouSpider.*)|(.*rogerbot.*)|(.*OpenHoseBot.*)|(.*Screaming
    Frog SEO Spider.*)|(.*ThumbSniper.*)|(.*R6_CommentReader.*)|(.*ImplisenseBot.*)|(.*Cliqzbot.*)|(.*aiHitBot.*)|(.*trendictionbot.*)|(.*adscanner.*)|(.*WBSearchBot.*)|(.*Python\/3.5
    aiohttp.*)|(.*Toweya\.com.*)|(.*netEstate.*)|(.*BUbiNG.*)|(.*Linguee.*)|(.*sentibot.*)|(.*VelenPublicWebCrawler.*)|(.*DomainCrawler.*)|(.*rogerbot.*)|(.*IndeedBot.*)|(.*GarlikCrawler.*)|(.*Gosign-Security-Crawler.*)|(.*Siteliner.*)|(.*SabsimBot.*)|(.*ltx71.*)|(.*PetalBot.*)|(.*AspiegelBot.*)|(.*MauiBot.*)|(.*Sogou.*)|(.*barkrowler.*)|(.*Bytespider.*)|(.*Linespider.*)|(.*Baiduspider.*)|(.*TMMBot.*)|(.*validator\.nu.*)|(.*SeznamBot.*)|(.*seekport.*)|(.*BLP_bbot.*)|(.*alipesnews.*)|(.*Turnitin.*)|(.*semanticscholar.*)|(.*diffbot.*)|(.*mediatoolkit.*)|(.*startmebot.*)|(.*yandexbot.*)|(.*siteauditbot.*)|(.*magnus.*)'
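
The header rules map a header name to a regular expression that is tested against the header's value. A minimal sketch of that check follows. It assumes full-match semantics and a case-insensitive header lookup, uses a shortened excerpt of the default User-Agent pattern for readability, and the class name BlockedHeaderSketch is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper: checks request headers against blockedHeaders-style rules.
final class BlockedHeaderSketch {
    // Header name -> pattern. The User-Agent pattern is an excerpt of the default.
    static final Map<String, Pattern> BLOCKED_HEADERS = new LinkedHashMap<>();
    static {
        BLOCKED_HEADERS.put("Referer",
                Pattern.compile("(?i)(.*getNewsListCrawlData.*)|(.*192\\.168.*)"));
        BLOCKED_HEADERS.put("User-Agent",
                Pattern.compile("(?i)(.*mj12bot.*)|(.*semrush.*)|(.*AhrefsBot.*)"));
    }

    /**
     * requestHeaders should be keyed case-insensitively, for example a
     * TreeMap created with String.CASE_INSENSITIVE_ORDER.
     */
    static boolean isBlocked(Map<String, String> requestHeaders) {
        for (Map.Entry<String, Pattern> rule : BLOCKED_HEADERS.entrySet()) {
            String value = requestHeaders.get(rule.getKey());
            if (value != null && rule.getValue().matcher(value).matches()) {
                return true;
            }
        }
        return false;
    }
}
```

Because each pattern starts with (?i), matching is case-insensitive: a User-Agent containing ahrefsbot is treated the same as one containing AhrefsBot.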

blockedRequestParams

Define request parameters to block.

Default values
'blockedRequestParams':
  '1': '(?i)(''(''''|[^''])*'')|(\b(ALTER|CREATE|DELETE|DROP|EXEC(UTE){0,1}|INSERT(
    +INTO){0,1}|MERGE|SELECT|UPDATE|UNION( +ALL){0,1})\b)'
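
The default value above is written as a YAML single-quoted scalar, so each doubled single quote ('') stands for one literal quote and the line break is folded into a single space. The sketch below shows the resulting regular expression as a Java string and tries it on a few values. It assumes the module searches for the pattern anywhere in the parameter value (Matcher.find()), since the pattern has no leading or trailing '.*'; that is an assumption, and the class name BlockedRequestParamSketch is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical helper for trying out the default blockedRequestParams pattern.
final class BlockedRequestParamSketch {
    // The default pattern after YAML unescaping: a quoted SQL literal,
    // or one of the listed SQL keywords as a whole word.
    static final Pattern BLOCKED_PARAM = Pattern.compile(
            "(?i)('(''|[^'])*')"
            + "|(\\b(ALTER|CREATE|DELETE|DROP|EXEC(UTE){0,1}|INSERT( +INTO){0,1}"
            + "|MERGE|SELECT|UPDATE|UNION( +ALL){0,1})\\b)");

    /** Returns true if the pattern occurs anywhere in the value (assumed semantics). */
    static boolean isBlocked(String parameterValue) {
        return BLOCKED_PARAM.matcher(parameterValue).find();
    }
}
```

In this sketch a value such as 1 UNION SELECT password FROM users is blocked, as is any value containing a quoted literal like ' OR '1'='1. The word boundaries (\b) keep ordinary words such as selection from matching, and a single apostrophe as in O'Brien does not match because the quoted-literal branch needs an opening and a closing quote.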

trueIpHeaders

Define the list of headers used to determine the true client IP address.

'trueIpHeaders':
  '1': 'True-Client-IP'
  '2': 'cf-connecting-ip'
  '3': 'X-Forwarded-For'
  '4': 'Proxy-Client-IP'
  '5': 'WL-Proxy-Client-IP'
  '6': 'HTTP_CLIENT_IP'
  '7': 'HTTP_X_FORWARDED_FOR'
  '8': 'Fastly-Client-IP'
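
One plausible way to use such a list is to walk the headers in their configured order and take the first one that carries a value. The sketch below illustrates this. The priority order, the handling of comma-separated values (taking the first entry, as is conventional for X-Forwarded-For), and the fallback to the connection's remote address are assumptions, and the class name TrueClientIpSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper: resolves the client IP from trueIpHeaders-style config.
final class TrueClientIpSketch {
    // Header names in the order of the default trueIpHeaders values.
    static final List<String> TRUE_IP_HEADERS = Arrays.asList(
            "True-Client-IP", "cf-connecting-ip", "X-Forwarded-For", "Proxy-Client-IP",
            "WL-Proxy-Client-IP", "HTTP_CLIENT_IP", "HTTP_X_FORWARDED_FOR", "Fastly-Client-IP");

    /**
     * requestHeaders should be keyed case-insensitively. Returns the first
     * usable header value, or remoteAddr if no configured header is present.
     */
    static String resolve(Map<String, String> requestHeaders, String remoteAddr) {
        for (String name : TRUE_IP_HEADERS) {
            String value = requestHeaders.get(name);
            if (value == null) {
                continue;
            }
            // A proxy chain may be listed as "client, proxy1, proxy2".
            int comma = value.indexOf(',');
            String first = (comma >= 0 ? value.substring(0, comma) : value).trim();
            if (!first.isEmpty()) {
                return first;
            }
        }
        return remoteAddr;
    }
}
```

Keep in mind that clients can set these headers themselves, so the list should only contain headers that your proxy or CDN overwrites on every request.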

Usage

BotProtectionFilter

The BotProtectionFilter is an important part of this module and should be added to the filter chain. It is responsible for applying the configured bot protection rules to incoming requests. To use this filter, follow these steps:

  1. Ensure that BotProtectionModule is properly configured with the desired protection features and rules.

  2. Add the BotProtectionFilter to the filter chain through the resource file /mgnl-config/bot-protection/config/config.server.filters.botProtection.xml.

    The filter is active only when its enabled field is set to true.

  3. Ensure that this node is placed immediately after the /server/filters/uriSecurity node. This ordering is set up by the BotProtectionVersionHandler class.

  4. Import the BotProtectionModule class and configure it according to your requirements.

    <dependency>
      <groupId>info.magnolia.botprotection</groupId>
      <artifactId>bot-protection</artifactId>
      <version>1.0.1</version>
    </dependency>

  5. Enable or disable protection mechanisms as needed.

  6. Define custom rules to block specific paths, IPs, headers, and request parameters.

  7. Configure rate limiting using the RateLimitConfig class and add rate limit conditions for specific paths.

  8. The module now applies the configured protection rules to incoming requests, as long as the filter's enabled field mentioned above is set to true.

Run quality check with SonarQube

  1. Start the SonarQube instance:

    docker-compose -f sonarqube/sonar.yml up -d

  2. Run the Maven command to generate the report:

    mvn clean install sonar:sonar -Dsonar.host.url=http://localhost:9001

Changelog

Version  Notes

1.0.1    Made trueIpHeaders configurable.

1.0      Initial release of the module.
