How to Block a Specific User Agent in Apache
Web scrapers, malicious bots, and aggressive AI crawlers can drain
your server’s bandwidth and compromise your website’s security. This
article provides a straightforward guide on how to block specific user
agents and bots in Apache using both the .htaccess file and
the main server configuration. You will learn how to leverage the
mod_rewrite module to intercept unwanted traffic and return
a 403 Forbidden status code, protecting your server
resources from unauthorized automated scans.
Prerequisites: Enabling mod_rewrite
Before you can block user agents, make sure that Apache’s
mod_rewrite module is loaded. Most modern web hosting environments have
it active by default. If you manage your own Debian or Ubuntu server,
you can enable it from the terminal by running sudo a2enmod rewrite
and then restarting Apache.
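To confirm that the module is actually loaded, one option is to list Apache's loaded modules and look for rewrite_module. The sketch below assumes a Debian or Ubuntu system where the control script is named apache2ctl (elsewhere it is usually apachectl or httpd). The helper name has_rewrite_module is made up for illustration.

```shell
# Hypothetical helper: reads a module listing on stdin and succeeds
# only if mod_rewrite appears in it.
has_rewrite_module() {
    grep -q 'rewrite_module'
}

# Usage (assumes apache2ctl is on PATH; use apachectl or httpd elsewhere):
#   apache2ctl -M 2>/dev/null | has_rewrite_module && echo "mod_rewrite is loaded"
```

If the usage line prints nothing, the module is probably not enabled yet.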
Method 1: Blocking Bots Using the .htaccess File
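The most common and flexible way to block a bot is by adding rules to
your website’s root .htaccess file. Apache reads .htaccess files on
every request, so changes take effect immediately without a restart or
reload. This only works if the server’s AllowOverride setting permits
rewrite directives in .htaccess (FileInfo or All).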
To block a single bot (for example, a fictional malicious crawler
named BadBot), add the following lines to your
.htaccess file:
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
RewriteRule .* - [F,L]
How this code works:
RewriteEngine On: Enables the rewrite engine for this directory.
RewriteCond %{HTTP_USER_AGENT}: Tells Apache to inspect the incoming User-Agent string sent by the browser or bot.
BadBot: The specific text pattern you are looking for. It is treated as a regular expression and matches anywhere in the header.
[NC]: Stands for “no case,” meaning the check is case-insensitive (it will catch badbot, BadBot, and BADBOT).
RewriteRule .* - [F,L]: Instructs Apache to match any requested URL (.*), leave the URL unchanged (-), and immediately return a 403 Forbidden error ([F]). The [L] flag stops further rules from processing; [F] already implies it, so it is harmless but redundant here.
Method 2: Blocking Multiple User Agents Simultaneously
If you want to block several problematic bots at once, you can chain
them together using the | (OR) operator inside a single
regular expression.
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (BadBot|SpamCrawler|EvilScraper) [NC]
RewriteRule .* - [F,L]
In this scenario, if an incoming request contains “BadBot”, “SpamCrawler”, or “EvilScraper” anywhere within its User-Agent header, Apache will deny access.
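Before deploying a pattern like this, it can help to check which User-Agent strings it would catch. The sketch below approximates the RewriteCond with grep -Ei. Apache uses PCRE while grep -E uses POSIX extended regular expressions, but for a plain alternation like this one the two should agree. The function name ua_is_blocked and the sample User-Agent strings are illustrative assumptions.

```shell
# Same alternation as the RewriteCond above.
PATTERN='(BadBot|SpamCrawler|EvilScraper)'

# Hypothetical helper: succeeds if the given User-Agent would match.
# -E enables alternation, -i mirrors [NC], -q suppresses output.
ua_is_blocked() {
    printf '%s\n' "$1" | grep -Eiq -- "$PATTERN"
}

# Try a few sample strings (made up for this example).
for ua in \
    'Mozilla/5.0 (compatible; badbot/2.1)' \
    'SpamCrawler' \
    'Mozilla/5.0 (X11; Linux x86_64) Firefox/128.0'
do
    if ua_is_blocked "$ua"; then
        echo "403  $ua"
    else
        echo "ok   $ua"
    fi
done
```

The first two samples are flagged and the Firefox string is not. Because the match is an unanchored substring test, it is worth running a few legitimate User-Agent strings through the same check to make sure a short pattern does not catch them by accident.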
Method 3: Blocking Bots Globally via Apache Configuration
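If you have root access to the server, you can place the rules in the
Apache configuration itself instead of .htaccess, either in the main
server configuration file (such as httpd.conf or apache2.conf) or
within a specific VirtualHost block. Note that virtual hosts do not
inherit rewrite rules from the main server context by default, so to
block a bot across all websites hosted on that machine, repeat the
rules in each VirtualHost block or add RewriteOptions Inherit to each
one.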
<VirtualHost *:80>
ServerName yourwebsite.com
DocumentRoot /var/www/html
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} BadBot [NC]
RewriteRule .* - [F,L]
</VirtualHost>
Note: Whenever you make changes directly to the main Apache configuration files or VirtualHost blocks, you must restart or reload the Apache service for the changes to take effect (e.g., sudo systemctl reload apache2).
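Because a syntax error in the configuration can cause the reload or a later restart to fail, a common precaution is to validate the files first. The sketch below assumes a systemd-based Debian or Ubuntu host where the service is named apache2 (as in the reload command above) and the control script is apache2ctl. The function name safe_reload is hypothetical.

```shell
# Hypothetical helper: reload Apache only if the configuration parses.
safe_reload() {
    if sudo apache2ctl configtest; then
        sudo systemctl reload apache2
    else
        echo "Configuration test failed; Apache was not reloaded." >&2
        return 1
    fi
}

# Usage:
#   safe_reload
```

On other distributions the equivalent commands are typically apachectl configtest and a reload of the httpd service.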
Testing Your Blocked User Agent Configuration
After saving your configuration, you can test if the rule works by
using the curl command-line tool to spoof your user agent.
Run the following command from your terminal:
curl -I -A "BadBot" http://yourwebsite.com
If your Apache rules are configured correctly, the server will respond with a 403 Forbidden status code, confirming that the bot has been successfully blocked from accessing your content.
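To make the check repeatable, the status code can be captured on its own and compared with the result for an ordinary browser User-Agent. The sketch below relies on curl's -w '%{http_code}' output. The function name ua_status, the placeholder URL, and the sample browser string are assumptions for illustration.

```shell
# Hypothetical helper: prints the HTTP status code returned for a
# HEAD request sent with the given User-Agent.
#   $1 = URL, $2 = User-Agent string
ua_status() {
    curl -s -o /dev/null -I -A "$2" -w '%{http_code}' "$1"
}

# Usage (replace the URL with your own site):
#   ua_status http://yourwebsite.com "BadBot"        # should print 403 if the rule is active
#   ua_status http://yourwebsite.com "Mozilla/5.0"   # should print the site's normal status
```

Checking both cases confirms that the rule blocks the targeted bot without also rejecting regular visitors.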