How to Reject File Extensions in Wget?
When downloading websites or bulk files using the wget
command-line utility, you often want to avoid downloading specific types
of files, such as heavy videos, images, or unwanted scripts. To achieve
this, wget provides a dedicated reject
flag that allows you to specify exactly which file extensions
to skip during the download process. Managing your downloads this way
saves both bandwidth and local storage space.
The Reject Flag: -R or --reject
To exclude certain file extensions, you use the
-R (short form) or
--reject (long form) flag. This flag tells
wget to check the file extension of every resource it
encounters and ignore any that match your specified list.
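To make the matching rule concrete, the sketch below approximates in plain shell how an entry in the reject list is compared against a file name. It follows the behaviour described in the wget manual: an entry containing a wildcard character is treated as a pattern over the whole name, and any other entry is treated as a suffix. The function name is_rejected is hypothetical, and this is an illustration under those assumptions, not wget's own code.

```shell
#!/bin/sh
# Hypothetical helper, not part of wget: approximates reject-list matching.
# Usage: is_rejected FILENAME REJECT_LIST
# Exit status 0 means the name would be rejected, 1 means it would be kept.
is_rejected() (
    # The function body runs in a subshell, so IFS and "set -f" do not leak.
    name=$1
    IFS=,          # split the reject list on commas
    set -f         # do not expand wildcards while splitting the list
    for entry in $2; do
        case $entry in
            "")
                # Ignore empty entries such as the one produced by "pdf,,mp3".
                continue ;;
            *\**|*\?*|*\[*|*\]*)
                # Entry contains a wildcard: match it against the whole name.
                case $name in $entry) exit 0 ;; esac ;;
            *)
                # Plain entry: match it as a suffix of the name.
                case $name in *"$entry") exit 0 ;; esac ;;
        esac
    done
    exit 1
)

# Example: report which of these names a list of "pdf,*back*" would reject.
for f in report.pdf index.html site-backup.zip; do
    if is_rejected "$f" 'pdf,*back*'; then
        echo "reject: $f"
    else
        echo "keep:   $f"
    fi
done
```

Because a plain entry is compared as a suffix, an entry such as pdf matches any name that ends in those three letters, with or without a preceding dot.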
Basic Syntax
wget -R ext1,ext2,ext3 URL
Common Usage Examples
The -R flag filters the files wget finds while following links, so the examples below pair it with the recursive flag (-r).
- Rejecting a single file type: To download a directory but skip all PDF files, run:
wget -r -R pdf http://example.com/files/
- Rejecting multiple file types: Exclude several extensions by separating them with commas (and no spaces):
wget -r -R mp3,mp4,wav http://example.com/media/
- Using wildcards: The reject flag also accepts wildcard patterns, which should be quoted so the shell does not expand them. For instance, to reject every file whose name contains "back", wrap the string in asterisks:
wget -r -R "*back*" http://example.com/downloads/
How it Works with Mirroring
The reject flag takes effect while wget is following links, so it is
normally combined with the recursive download flag (-r) or the mirroring
flag (-m). wget still fetches HTML pages and parses them to find new
links, even when their names match the reject list, and deletes those
pages once it has read them. Any other file whose name matches the list
given to -R is skipped and never saved to your hard drive.
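As a rough illustration of combining mirroring with a reject list, the following sketch assembles the command from a list of extensions and prints it instead of running it. The function name build_mirror_command is hypothetical, and the -np (--no-parent) flag is an addition made here on the assumption that the mirror should stay below the starting directory.

```shell
#!/bin/sh
# Hypothetical wrapper: joins extensions into a reject list and prints the
# resulting wget command. It only prints the command; nothing is downloaded.
# Usage: build_mirror_command URL EXT [EXT...]
build_mirror_command() {
    url=$1
    shift
    # Join the remaining arguments with commas. IFS is changed only inside
    # the command substitution, so the caller's IFS is untouched.
    reject_list=$(IFS=,; printf '%s' "$*")
    # Single quotes in the output keep wildcard entries from being expanded
    # if the printed command is later pasted into a shell.
    printf "wget -m -np -R '%s' '%s'\n" "$reject_list" "$url"
}

build_mirror_command http://example.com/media/ mp3 mp4 wav
```

Running the script prints wget -m -np -R 'mp3,mp4,wav' 'http://example.com/media/', which you can review before executing it yourself.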