How to Download Specific File Types with Wget

The wget command-line tool is a powerful utility for downloading files from the web, but downloading an entire site when you only need specific file types can waste time and bandwidth. By using the accept (-A) and reject (-R) flags, you can fine-tune your download requests to target only specific extensions like PDFs, JPEGs, or MP3s. This article provides a quick guide on how to filter your wget downloads by file extension, whether you are grabbing a single file type or mirroring a directory structure.

Using the Accept Flag for Specific Extensions

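The most direct way to limit your downloads to specific file types is by using the -A (or --accept) option. This tells wget to keep only files whose names end with the extensions you define. During a recursive crawl, wget still fetches HTML pages so it can find links, then deletes those pages afterward if they do not match.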

To download only PDF files from a specific directory, use the following syntax:

wget -r -A pdf http://example.com/directory/

If you need to target multiple file extensions at once, such as both PDFs and JPEG images, you can separate the extensions with a comma:

wget -r -A pdf,jpg,jpeg http://example.com/directory/
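The list must be a single comma-separated argument with no spaces, because the shell would split it into separate arguments otherwise. As a minimal sketch, the extensions can be joined in the shell before being handed to wget. The helper name join_exts is hypothetical, and the command is only printed here so the list can be checked before anything is downloaded.

```shell
# join_exts is a hypothetical helper name; it joins its arguments with commas.
join_exts() {
  old_ifs=$IFS
  IFS=,
  joined="$*"      # "$*" joins the arguments using the first character of IFS
  IFS=$old_ifs
  printf '%s\n' "$joined"
}

accept_list=$(join_exts pdf jpg jpeg)

# Print the command instead of running it, so the list can be checked first.
echo "wget -r -A $accept_list http://example.com/directory/"
```

Once the printed command looks right, it can be run as shown.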

Key Flags to Use with Extension Filtering

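When filtering by extension, you will usually need to combine the accept flag with other options to make the command work efficiently:

-r (--recursive): follows links from the starting page so wget can discover the files to filter.
-l (--level): limits how many levels of links wget follows; the default depth is 5.
-np (--no-parent): keeps wget from ascending into the parent directory.
-nd (--no-directories): saves every file into one folder instead of recreating the site's directory tree.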
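One possible combination is sketched below as a small shell function. The function name fetch_by_ext, the depth of 2, and the example URL are illustrative assumptions rather than required values; the wget options themselves are standard.

```shell
# fetch_by_ext URL EXTENSIONS  (hypothetical helper name)
fetch_by_ext() {
  url=$1
  exts=$2
  # -r    follow links recursively
  # -l 2  stop after two levels of links (assumed depth; adjust as needed)
  # -np   never climb above the starting directory
  # -A    keep only files whose names end in one of the listed extensions
  wget -r -l 2 -np -A "$exts" "$url"
}

# Example call, left commented out so pasting the block downloads nothing:
# fetch_by_ext http://example.com/directory/ pdf,jpg
```

Wrapping the options in a function keeps the flag set in one place, so only the URL and the extension list change between runs.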

Excluding Specific Extensions

Conversely, if you want to download everything except a certain file type, you can use the -R (or --reject) flag. This is useful if a site contains massive video or zip files that you want to skip.

wget -r -R mp4,zip http://example.com/directory/
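To get a feel for which names a reject list like this would drop, the suffix check can be imitated in plain shell. This is a simplified illustration, not wget's actual code: it only compares name endings and ignores wildcard patterns. The function name would_skip and the sample file names are made up for the example.

```shell
# Simplified illustration only: wget's real matching also handles wildcard
# patterns and other cases that this sketch ignores.
# would_skip NAME LIST succeeds when NAME ends with one of the
# comma-separated suffixes in LIST, roughly as a -R list would treat it.
would_skip() {
  name=$1
  old_ifs=$IFS
  IFS=,
  set -- $2        # split the list on commas into positional parameters
  IFS=$old_ifs
  for suffix in "$@"; do
    case $name in
      *"$suffix") return 0 ;;
    esac
  done
  return 1
}

# Hypothetical file names, checked against the list used above.
for f in intro.mp4 archive.zip notes.txt; do
  if would_skip "$f" mp4,zip; then
    echo "skip $f"
  else
    echo "keep $f"
  fi
done
```

With these sample names, the two large files are reported as skipped and notes.txt as kept.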

A Practical Example

If you want to download all images (PNGs and JPEGs) from a gallery and save them directly into your current folder, add the -nd (--no-directories) flag so that wget does not create nested subfolders:

wget -r -nd -A png,jpg,jpeg http://example.com/gallery/
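After a run like this, a quick count per extension can confirm that the filter did what was expected. The sketch below assumes the files landed in the current folder. The helper name count_by_ext and its output format are hypothetical, and it relies on the -maxdepth option, which GNU and BSD find provide but POSIX does not require.

```shell
# count_by_ext DIR EXT... prints "EXT COUNT" for each extension.
# The function name and output format are illustrative choices.
count_by_ext() {
  dir=$1
  shift
  for ext in "$@"; do
    # -maxdepth 1 checks DIR itself only, matching the flat layout -nd produces.
    # -name is case-sensitive, so files ending in .PNG are not counted here.
    n=$(find "$dir" -maxdepth 1 -type f -name "*.$ext" | wc -l | tr -d ' ')
    echo "$ext $n"
  done
}

count_by_ext . png jpg jpeg
```

A count of zero for an extension you expected suggests checking the accept list or the recursion depth.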