Can wget filter downloads by file timestamp?

Yes, wget can filter downloads by the modification timestamp of files on a remote server. With the right options, it checks whether a local copy already exists and compares its timestamp with the one on the server. If the server’s file is newer, wget downloads it; otherwise it skips the download, saving time and bandwidth. This is particularly useful for mirroring websites, automating backups, and keeping data in sync.

Understanding Time-Stamping in wget

By default, wget does not compare timestamps. When you enable time-stamping, wget asks the server for the remote file’s modification time, reported in the Last-Modified header, before committing to a full download. Recent versions do this with a conditional If-Modified-Since request, while older versions (or recent ones run with --no-if-modified-since) send a preliminary HEAD request instead.

To enable this behavior, you use the -N (or --timestamping) option.

wget -N https://example.com/file.zip

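When you run this command, wget first checks whether a local copy of file.zip exists. If there is none, it downloads the file. If there is one, it compares the local modification time with the server’s Last-Modified value, downloads the remote file only when it is newer, and otherwise skips it. After a download, wget sets the local file’s modification time to match the server’s, which keeps the next comparison meaningful.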
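The comparison wget makes can be sketched in a few lines of shell. This is only an illustration of the decision, not wget's actual implementation: the function name needs_download is hypothetical, the remote time is assumed to be already converted to seconds since the Unix epoch, and stat -c %Y assumes GNU coreutils.

```shell
# Hypothetical helper, not part of wget: decide whether a download is needed.
#   $1 = local file path
#   $2 = remote modification time as seconds since the Unix epoch
# Returns 0 (true) when the file should be fetched, 1 when it can be skipped.
needs_download() {
    local_file=$1
    remote_epoch=$2
    # No local copy yet: always download.
    [ -e "$local_file" ] || return 0
    # stat -c %Y is the GNU coreutils form; BSD/macOS would need stat -f %m.
    local_epoch=$(stat -c %Y "$local_file") || return 0
    # Download only when the remote copy is strictly newer.
    [ "$remote_epoch" -gt "$local_epoch" ]
}
```

A caller would use it as a condition, for example: if needs_download file.zip 1700000000; then wget https://example.com/file.zip; fi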

Key Options for Time-Based Filtering

Beyond the standard -N flag, wget offers a few variations and complementary settings to fine-tune how it handles file timestamps.

1. The Timestamping Option (-N)

The -N option is the primary tool for this task. It suits recurring downloads or cron jobs where you only want to fetch a file when it has changed.

wget --timestamping https://example.com/data.csv
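For a scheduled job, the command is usually wrapped so that the download directory is fixed and routine runs stay silent. The following is a minimal sketch under stated assumptions: the function name fetch_if_newer, the script path, and the hourly schedule are all illustrative, while -q, -N, and -P are standard wget options.

```shell
# Hypothetical wrapper for a scheduled job; the function name is illustrative.
#   $1 = URL to fetch
#   $2 = directory that holds the local copy
fetch_if_newer() {
    url=$1
    dest_dir=$2
    mkdir -p "$dest_dir" || return 1
    # -N: download only when the server copy is newer than the local one
    # -P: save into dest_dir, which is also where the local copy is looked up
    # -q: stay quiet so cron does not send mail on every run
    wget -q -N -P "$dest_dir" "$url"
}

# Example crontab entry (hourly), assuming this file is saved as
# /usr/local/bin/fetch-data.sh and ends with a call to fetch_if_newer:
#   0 * * * * /usr/local/bin/fetch-data.sh
```

Keeping the same destination directory on every run matters, because wget can only compare timestamps against a copy it finds at the expected local path.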

2. Combining with Mirroring (-m)

If you want to back up an entire directory or website while respecting timestamps, use the mirror flag. The -m (or --mirror) option turns on -N (timestamping) together with recursion at infinite depth, and it keeps FTP directory listings. According to the wget manual, it is currently equivalent to -r -N -l inf --no-remove-listing.

wget -m https://example.com/downloads/
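When mirroring a single directory rather than a whole site, the mirror flag is often combined with --no-parent so that recursion does not climb above the starting path. A minimal sketch, assuming a hypothetical helper named mirror_dir and a caller-chosen destination directory:

```shell
# Hypothetical helper: mirror one directory tree into a local folder.
#   $1 = URL of the remote directory
#   $2 = local directory to mirror into
mirror_dir() {
    url=$1
    dest_dir=$2
    # -m:  mirror mode (recursion, infinite depth, time-stamping)
    # -np: --no-parent, never ascend above the starting directory
    # -P:  directory prefix for everything that gets saved
    wget -m -np -P "$dest_dir" "$url"
}
```

Running the same call again later transfers only the files whose server copy is newer, since -m already includes -N.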

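3. Backing Up Local Files (--backup-converted, --backups)

By default, a newer server file overwrites the local copy. If you want to keep the old version instead, pair the command with a backup option. The --backup-converted (-K) option is meant for use with --convert-links (-k): it saves the original download with a .orig suffix before links are rewritten, and the manual notes that it affects the behavior of -N. For general use, the manual also documents --backups=N, which renames an existing file with a numeric suffix (.1, .2, and so on) before it is overwritten. Check how that option interacts with -N on your wget version before relying on it.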
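If you would rather not depend on wget's own backup flags, the same effect can be approximated in a small wrapper. This is a sketch under stated assumptions, not a wget feature: the function name update_with_backup and the .bak suffix are hypothetical, and the file is assumed to be saved in the current directory under the last path component of the URL.

```shell
# Hypothetical helper: keep the previous copy as NAME.bak whenever
# wget -N actually replaces the local file.
update_with_backup() {
    url=$1
    file=${url##*/}          # wget saves under the URL's last path component
    saved=
    if [ -f "$file" ]; then
        saved=$(mktemp) || return 1
        cp -p "$file" "$saved" || return 1   # -p preserves the timestamp
    fi
    wget -N "$url"
    rc=$?
    if [ -n "$saved" ]; then
        if [ "$rc" -eq 0 ] && ! cmp -s "$file" "$saved"; then
            # Content changed: promote the saved copy to NAME.bak.
            mv "$saved" "$file.bak"
        else
            # Nothing changed (or wget failed): discard the temporary copy.
            rm -f "$saved"
        fi
    fi
    return "$rc"
}
```

The temporary copy is promoted to a backup only when the content changed, so runs that skip the download leave an earlier backup untouched.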

Important Limitations to Keep in Mind

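Filtering by modification date works only when the server reports one, so its success depends on the server’s configuration and on the protocol. Over HTTP, wget needs a Last-Modified header; dynamically generated pages often omit it, in which case wget cannot compare timestamps and downloads the file again. Over FTP, wget reads the timestamp from the directory listing, so accuracy depends on the listing format the server produces.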
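To see whether a given HTTP server reports a modification time at all, you can ask wget to print the response headers without saving anything. The helper below is a minimal sketch: the function name remote_last_modified is hypothetical, while --spider and --server-response are standard wget options.

```shell
# Hypothetical helper: print the Last-Modified value a server reports for a URL.
# --spider checks the resource without saving it; --server-response prints the
# response headers, which wget writes to stderr (hence 2>&1).
remote_last_modified() {
    wget --spider --server-response "$1" 2>&1 |
        tr -d '\r' |
        sed -n 's/^[[:space:]]*[Ll]ast-[Mm]odified:[[:space:]]*//p' |
        tail -n 1            # after redirects, keep the final response's value
}

# Usage:
#   remote_last_modified https://example.com/file.zip
```

If the function prints nothing, the server did not send the header, and -N will not be able to skip that file.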