How Does wget Check If a Remote File Is Newer?

When timestamping is enabled via the -N (or --timestamping) option, wget determines if a remote file is newer than a local file by comparing their modification times and sizes. Instead of downloading the entire file upfront, wget issues an initial request to gather metadata from the remote server. By evaluating specific HTTP headers or FTP file listings against the local file’s attributes, it decides whether a fresh download is necessary or if the local copy is already up to date.

The Mechanism for HTTP/HTTPS

For web-based downloads, wget uses standard HTTP headers to make its decision, so it does not have to transfer the file body just to find out whether it changed.

1. The HEAD Request

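Instead of a standard GET request, wget starts by sending a HEAD request to the server. This asks the server to return only the response headers, omitting the actual file content. Recent wget releases instead default to a conditional GET carrying an If-Modified-Since header when a local copy exists; the --no-if-modified-since option restores the preliminary HEAD request described here.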
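
As a rough illustration of a headers-only request, here is a minimal Python sketch using the standard urllib module. It is not wget’s own code (wget is written in C); the function names build_head_request and fetch_headers are hypothetical and chosen only for this example.

```python
import urllib.request


def build_head_request(url: str) -> urllib.request.Request:
    """Build a request that asks the server for headers only, no body."""
    return urllib.request.Request(url, method="HEAD")


def fetch_headers(url: str, timeout: float = 10.0) -> dict:
    """Send the HEAD request and return the response headers as a dict.

    Needs network access. Header names are kept exactly as the server sent them.
    """
    with urllib.request.urlopen(build_head_request(url), timeout=timeout) as resp:
        return dict(resp.headers.items())
```

Calling fetch_headers on a file URL would return a mapping that typically includes Last-Modified and Content-Length, assuming the server chooses to send them.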

2. Inspecting the Headers

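Once the server responds, wget looks for two critical pieces of information: the Last-Modified header, which gives the time the remote file was last changed, and the Content-Length header, which gives its size in bytes.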
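
The two values wget needs here are the remote modification time (the Last-Modified header) and the remote size (the Content-Length header). The sketch below shows one way to extract them in Python. The helper name parse_remote_metadata is an assumption made for illustration, and treating a date without a timezone as UTC is a simplification of this example rather than documented wget behavior.

```python
from email.utils import parsedate_to_datetime
from datetime import timezone
from typing import Mapping, Optional, Tuple


def parse_remote_metadata(
    headers: Mapping[str, str],
) -> Tuple[Optional[float], Optional[int]]:
    """Return (mtime in Unix seconds, size in bytes).

    Either value is None when its header is missing or malformed.
    """
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}

    mtime = None
    raw_date = lowered.get("last-modified")
    if raw_date:
        try:
            parsed = parsedate_to_datetime(raw_date)
            if parsed.tzinfo is None:
                # Assumption for this sketch: treat naive dates as UTC.
                parsed = parsed.replace(tzinfo=timezone.utc)
            mtime = parsed.timestamp()
        except (TypeError, ValueError):
            mtime = None  # unparseable date

    size = None
    raw_size = lowered.get("content-length")
    if raw_size is not None and raw_size.strip().isdigit():
        size = int(raw_size)

    return mtime, size
```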

3. The Comparison Logic

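wget compares these values against the local file’s metadata. It will proceed to download the remote file only if the local file does not exist, the remote Last-Modified time is later than the local file’s modification time, or the remote Content-Length differs from the local file’s size.

If the local file is at least as new as the remote one and the sizes match, wget skips the download, saving time and data.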
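
The decision can be sketched as a small Python function. This is an approximation of the behavior described above, not a transcription of wget’s source. The name should_download is hypothetical, and choosing to download when no remote timestamp is available is an assumption of this sketch.

```python
import os
from typing import Optional


def should_download(
    local_path: str,
    remote_mtime: Optional[float],
    remote_size: Optional[int],
) -> bool:
    """Decide whether the remote file should be fetched.

    remote_mtime is in Unix seconds, remote_size in bytes.
    """
    try:
        local = os.stat(local_path)
    except FileNotFoundError:
        return True  # no local copy yet

    if remote_mtime is None:
        # Assumption: without a remote timestamp we cannot prove the
        # local copy is current, so fetch it.
        return True

    if remote_size is not None and remote_size != local.st_size:
        return True  # sizes differ, regardless of timestamps

    # Sizes match (or the size is unknown): fetch only if remote is newer.
    return remote_mtime > local.st_mtime
```

For example, a local file stamped at time 1000 with 5 bytes is skipped for a remote copy stamped 1000 with 5 bytes, but fetched again if the remote copy reports 6 bytes, even with an older timestamp.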

The Mechanism for FTP

When downloading from an FTP server, the process relies on different commands since HTTP headers are not available.

1. Checking File Attributes

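wget needs the remote file’s modification time and size, which FTP does not expose through headers. According to wget’s documentation, it takes them from the directory listing: it issues a LIST command and parses the output as Unix ls -l style text to find the file size and date. The FTP protocol also defines the MDTM (Modification Time) command, which returns an exact timestamp on servers that support it, but the directory listing is the source that wget’s manual describes for timestamping.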
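
To make the directory-listing step concrete, here is a minimal Python sketch that pulls the name, size, and date out of a single Unix ls -l style line. It is not wget’s parser. The function name parse_listing_line, the nine-field layout, and the treatment of all times as UTC are assumptions of this example, and real servers vary in listing format.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple

MONTHS = {
    name: number
    for number, name in enumerate(
        ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
        start=1,
    )
}


def parse_listing_line(
    line: str, now: Optional[datetime] = None
) -> Tuple[str, int, datetime]:
    """Parse one `ls -l` style line into (name, size in bytes, mtime).

    Expected layout: perms links owner group size month day year-or-time name
    """
    now = now or datetime.now(timezone.utc)
    fields = line.split(None, 8)  # keep spaces inside the file name
    if len(fields) < 9:
        raise ValueError(f"unrecognised listing line: {line!r}")

    size = int(fields[4])
    month = MONTHS[fields[5]]
    day = int(fields[6])
    stamp = fields[7]
    name = fields[8]

    if ":" in stamp:
        # Recent files show HH:MM instead of a year; assume the current
        # year, stepping back one year if that would land in the future.
        hour, minute = (int(part) for part in stamp.split(":"))
        mtime = datetime(now.year, month, day, hour, minute, tzinfo=timezone.utc)
        if mtime > now:
            mtime = mtime.replace(year=now.year - 1)
    else:
        # Older files show only the year, so the time of day is unknown.
        mtime = datetime(int(stamp), month, day, tzinfo=timezone.utc)

    return name, size, mtime
```

Note that a listing carries at best minute-level precision, and only day-level precision for older files, so timestamps recovered this way are coarser than an HTTP Last-Modified value.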

2. Comparison and Local Preservation

Similar to HTTP, wget checks if the remote timestamp is newer or if the file size has changed. If a download does occur, wget updates the local file’s modification time to match the remote timestamp, ensuring future checks remain accurate.
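
Copying the remote timestamp onto the downloaded file can be illustrated with Python’s os.utime. This is only a sketch of the idea; the function name stamp_local_file is hypothetical, and keeping the existing access time unchanged is a choice made for this example.

```python
import os


def stamp_local_file(local_path: str, remote_mtime: float) -> None:
    """Set the local file's modification time to the remote timestamp.

    remote_mtime is in Unix seconds. The access time is left as it was.
    """
    current_atime = os.stat(local_path).st_atime
    os.utime(local_path, (current_atime, remote_mtime))
```

Without this step the local file would carry the time of the download, which is always later than the remote modification time, so a later timestamp comparison could never detect that the local copy matches the remote one exactly.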