What is the Default Wget Recursion Depth?
When downloading a website recursively using the wget
command-line utility, the tool automatically follows links to a specific
maximum depth to prevent infinite loops and excessive bandwidth
consumption. The default recursion depth limit for
wget is 5. This means that if you initiate a
recursive download using the -r or --recursive
flag without specifying a custom depth, wget will only
follow links up to five levels deep from the original URL. Understanding
this default behavior is crucial for ensuring you capture all necessary
web pages without unintentionally leaving out deeply nested content.
Understanding the 5-Level Depth Limit
To visualize how wget navigates a website by default,
consider the structure of a site’s directories and links:
- Depth 0: The initial URL provided in the command (e.g., https://example.com).
- Depth 1: Pages directly linked from the initial URL.
- Depth 2: Pages linked from Depth 1 pages.
- Depth 3: Pages linked from Depth 2 pages.
- Depth 4: Pages linked from Depth 3 pages.
- Depth 5: The final level of pages wget will fetch by default. Any links found on Depth 5 pages will be ignored.
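The depth levels above can be modeled as a breadth-first walk that stops following links once the limit is reached. The following is a minimal sketch in POSIX shell, not wget's actual implementation; the links_of link graph, the page names, and the crawl helper are all hypothetical and exist only for illustration.

```shell
# Toy model of depth-limited link following; this is not wget's actual code.
# links_of is a hypothetical link graph: it prints the pages a page links to.
links_of() {
  case "$1" in
    index) echo "page1" ;;
    page1) echo "page2 index" ;;   # links back to index, which is already seen
    page2) echo "page3" ;;
    page3) echo "page4" ;;
    page4) echo "page5" ;;
    page5) echo "page6" ;;
    *)     echo "" ;;
  esac
}

# crawl START MAX_DEPTH prints "depth page" for each page that would be fetched.
# The body runs in a subshell so its variables stay private.
crawl() (
  max_depth=$2
  depth=0
  frontier=$1
  seen=" $1 "
  while [ -n "$frontier" ]; do
    next=""
    for page in $frontier; do
      echo "$depth $page"
      # Links found on pages at the maximum depth are not followed.
      [ "$depth" -lt "$max_depth" ] || continue
      for link in $(links_of "$page"); do
        case "$seen" in
          *" $link "*) continue ;;   # already queued, skip it
        esac
        seen="$seen$link "
        next="$next $link"
      done
    done
    frontier=$next
    depth=$((depth + 1))
  done
)

crawl index 5
```

With a limit of 5, the sketch prints one line per fetched page, from "0 index" through "5 page5". The hypothetical page6 is linked only from a Depth 5 page, so it is never reached.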
How to Change the Recursion Depth
If the default limit of 5 is either too shallow to capture the entire
website or too deep for your specific needs, you can easily modify this
behavior using the -l or --level option.
- Increase or Decrease Depth: To specify an exact maximum depth, append
  -l followed by the desired number. For example, to set the depth to
  3 levels: wget -r -l 3 https://example.com
- Infinite Recursion: If you want to download the entire website
  regardless of how deeply nested the pages are, you can set the depth
  to infinite. This is done by using 0 or inf as the argument:
  wget -r -l 0 https://example.com
Note: Use infinite recursion with caution. On large websites, setting the level to infinite can result in massive downloads, high disk space usage, and potential IP blocking from the host server if the traffic resembles a denial-of-service attack.
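To compare the depth settings side by side without starting a download, one option is a small wrapper that only prints the command it would run. This is a sketch under stated assumptions: wget_cmd is a hypothetical helper, not part of wget, and the only wget options it emits are the -r and -l flags described above.

```shell
# Hypothetical helper: prints (does not run) a recursive wget command.
# LEVEL must be a non-negative integer or "inf"; 0 and inf both mean unlimited.
wget_cmd() {
  url=$1
  level=${2:-5}   # fall back to wget's default depth of 5
  case "$level" in
    inf) ;;                                   # accepted as-is
    ''|*[!0-9]*) echo "invalid level: $level" >&2; return 1 ;;
  esac
  echo "wget -r -l $level $url"
}

wget_cmd https://example.com        # default depth
wget_cmd https://example.com 3      # shallower crawl
wget_cmd https://example.com inf    # unlimited depth
```

Printing the command first makes it easy to confirm the depth before committing to a large download; the printed line can then be run as-is.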