--limit-size option support. Nothing special: while mirroring a website, wget just skips any HTTP/FTP file larger than the size set in --limit-size=.
Size can be set as nnnM, nnnk (megabytes, kilobytes).
RATIONALE: quickly mirror a website for grepping and/or exploring website structure, while skipping big files like .exe, .pdf, archives, media files, etc.
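A minimal sketch in C of how such a size limit could be parsed and applied. This is not the code from the patch: the function names parse_limit_size and exceeds_limit are hypothetical, and it assumes that k means 1024 bytes, M means 1024*1024 bytes, and that a file of unknown size is not skipped.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from the actual patch): parse "nnn", "nnnk"
   or "nnnM" into a byte count. Assumes k = 1024 and M = 1024*1024.
   Returns 0 on success, -1 on malformed input or overflow. */
int parse_limit_size(const char *s, int64_t *out)
{
    char *end;
    long long n;
    int64_t mult = 1;

    /* Require the string to start with a digit (no sign, no spaces). */
    if (s == NULL || !isdigit((unsigned char)*s))
        return -1;

    errno = 0;
    n = strtoll(s, &end, 10);
    if (errno != 0 || end == s || n < 0)
        return -1;

    /* Optional single-letter suffix. */
    if (*end == 'k' || *end == 'K') {
        mult = 1024;
        end++;
    } else if (*end == 'M' || *end == 'm') {
        mult = 1024 * 1024;
        end++;
    }

    /* Anything left after the suffix is an error. */
    if (*end != '\0')
        return -1;

    /* Guard against overflow before multiplying. */
    if (n > INT64_MAX / mult)
        return -1;

    *out = (int64_t)n * mult;
    return 0;
}

/* Hypothetical skip decision: returns 1 when the file should be skipped.
   A limit <= 0 means "no limit"; a negative file_size means the size is
   unknown (assumed here: such files are not skipped). */
int exceeds_limit(int64_t file_size, int64_t limit)
{
    return limit > 0 && file_size >= 0 && file_size > limit;
}
```

With these assumptions, a limit of 10k becomes 10240 bytes, and a file of exactly 10240 bytes is still downloaded, since only files bigger than the limit are skipped.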
Full patched 1.21.3 source tree
Patched and compiled for Ubuntu 20.04
Patched and compiled for Ubuntu 22.04
Patched and compiled (linux x64)
Full patched 1.13.4 source tree
Patched and compiled (cygwin, win32)
Patched and compiled (linux x86)
Make wget exit on a specific HTTP error, here 503 (patch for 1.12):
--- http.c~ 2009-09-22 06:02:18.000000000 +0300
+++ http.c 2011-08-03 14:43:00.000000000 +0300
@@ -2673,6 +2673,8 @@
               logprintf (LOG_NOTQUIET, _("%s ERROR %d: %s.\n"),
                          tms, hstat.statcode,
                          quotearg_style (escape_quoting_style, hstat.error));
+              if (hstat.statcode==503)
+                  exit(1);
             }
           logputs (LOG_VERBOSE, "\n");
           ret = WRONGCODE;
Update from 2021:
Date: Sat, 1 May 2021 23:07:11 +0800
From: YAN Hui Hang (yanhuihang(@)126.com)
Subject: Add a libnettle6 link to [wget]? (since that it was removed from the latest Ubuntu 20.04 package list)

Hello,

I find your works [https://yurichev.com/wget.html] very useful! Thanks a lot for that.

The precompiled (linux x64 patched wget 1.18) requires libnettle6, which is removed from the latest Ubuntu 20.04 packages. I found that this version [https://packages.debian.org/stretch/libnettle6] works great, so maybe it is good add this link.

Regards
The --limit-size option came in handy when I was searching for small demos/intros and wanted to mirror the files.scene.org file archive, but fetch only small files, under 10k.