[NTLUG:Discuss] Saving a web site

Eric Waguespack ewaguespack at gmail.com
Tue Apr 22 22:14:02 CDT 2008


On Tue, Apr 22, 2008 at 8:00 PM, Steve Baker <steve at sjbaker.org> wrote:
> Kipton Moravec wrote:
>  > It is a lot of pages (over 100) and I could go page by page and save,
>  > but there has to be a better way.
>  > Kip

Here is a command I had saved. It disregards robots.txt and masquerades
as an actual browser (but you will feel terribly guilty if you use it):

wget \
   --tries=10 \
   --wait=1 \
   --random-wait \
   --waitretry=2 \
   --no-verbose \
   --user-agent='Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.1) Gecko/20060111 Firefox/1.5.0.1' \
   --mirror \
   --convert-links \
   --force-directories \
   --protocol-directories \
   --execute robots=off \
   --no-parent \
   "$1"
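
The "$1" at the end means the command expects to live in a script or
shell function that takes the URL as its first argument. A rough sketch
of one way to wrap it, assuming a function name of mirror_site (my own
invention, not something wget provides), with a guard for a missing URL
and notes on what each option does:

```shell
# mirror_site is a hypothetical wrapper name; the wget options are the
# same ones as in the command above.
mirror_site() {
    # Refuse to run without a URL; otherwise wget is handed an empty
    # string as its target.
    if [ -z "$1" ]; then
        echo "usage: mirror_site URL" >&2
        return 1
    fi

    # --tries=10              retry each file up to 10 times
    # --wait=1                pause 1 second between retrievals
    # --random-wait           vary that pause between 0.5x and 1.5x
    # --waitretry=2           back off up to 2 seconds between retries
    # --no-verbose            print only errors and basic progress
    # --user-agent=...        claim to be Firefox 1.5 on Windows XP
    # --mirror                recursion, unlimited depth, time-stamping
    # --convert-links         rewrite links so the copy browses locally
    # --force-directories     always build a directory tree
    # --protocol-directories  put the protocol name (http/) in the path
    # --execute robots=off    ignore robots.txt
    # --no-parent             never climb above the starting directory
    wget \
        --tries=10 \
        --wait=1 \
        --random-wait \
        --waitretry=2 \
        --no-verbose \
        --user-agent='Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.1) Gecko/20060111 Firefox/1.5.0.1' \
        --mirror \
        --convert-links \
        --force-directories \
        --protocol-directories \
        --execute robots=off \
        --no-parent \
        "$1"
}
```

After sourcing that, something like
mirror_site http://www.example.com/docs/
(a placeholder URL) should leave the pages under
./http/www.example.com/docs/ in the current directory.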


