[NTLUG:Discuss] Offline browsing/mirroring utility
Neil Aggarwal
neil at JAMMConsulting.com
Thu May 25 16:58:16 CDT 2006
RW:
That is the example I tried to follow.
According to that page:
Limit spanning to certain domains---`-D'
The `-D' option allows you to specify the domains that will be followed,
thus limiting the recursion only to the hosts that belong to these domains.
Obviously, this makes sense only in conjunction with `-H'.
A typical example would be downloading the contents of `www.server.com',
but allowing downloads from `images.server.com', etc.:
wget -rH -Dserver.com http://www.server.com/
Following their example, I tried this:
wget -rH -Dstartrek.com http://www.startrek.com
and it is still pulling pages from other domains, including amazon.com.
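To pin down exactly which hosts are being pulled, here is a rough sketch that lists the host directories wget has created. It assumes wget's default host-prefixed layout (no -nH), where each host gets its own top-level directory. The helper names list_hosts and list_offsite_hosts are made up for illustration. The rule "host equals the domain or ends in .domain" is my own choice for this check and is not necessarily how wget matches -D.

```shell
# Sketch only: list_hosts and list_offsite_hosts are hypothetical helpers.
# Assumes wget's default host-prefixed directories (i.e. -nH was not used).

# Print the name of every top-level directory under the download directory.
list_hosts() {
    dir="${1:-.}"
    for entry in "$dir"/*/; do
        # Skip the unexpanded glob when there are no subdirectories.
        [ -d "$entry" ] || continue
        basename "$entry"
    done
}

# Print only the hosts that fall outside the given domain.
list_offsite_hosts() {
    domain="$1"
    dir="${2:-.}"
    list_hosts "$dir" | while read -r host; do
        case "$host" in
            "$domain"|*".$domain") ;;   # inside the allowed domain
            *) echo "$host" ;;          # slipped past the domain limit
        esac
    done
}
```

Run from the directory where wget was started, `list_offsite_hosts startrek.com .` should print one line per off-site host that was saved.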
Any ideas?
Neil
--
Neil Aggarwal, JAMM Consulting, (214)986-3533, www.JAMMConsulting.com
FREE! Valuable info on how your business can reduce operating costs by
17% or more in 6 months or less! http://newsletter.JAMMConsulting.com
-----Original Message-----
From: Discuss-bounces at ntlug.org [mailto:Discuss-bounces at ntlug.org] On Behalf Of Rev. wRy
Sent: Thursday, May 25, 2006 1:02 PM
To: NTLUG Discussion List
Subject: Re: [NTLUG:Discuss] Offline browsing/mirroring utility
On Thu, 2006-05-25 at 12:42, Neil Aggarwal wrote:
> RW:
>
> This looked interesting to me, so I tried doing this for grins:
>
> wget --convert-links --domains=startrek.com --exclude-domains amazon.com -H -r http://www.startrek.com
>
> It keeps going off and pulling down pages from other sites, including
> Amazon.com.
>
> Any ideas why this is happening?
I'm thinking it has to do with -H. See
http://www.delorie.com/gnu/docs/wget/wget_15.html for the details about
host spanning with wget.
RW