[NTLUG:Discuss] copying web documents

Fred fredstevens at yahoo.com
Thu May 18 17:20:31 CDT 2006


Thanks to all who helped. wget is a very versatile tool that gives far more
info than what is wanted, kinda like asking for the time and getting the
instructions for building the clock.

I liked the idea of searching the web for the pdf file as it had the advantage
of being quick and it produced the finished product. wget gave me a giant pile
of trash to sift through: unwanted directories, etc., that had no connection to
the desired document. As for the robots.txt file including wget, who knows
why... the document is in the public domain.

Getting this exercise to work properly requires a THOROUGH understanding of
wget and the desire to spend way too much time jacking with it. Maybe someone
can write a tool (bash, C, whatever) that will do the job.
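
For what it's worth, such a tool need not be big. Here is a rough Python
sketch, not anything posted in this thread, that assumes the document is
linked as a PDF from a single known page. The names pdf_links and fetch_pdfs
are made up for illustration; it reads one page, keeps only the links whose
path ends in .pdf, and ignores everything else.

```python
import posixpath
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of <a href=...> links whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                url = urljoin(self.base_url, value)
                if urlparse(url).path.lower().endswith(".pdf"):
                    self.links.append(url)


def pdf_links(html, base_url):
    """Return the PDF links found in one page of HTML, in document order."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


def fetch_pdfs(page_url):
    """Download every PDF linked from page_url into the current directory."""
    with urllib.request.urlopen(page_url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    for url in pdf_links(html, page_url):
        # Save under the last path component of the link.
        name = posixpath.basename(urlparse(url).path)
        print("fetching", url)
        urllib.request.urlretrieve(url, name)
```

To try it, call fetch_pdfs() with the address of the page that links to the
document. It looks at that one page only and does not recurse, so there are no
stray directories to clean up afterward.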

Thanks again,
Fred
