[NTLUG:Discuss] copying web documents
Fred
fredstevens at yahoo.com
Thu May 18 17:20:31 CDT 2006
Thanks to all who helped. wget is a very versatile tool, but it gives far more
info than is wanted, kinda like asking for the time and getting the
instructions for building the clock.
I liked the idea of searching the web for the PDF file, as it had the advantage
of being quick and it produced the finished product. wget gave me a giant pile
of trash to sift through: unwanted directories, etc., that had no connection to
the desired document. As for the robots.txt file including wget, who knows
why... the document is in the public domain.
To get this exercise to work properly requires a THOROUGH understanding of
wget and the desire to spend way too much time jacking with it. Maybe someone
can write a tool (bash, C, whatever) that will do the job.
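A rough idea of what such a tool might look like, sketched in Python rather
than bash or C: parse one page and keep only the links that end in .pdf, so
nothing else gets pulled down. The names PdfLinkParser and find_pdf_links are
made up for illustration, and the sketch assumes the document is linked
directly from a single HTML page. It is not tested against the actual site.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkParser(HTMLParser):
    """Collect links from <a href="..."> tags whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # used to resolve relative links
        self.pdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                absolute = urljoin(self.base_url, value)
                # Check only the path, so a query string does not hide the extension.
                if urlparse(absolute).path.lower().endswith(".pdf"):
                    self.pdf_links.append(absolute)


def find_pdf_links(html, base_url):
    """Return absolute URLs of every PDF linked from the given HTML text."""
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.pdf_links
```

Each URL it returns could then be fetched on its own, for example with
urllib.request.urlretrieve or a plain wget call. For what it's worth, wget's
own -r -l1 -nd -A pdf options are meant to narrow a recursive fetch in a
similar way.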
Thanks again,
Fred