[NTLUG:Discuss] pulling tables out of web pages.
Bobby Wrenn
bjwrenn at augustmail.com
Thu Apr 8 16:09:08 CDT 2004
Greg Edwards wrote:
> Bobby Wrenn wrote:
>
>> I have tried some html2txt tools and have had no success.
>>
>> I need to convert a web page into a tab delimited file (preferably
>> keeping only the data table). My goal is to do several of these pages
>> and cat them into a big table and delete duplicates.
>>
>> I think I can handle most of the problem if I can just convert the
>> html to a tab delimited text file.
>>
>> Anyone know of a reliable tool?
>>
>> Here is a sample of the web pages I am working on:
>> http://partsurfer.hp.com/cgi-bin/spi/main?sel_flg=partlist&model=KAYAK+XU+6%2F266MT&HP_model=&modname=Kayak+XU+6%2F266MT&template=secondary&plist_sval=ALL&plist_styp=flag&dealer_id=&callingsite=&keysel=X&catsel=X&ptypsel=X&strsrch=&pictype=I&picture=X&uniqpic=
>>
>>
>> TIA
>> Bobby
>
>
> If this is a one-time deal, read the file in with StarOffice Calc, then
> save it as a comma-delimited file (text CSV). Some of the other
> spreadsheet programs can do this as well.
>
> HTH
I have been using OOo with good results for the one-offs. But now I have
24 files to process and just wanted a way to do them all at once.
Ultimately, I would like to build a tool that goes to the website, pulls
down the page, converts it, and saves the result to a file. This is (as
usual) going to be a learning process for me.
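
One way such a tool could look, as a rough sketch in Python using only the
standard library. It is not tested against the HP page itself. The names
TableExtractor, html_to_rows, rows_to_tsv and load are made up for
illustration, and the sketch assumes every <tr> on the page is wanted
(layout tables are not filtered out, nested tables are not handled) and
that the pages decode as Latin-1.

```python
import sys
import urllib.request
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect each <tr> as a list of cell strings (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.rows = []      # finished rows
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text pieces of the cell being read, or None

    def _close_cell(self):
        if self._cell is not None:
            # Collapse whitespace so a cell never contains a tab or newline.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None

    def _close_row(self):
        self._close_cell()
        if self._row:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        # Old pages often omit </td> and </tr>, so a new tag closes the old one.
        if tag == "tr":
            self._close_row()
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()


def html_to_rows(html):
    """Return all table rows found in an HTML string."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


def rows_to_tsv(rows):
    """Drop duplicate rows (first occurrence wins) and return tab-delimited text."""
    seen = set()
    lines = []
    for row in rows:
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            lines.append("\t".join(row))
    return "".join(line + "\n" for line in lines)


def load(source):
    """Read a saved file, or fetch the page if source looks like a URL."""
    if source.startswith(("http://", "https://")):
        with urllib.request.urlopen(source) as resp:
            data = resp.read()
    else:
        with open(source, "rb") as fh:
            data = fh.read()
    # Assumption: Latin-1; adjust if the pages use another encoding.
    return data.decode("latin-1", errors="replace")


if __name__ == "__main__":
    all_rows = []
    for src in sys.argv[1:]:
        all_rows.extend(html_to_rows(load(src)))
    sys.stdout.write(rows_to_tsv(all_rows))
```

Saved as, say, tables2tsv.py (a hypothetical name), it would be run as
"python3 tables2tsv.py page1.html page2.html > parts.txt". Because
duplicates are dropped across all inputs, the repeated header row from
each page appears only once in the combined output.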
Thanks,
Bobby