[NTLUG:Discuss] bash question
    Greg Edwards 
    greg at nas-inet.com
       
    Wed Sep 12 17:01:01 CDT 2001
    
    
  
"Wrenn, Bobby J." wrote:
> 
> I found a perl script on the Web that did the trick on the file name part.
> Then I pumped the files through:
> 
> for i in *.pdf ; do
>   DONAME=`basename "$i" .pdf`
>   pdftotext "$i" "$DONAME.txt"
> done
> 
> Success!
> 
> Now all I have to do is get the data out of the text files.
> 
> First major hurdle passed.
> 
> Thanks again,
> Bobby
> 
If the data in the text files has a label on the same line as the
data, you can use grep to extract just those lines.
e.g., if "Address:" was a label
grep Address *.txt
would get something like
file1.txt: Address:  123 Somestreet
file2.txt: Address:  456 Anotherstreet
Redirect this to a file and use your DB load process to concatenate and
save the fields after the second field, using space as your delimiter.
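
If you would rather trim the label off before the load step, something
along these lines might do it. This is only a rough sketch: extract_label
is a made-up helper name, the "|" separator is an arbitrary choice, and it
assumes the label has no characters special to sed and the file names have
no colons in them. Note that grep itself prints "file:line" with no space
after the colon, so the sketch strips the prefix with sed instead of
counting space-separated fields.

```shell
#!/bin/sh
# Sketch: print "filename|value" for every line in DIR/*.txt that carries LABEL.
# extract_label is a hypothetical helper, not part of any existing tool.
extract_label() {
    label=$1
    dir=$2
    # -H (GNU and BSD grep) forces the "file:" prefix even for a single file.
    # sed keeps the file name, then drops everything up to and including
    # the label and any blanks that follow it.
    grep -H -- "$label" "$dir"/*.txt |
        sed "s/^\([^:]*\):.*$label[[:space:]]*/\1|/"
}
```

Running  extract_label 'Address:' . > addresses.load  in the directory
holding the .txt files would leave one "file|value" line per match for the
DB load to pick up.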
-- 
Greg Edwards
New Age Software, Inc.
http://www.nas-inet.com
    
    