[NTLUG:Discuss] bash question

Wrenn, Bobby J. Bobby.Wrenn at banctec.com
Wed Sep 12 16:02:19 CDT 2001


I found a Perl script on the Web that took care of the file name part
(getting rid of the spaces). Then I pumped the files through:

for i in *.pdf ; do
  # Strip the .pdf suffix; the quotes keep odd file names in one piece
  DONAME=$(basename "$i" .pdf)
  pdftotext "$i" "$DONAME.txt"
done

Success!
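
For anyone who wants to skip the Perl step, here is a rough sketch of how
both parts might be done in the shell alone. The function names
rename_spaces and convert_all are made up for this example, and replacing
spaces with underscores is my assumption about what the rename should do.
I have not run it against the real 209 files, so treat it as a starting
point only.

```shell
#!/bin/sh
# Sketch only: rename_spaces and convert_all are hypothetical helper names.

# Rename every .pdf in a directory so spaces become underscores.
rename_spaces() {
  dir=${1:-.}
  for f in "$dir"/*.pdf; do
    [ -e "$f" ] || continue            # the glob matched nothing
    new="$dir/$(basename "$f" | tr ' ' '_')"
    [ "$f" = "$new" ] && continue      # no spaces, nothing to rename
    if [ -e "$new" ]; then             # never overwrite an existing file
      echo "skipping $f: $new already exists" >&2
      continue
    fi
    mv -- "$f" "$new"
  done
}

# Convert every .pdf in a directory to a .txt file with the same base name.
# Assumes pdftotext is installed and on the PATH.
convert_all() {
  dir=${1:-.}
  for f in "$dir"/*.pdf; do
    [ -e "$f" ] || continue
    pdftotext "$f" "${f%.pdf}.txt"     # foo.pdf -> foo.txt
  done
}
```

Called as "rename_spaces . && convert_all ." from the directory that holds
the PDFs. Because every variable is quoted, convert_all should also cope
with file names that still contain spaces, so the rename is optional.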

Now all I have to do is get the data out of the text files.

First major hurdle passed.

Thanks again,
Bobby

-----Original Message-----
From: Wrenn, Bobby J. [mailto:Bobby.Wrenn at banctec.com]
Sent: Wednesday, September 12, 2001 11:41 AM
To: 'discuss at ntlug.org'
Subject: [NTLUG:Discuss] bash question


If I can get an answer to this I will finally be able to use Linux at work.

I need to take 209 PDF files with spaces in the file names and convert them
into text. I am very new to scripting and know nothing about regular
expressions. Is there an easy way to remove the spaces from the file names?
Then how do I recursively submit the files to pdftotext with the same name
except for the .pdf changed to .txt?

Just getting that much done will be a big help. The next step may be
trickier. I need to extract a name, address, and equipment list from each of
the files and get it into some kind of database where I can query for total
by item or item by location.

I'm a database beginner but a quick learner. TIA for any help. Please
contact me directly if this is not appropriate for the list.

TIA
Bobby Wrenn
Sr. Service Planner
BancTec, Inc.
972.450.7832 

_______________________________________________
http://www.ntlug.org/mailman/listinfo/discuss


