[NTLUG:Discuss] scripting foo help

Leroy Tennison leroy_tennison at prodigy.net
Fri Jul 3 23:50:38 CDT 2009


Fred James wrote:
> Dan Wright - SPCA of Texas wrote:
>>>> I have a CSV file that needs " added to the ID fields like below;
>>>> right now they're missing the " (ex: 1263831 instead of: "1263831"
>>>> and ID instead of: "ID")
>>>
>>> Here is a Perl solution.  It will read from data.csv and 
>>> print the result to stdout.  You can add '-i' if you want it 
>>> to modify the file in place.
>>>
>>> perl -n -a -F"," -e '$F[0]=qq{"$F[0]"};print join(q{,}, @F);' data.csv
>>>
>>> If you have any newlines within your data fields, this will break. 
>>> You'd need a full csv parser to handle that.
>>>     
>> your foo is strong :) 
>>
>> but yes, I would need to modify the file in place, so
>> where would I put the -i?
>>
>> thanks for your help.
>>
>> Dan
>>   
> Dan
> Caution is advised on area lakes -- any wind that blows away the 
> original file before the results can be verified will blow away any 
> chance of recovery - your mileage may vary
> Regards
> Fred James
> 
> PS:  If it were me I think I might opt for AWK or SED, or at least EX - 
> sorry, personal opinion there - the thing is (from personal experience) 
> these types of files always seem to have some sort of gotcha that begs 
> for some sort of decision tree - best of luck

Well, assuming that there are 17 comma-separated columns (if I counted 
correctly) and none of them already contains double quotes, here's an 
awk script that wraps every field in double quotes:

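# Quote every field: the "," output separator supplies the inner quotes,
# and the print statement adds the opening and closing ones.
BEGIN {FS=","; OFS="\",\""}
{print "\"" $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, 
$15, $16, $17 "\"";}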

Let's call it pet.awk for the sake of discussion and assume the input 
file is named pets.csv and the output file pets.out.  The command line is:

gawk -f pet.awk pets.csv > pets.out

It's not a particularly elegant script but it gets the job done.
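
For what it's worth, the field list does not have to be spelled out. The 
sketch below is one way to get the same effect for any number of columns: 
assigning $1 to itself makes awk rebuild the record using the new OFS. The 
function name quote_all_fields is made up for illustration, and it carries 
the same assumptions as above: plain comma-separated data with no embedded 
commas, double quotes, or newlines inside a field.

```shell
# Sketch only: quote_all_fields is a hypothetical helper name.
# Assumes simple CSV: no commas, double quotes, or newlines inside fields.
quote_all_fields() {
    awk '
        BEGIN { FS = ","; OFS = "\",\"" }
        # Pass blank lines through unchanged.
        NF == 0 { print; next }
        # Touching $1 forces awk to rejoin the fields with OFS,
        # then the print adds the opening and closing quotes.
        { $1 = $1; print "\"" $0 "\"" }
    ' "$@"
}
```

Something like "quote_all_fields pets.csv > pets.out" would then stand in 
for the gawk command above. Checking pets.out before renaming it over 
pets.csv keeps the original safe until the result has been verified.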


