[NTLUG:Discuss] Combining data from multiple files

Greg Edwards greg at nas-inet.com
Mon May 26 10:59:18 CDT 2003


Kenneth Loafman wrote:
> Michael P wrote:
> 
>>>
>>> (They each just have a series of 3 or 4 line entries but neither of them
>>> are all inclusive.  I want one file to have all entries but no
>>> duplicates.)
>>>
>>
>>
>> I'm a little too sleepy to work up the full command but what you are
>> looking for is the command "sort" or maybe a combination of cat, sort, 
>> and
>> uniq.
> 
> 
> 
> Let's take two lists, A & B, same format:
> 
> To eliminate duplicates:
>     cat A B | sort | uniq -u
> 
> To eliminate singletons:
>     cat A B | sort | uniq -d
> 
> To find singletons in A only:
>     cat A B B | sort | uniq -u
> 
> Similarly for any other set operation...
> 
> ...Ken
> 

The problem stated "3 or 4 line entries".

I'm not sure you'll be able to do this with just simple shell commands 
unless the line counts for each entry are the same.
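If the line count per entry really is fixed (say three lines), one 
line-based sketch is to glue each record onto a single line first, dedupe, 
then split back.  This assumes the files are named A and B and that the 
'|' character never appears in the data -- pick any delimiter you know is 
absent:

```shell
# Join every 3 lines into one '|'-delimited record, keep one copy of
# each distinct record, then restore the original line breaks.
cat A B | paste -d'|' - - - | sort -u | tr '|' '\n'
```

For 4-line entries, use `paste -d'|' - - - -` instead.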

You may have to write a Perl, Python, Tcl, etc., script to read each file 
into an array of entries, sort, drop duplicates, and write the results.
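As a rough sketch of that script in Python -- assuming the entries are 
separated by blank lines, which may not match the actual format:

```python
import sys

def read_entries(path):
    """Return each blank-line-separated entry in the file as one string."""
    with open(path) as f:
        return [e.strip() for e in f.read().split("\n\n") if e.strip()]

def merge(paths):
    """Combine entries from all files, dropping exact duplicates."""
    seen = set()
    merged = []
    for path in paths:
        for entry in read_entries(path):
            if entry not in seen:
                seen.add(entry)
                merged.append(entry)
    return sorted(merged)

if __name__ == "__main__":
    # Usage: merge_entries.py fileA fileB > combined
    print("\n\n".join(merge(sys.argv[1:])))
```

Entries that differ in whitespace or field order would still be kept as 
separate records; normalizing those is where the real scripting work is.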

-- 
Greg Edwards
New Age Software, Inc. - http://www.nas-inet.com
======================================================
Galactic Outlaw        - http://goutlaw.nas-inet.com
   The ultimate cyberspace adventure!
