[NTLUG:Discuss] Combining data from multiple files
Greg Edwards
greg at nas-inet.com
Mon May 26 10:59:18 CDT 2003
Kenneth Loafman wrote:
> Michael P wrote:
>
>>>
>>> (They each just have a series of 3- or 4-line entries, but neither of
>>> them is all-inclusive. I want one file to have all entries but no
>>> duplicates.)
>>
>> I'm a little too sleepy to work up the full command, but what you are
>> looking for is the "sort" command, or maybe a combination of cat, sort,
>> and uniq.
>
> Let's take two lists, A & B, same format:
>
> To eliminate duplicates:
> cat A B | sort | uniq -u
>
> To eliminate singletons:
> cat A B | sort | uniq -d
>
> To find singletons in A only:
> cat A B B | sort | uniq -u
>
> Similarly for any other set operation...
>
> ...Ken
>
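A quick way to sanity-check Ken's recipes (the file names A and B and the
sample entries below are made up for illustration, and this assumes one
entry per line):

```shell
# Throwaway one-line-per-entry sample lists (names and contents invented).
printf 'apple\nbanana\ncherry\n' > A
printf 'banana\ncherry\ndate\n'  > B

cat A B | sort | uniq -u    # entries in exactly one file: apple, date
cat A B | sort | uniq -d    # entries in both files: banana, cherry
cat A B B | sort | uniq -u  # entries only in A: apple
sort -u A B                 # duplicates collapsed: all four entries, once each
```

Note that for the original goal ("all entries but no duplicates"),
`sort -u A B` (or `cat A B | sort | uniq` with no flag) keeps one copy of
each entry, whereas `uniq -u` drops duplicated entries entirely.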
The problem stated "3 or 4 line entries".
Since sort and uniq compare single lines, they'll tear multi-line entries
apart; I'm not sure you'll be able to do this with just simple shell
commands unless the line counts for each entry are the same.
You may have to write a Perl, Python, Tcl, etc. script to read each file
into an array of entries, sort, drop dups, and write the results.
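Assuming the entries are separated by blank lines (the actual file layout
isn't specified in the thread, so that's a guess), a minimal Python sketch
of such a script might look like this:

```python
import sys

def merge_entries(*paths):
    """Read blank-line-separated multi-line entries from each file,
    drop exact-duplicate entries, and return the rest sorted."""
    seen = set()
    for path in paths:
        with open(path) as f:
            # Paragraph-style split: each chunk between blank lines
            # is treated as one whole entry.
            for entry in f.read().split("\n\n"):
                entry = entry.strip()
                if entry:
                    seen.add(entry)
    return sorted(seen)

if __name__ == "__main__":
    # Usage: merge.py fileA fileB > merged
    for entry in merge_entries(*sys.argv[1:]):
        print(entry, end="\n\n")
```

If the two files format the same entry differently (extra whitespace,
reordered lines), exact-match dedup like this won't catch it; you'd need
to normalize entries before comparing.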
--
Greg Edwards
New Age Software, Inc. - http://www.nas-inet.com
======================================================
Galactic Outlaw - http://goutlaw.nas-inet.com
The ultimate cyberspace adventure!