[NTLUG:Discuss] xargs guide
Christopher Cox
cjcox at acm.org
Sat Aug 1 10:42:08 CDT 2015
On 07/31/2015 07:33 PM, Steve Litt wrote:
> On Thu, 30 Jul 2015 07:24:26 -0500
> Pesto <dawjer at gmail.com> wrote:
>
>> As someone who has used xargs since the mid '90s I gotta say this is
>> well done. I learned stuff. Thanks.
>>
>>
>> pesto
>
> Thanks pesto!
>
> I just got finished incorporating about 20 improvements suggested by
> various people, including typos, omissions, and failure to properly
> identify the document's scope.
>
> Anyway, it's still at http://www.troubleshooters.com/linux/xargs.htm .
>
> Thanks,
>
> SteveT
>
Ok... some history from the "old guy" (and former sith lord)...
If you need to supply a ton of arguments to a command for processing, your
command line "blows up". Let's say you have a directory containing 1 million
files, most of them ending in .txt (extreme, just to illustrate).
The command:
$ ls *.txt
..is going to blow up. Why? Because the shell globbing pattern, *.txt, likely
expands to an argument list bigger than the kernel will pass to a single command
(the dreaded "Argument list too long"). In the older days, this problem happened
pretty early on. Today with Linux it's not nearly the problem it once was
(getconf ARG_MAX, it's pretty big now). One way to tell just how much your
system can handle is:
$ xargs --show-limits </dev/null
Your environment variables take up 3446 bytes
POSIX upper limit on argument length (this system): 2091658
POSIX smallest allowable upper limit on argument length (all systems): 4096
Maximum length of command we could actually use: 2088212
Size of command buffer we are actually using: 131072
(other things start breaking with a million arguments btw... not just the
command line length, but also the number of arguments... that part is OK to
ignore for this discussion).
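If you want to see the blow-up for yourself, something like the following works
(a throwaway directory name; creating a million files is slow and eats a
million inodes, and the exact error wording varies by shell... and yes, I'm
using xargs to set up the demo, more on that below):

$ mkdir /tmp/blowup && cd /tmp/blowup
$ seq 1 1000000 | sed 's/$/.txt/' | xargs touch
$ ls *.txt
bash: /bin/ls: Argument list too long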
So... what if you need to run a command with a huge number of arguments, or a
huge command line?
xargs is BORN!
The idea is that xargs will execute a command the number of times required,
each time using the maximum number of arguments allowed by your system.
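You can actually watch the batching with a harmless command like echo (each
echo invocation prints exactly one line, so counting output lines counts
invocations):

$ seq 1 1000000 | xargs echo | wc -l

With the 131072-byte command buffer shown above, that comes out to a few dozen
lines, i.e. a few dozen echo runs, each packing thousands of arguments.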
The case Steve mentioned of using -n 1 (or, GNUified, --max-args=1) is usually
for specific cases, for example a command that can only take one argument.
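A trivial illustration (drop the -n 1 and you'd get all three words on a single
line):

$ printf '%s\n' one two three | xargs -n 1 echo
one
two
three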
But let's say the command I want to use is "grep". If my input (the list of
files to grep) is really, really large, I would need to break things into
multiple greps.
$ find . -type f -name '*.txt' -print
The above finds all the .txt files under the current directory.
If you start that from the root it might fail if you try to pass the output as
arguments to grep:
$ grep 'my string' -- $(find . -type f -print)
(given the crazy filenames we have now, this will likely fail for other reasons)
So, xargs to the rescue! (I'm avoiding find's -exec here to make a point)
$ find . -type f -print0 | xargs -0 grep 'my string' --
That command will run grep with the maximum allowed number of arguments
possible, as many times as needed, until the list of files is exhausted.
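In case the -print0/-0 pairing looks like noise: it makes find emit
NUL-terminated names and makes xargs split on NULs, so whitespace (even
newlines) in filenames can't wreck the argument splitting. A quick
demonstration with a made-up file:

$ touch 'my file.txt'
$ find . -type f -print | xargs grep 'my string' --     # breaks: "./my file.txt" splits in two
$ find . -type f -print0 | xargs -0 grep 'my string' -- # safe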
Many of you have seen my famous pipelining one-liner for finding all text files
and searching them:
$ find . -type f -print0 | xargs -0 file | grep -i text | \
    cut -f1 -d: | tr '\012' '\000' | xargs -0 grep -n 'mystring' --
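Reading it stage by stage (the same pipeline, just split up and annotated):

find . -type f -print0 |            # every file, NUL-terminated
    xargs -0 file |                 # classify each file with file(1), in batches
    grep -i text |                  # keep files whose type mentions "text"
    cut -f1 -d: |                   # strip the ": type" part, leaving the name
    tr '\012' '\000' |              # turn the newlines back into NULs
    xargs -0 grep -n 'mystring' --  # grep the text files, again in batches

One known soft spot: a filename with a colon in it will confuse the cut stage,
which is part of the "crazy filenames" caveat above.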
Now, when using Linux, all of what was done in that command might not be
necessary, but the above will likely work on really, really old Linux as well
as on any system whose "find" and "xargs" handle the "-0" argument.
(on typical old Unix, the one-liner gets really complex to handle the lack of
"-0", since you have to handle special file names via quoting and escaping)
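For what it's worth, the portable escape hatch on such systems is the -exec I
was avoiding earlier; with the POSIX "+" terminator, find itself batches
arguments much like xargs does (truly ancient finds may only have ";", which
is one exec per file):

$ find . -type f -exec grep -n 'mystring' -- {} +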
Anyway, obviously you can use "xargs" for many, many things... I just wanted
folks to know one of the major historical reasons it exists.