Redirect output of xargs to file

Tags: find, bash, xargs

I want to delete the first line of every file in a directory and save the result under the same name with '.tmp' appended. For example, if there is a file named input.txt with the following content:

line 1
line 2

I want to create a file in the same directory named input.txt.tmp with the following content:

line 2

I'm trying this command:

find . -type f | xargs -I '{}' tail -n +2 '{}' > '{}'.tmp

The problem is that, instead of writing the output to separate files with a .tmp suffix, it creates a single file named {}.tmp. I understand that this happens because the shell sets up the output redirection itself, before xargs even starts, so xargs never sees it. Is there any way to tell xargs that the output redirection is part of its argument?

Rafi Kamal asked Sep 28 '14 10:09

2 Answers

Note that you can use find with -exec directly, with no need to pipe to xargs:

find . -type f -exec sh -c 'tail -n +2 "$1" > "$1.tmp"' sh {} \;
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^  ^^^^^
                            |                           file name, passed to the script as $1
                            perform the tail and redirection
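
The same idea can be kept on the xargs side, which is what the question literally asks for: let xargs start a small shell per file and put the redirection inside that shell's script. The following is a minimal sketch, not a definitive recipe. It assumes an xargs that supports -0 (GNU and BSD do; it is not POSIX). The scratch directory and the two sample file names are made up for the demo.

```shell
#!/bin/sh
# Work in a throwaway directory so no real files are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'line 1\nline 2\n' > input.txt
printf 'a\nb\nc\n' > 'with space.txt'

# -print0 / -0 keep file names with spaces or newlines intact.
# ! -name '*.tmp' stops find from picking up the files being created.
# The redirection lives inside the sh -c script, so it runs once per file;
# the file name arrives as $1 ("sh" fills $0).
find . -type f ! -name '*.tmp' -print0 |
  xargs -0 -I '{}' sh -c 'tail -n +2 "$1" > "$1.tmp"' sh '{}'

cat input.txt.tmp
cat 'with space.txt.tmp'
```

Passing the name as a positional argument, rather than pasting {} into the script text, keeps names containing spaces or shell metacharacters from being reinterpreted as code.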
fedorqui 'SO stop harming' answered Oct 10 '22 00:10


If you have GNU Parallel you can run:

find . -type f | parallel tail -n +2 {} '>' {}.tmp

Most modern computers have multiple cores, but most programs are serial in nature and will therefore not use them. However, many tasks are extremely parallelizable:

  • Run the same program on many files
  • Run the same program for every line in a file
  • Run the same program for every block in a file

GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to.

If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU:

[Figure: simple scheduling]

GNU Parallel instead spawns a new process when one finishes - keeping the CPUs active and thus saving time:

[Figure: GNU Parallel scheduling]
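
The difference between the two scheduling styles can be seen without installing anything. This sketch uses xargs -P, which is not GNU Parallel but follows the same "start the next job as soon as a slot is free" pattern. -P is a GNU/BSD extension rather than POSIX, and the job durations are invented for illustration.

```shell
#!/bin/sh
# Four jobs (sleep times in seconds) share two slots. xargs -P 2 starts the
# next job as soon as either slot is free, instead of pre-assigning two jobs
# to each slot.
start=$(date +%s)
out=$(printf '%s\n' 3 1 1 1 |
  xargs -P 2 -I '{}' sh -c 'sleep "$1"; echo "job of $1s finished"' sh '{}')
end=$(date +%s)

printf '%s\n' "$out"
echo "elapsed: about $((end - start)) seconds"
```

With a fixed split of two jobs per slot, the slot that receives the 3-second job plus a 1-second job would need 4 seconds. With dynamic scheduling, the three short jobs run back to back on the other slot and the whole batch should finish in roughly 3 seconds.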

Installation

A personal installation does not require root access. It can be done in 10 seconds by doing this:

$ (wget -O - pi.dk/3 || lynx -source pi.dk/3 || curl pi.dk/3/ || \
   fetch -o - http://pi.dk/3 ) > install.sh
$ sha1sum install.sh | grep 883c667e01eed62f975ad28b6d50e22a
12345678 883c667e 01eed62f 975ad28b 6d50e22a
$ md5sum install.sh | grep cc21b4c943fd03e93ae1ae49e28573c0
cc21b4c9 43fd03e9 3ae1ae49 e28573c0
$ sha512sum install.sh | grep da012ec113b49a54e705f86d51e784ebced224fdf
79945d9d 250b42a4 2067bb00 99da012e c113b49a 54e705f8 6d51e784 ebced224
fdff3f52 ca588d64 e75f6033 61bd543f d631f592 2f87ceb2 ab034149 6df84a35
$ bash install.sh

For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README

Learn more

See more examples: http://www.gnu.org/software/parallel/man.html

Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1

Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html

Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel

Ole Tange answered Oct 10 '22 00:10