
How does grep know it is writing to the input file?

If I try to redirect the output of grep to the same file that it is reading from, like so:

$ grep stuff file.txt > file.txt

I get the error message grep: input file 'file.txt' is also the output. How does grep determine this?

asked Feb 08 '15 04:02 by iobender


2 Answers

According to the GNU grep source code, grep compares the i-node of each input file with the i-node of its standard output:

  if (!out_quiet && list_files == 0 && 1 < max_count
      && S_ISREG (out_stat.st_mode) && out_stat.st_ino
      && SAME_INODE (st, out_stat))   /* <------------------ */
    {
      if (! suppress_errors)
        error (0, 0, _("input file %s is also the output"), quote (filename));
      errseen = 1;
      goto closeout;
    }

out_stat is filled in by calling fstat on STDOUT_FILENO, and only when standard output is a regular file:

  if (fstat (STDOUT_FILENO, &tmp_stat) == 0 && S_ISREG (tmp_stat.st_mode))
    out_stat = tmp_stat;
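
The same check can be reproduced outside grep. What follows is a minimal sketch, not grep's actual code: `same_file` and `report_if_same` are hypothetical names, and comparing both `st_dev` and `st_ino` is an assumption about what the `SAME_INODE` macro expands to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* True when out_fd is a regular file and both descriptors refer to it.
   Assumption: "same file" means equal device and i-node numbers. */
static bool same_file(int in_fd, int out_fd)
{
    struct stat in_st, out_st;

    if (fstat(in_fd, &in_st) != 0 || fstat(out_fd, &out_st) != 0)
        return false;               /* cannot tell, so do not complain */
    if (!S_ISREG(out_st.st_mode))
        return false;               /* output is a pipe, terminal, ... */
    return in_st.st_ino == out_st.st_ino && in_st.st_dev == out_st.st_dev;
}

/* Hypothetical wrapper: print a grep-like diagnostic and return true
   when the input descriptor is also standard output. */
static bool report_if_same(const char *name, int in_fd)
{
    assert(name != NULL);
    if (!same_file(in_fd, STDOUT_FILENO))
        return false;
    fprintf(stderr, "input file '%s' is also the output\n", name);
    return true;
}
```

If a program using this sketch opens file.txt and is run as `prog > file.txt`, `report_if_same` should fire, because the shell has already opened file.txt as standard output before the program starts.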
answered Nov 05 '22 10:11 by falsetru


The source code shows that grep checks for this case, where standard output is the same file it is reading as input, and reports it. See the SAME_INODE check below, together with the comment explaining why it exists:

  /* If there is a regular file on stdout and the current file refers
     to the same i-node, we have to report the problem and skip it.
     Otherwise when matching lines from some other input reach the
     disk before we open this file, we can end up reading and matching
     those lines and appending them to the file from which we're reading.
     Then we'd have what appears to be an infinite loop that'd terminate
     only upon filling the output file system or reaching a quota.
     However, there is no risk of an infinite loop if grep is generating
     no output, i.e., with --silent, --quiet, -q.
     Similarly, with any of these:
       --max-count=N (-m) (for N >= 2)
       --files-with-matches (-l)
       --files-without-match (-L)
     there is no risk of trouble.
     For --max-count=1, grep stops after printing the first match,
     so there is no risk of malfunction.  But even --max-count=2, with
     input==output, while there is no risk of infloop, there is a race
     condition that could result in "alternate" output.  */
  if (!out_quiet && list_files == 0 && 1 < max_count
      && S_ISREG (out_stat.st_mode) && out_stat.st_ino
      && SAME_INODE (st, out_stat))
    {
      if (! suppress_errors)
        error (0, 0, _("input file %s is also the output"), quote (filename));
      errseen = true;
      goto closeout;
    }
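
The first line of that condition decides whether the i-node comparison is made at all. As a rough sketch, with a hypothetical `grep_opts` struct standing in for grep's option variables and assuming `max_count` holds a very large value when -m is not given:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for grep's option state; the field names
   mirror the variables used in the quoted condition. */
struct grep_opts {
    bool out_quiet;   /* set by -q, --quiet, --silent */
    int  list_files;  /* nonzero for -l or -L */
    long max_count;   /* -m N; assumed to be LONG_MAX when -m is absent */
};

/* True when the input == output check is performed for these options. */
static bool check_applies(const struct grep_opts *o)
{
    assert(o != NULL);
    return !o->out_quiet && o->list_files == 0 && 1 < o->max_count;
}
```

Under these assumptions, a plain `grep stuff file.txt` and `grep -m 2 stuff file.txt` are both checked, while -q, -l, -L and -m 1 skip the check.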
answered Nov 05 '22 08:11 by Nir Alfasi