
Why does wc output different padding spaces depending on how stdin is connected?

Tags: bash, stdin, pipe, wc

See the following two commands with output:

$ wc < myfile.txt
 4  4 34
$ cat myfile.txt | wc
      4       4      34

My understanding is that these two both connect the stdin of the wc process with the content stream of myfile.txt. But why is the output padded in one case, and not in the other? How does wc tell the difference between the two? Is it not just reading from stdin?

cdjc asked Oct 11 '25 11:10


1 Answer

Short answer: with wc < myfile.txt, wc's stdin is the file itself, so wc can do more than just read from it. Specifically, it can look up the file's size, and it bases the output column width on that. With cat myfile.txt | wc, stdin is a pipe, which has no size to look up, so wc uses wide columns to make sure there's enough room.
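To see that a process really can tell the two cases apart, here is a minimal sketch. The function name describe_stdin is made up for illustration, and it assumes a system that exposes stdin as /dev/stdin (Linux and macOS do). It only inspects the type of whatever stdin is connected to; it never reads from it:

```shell
# Hypothetical helper: report what kind of object stdin is connected to.
# Assumes /dev/stdin exists and refers to file descriptor 0.
describe_stdin() {
    if [ -f /dev/stdin ]; then
        echo "regular file"      # e.g. describe_stdin < myfile.txt
    elif [ -p /dev/stdin ]; then
        echo "pipe"              # e.g. cat myfile.txt | describe_stdin
    else
        echo "other"             # terminal, device, socket, ...
    fi
}
```

Running describe_stdin < myfile.txt should report a regular file, while cat myfile.txt | describe_stdin should report a pipe. wc makes the same kind of distinction internally via fstat() rather than through /dev/stdin.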

Long answer: wc tries to provide nicely columnated output:

$ wc a.txt b.txt 
   6    6   88 a.txt
  60  236 1772 b.txt
  66  242 1860 total

To estimate how wide its columns need to be, the GNU version of wc runs stat() (or fstat()) on all of its input files before actually reading them, and uses their sizes to determine how large the line/word/character counts might get, and hence how many digits the columns need room for.

If even one of the inputs has no usable size (e.g. because it's not a plain file, but a pipe or something similar), wc "assumes the worst" and forces a minimum width of 7 digits. So any time any of the inputs is a pipe or anything like that, you're going to get columns at least 7 characters wide.
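The rule above can be approximated in a few lines of shell. This is a rough sketch, not GNU wc's actual code: the function name estimate_wc_width is invented here, and it assumes the width is the number of digits in the combined size of all regular-file inputs, raised to 7 if any input is not a regular file.

```shell
# Hypothetical helper: approximate the column width GNU wc would pick
# for the given inputs (assumption: digits of the summed regular-file
# sizes, with a floor of 7 when any input is not a regular file).
estimate_wc_width() {
    local total=0 min_width=1 f size width
    for f in "$@"; do
        [ -e "$f" ] || continue            # ignore inputs that cannot be examined
        if [ -f "$f" ]; then
            size=$(wc -c < "$f")           # byte size of a regular file
            total=$(( total + size ))
        else
            min_width=7                    # pipe, device, etc.: size unknown
        fi
    done
    width=${#total}                        # number of decimal digits in the sum
    if [ "$width" -lt "$min_width" ]; then
        width=$min_width
    fi
    echo "$width"
}
```

With an 88-byte a.txt and a 1772-byte b.txt, estimate_wc_width a.txt b.txt prints 4 (the digits in 1860), which matches the 4-character columns shown earlier. Adding a non-regular input such as /dev/null raises the result to 7.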

Some examples:

# direct input via stdin
$ wc a.txt - <b.txt
   6    6   88 a.txt
  60  236 1772 -
  66  242 1860 total

# indirect input via cat and a pipe on stdin
$ cat b.txt | wc a.txt -
      6       6      88 a.txt
     60     236    1772 -
     66     242    1860 total

# direct via file descriptor #4
$ wc a.txt /dev/fd/4 4<b.txt
   6    6   88 a.txt
  60  236 1772 /dev/fd/4
  66  242 1860 total

# indirect input via cat and a pipe on FD #63
$ wc a.txt <(cat b.txt)
      6       6      88 a.txt
     60     236    1772 /dev/fd/63
     66     242    1860 total

Gordon Davisson answered Oct 14 '25 21:10