According to the description of "read -N" in the manual page:
-N nchars return only after reading exactly NCHARS characters, unless EOF is encountered or read times out, ignoring any delimiter
However, in the output of the following command:
$ echo 'a b' | while read -N1 c; do echo ">>>$c<<<"; done
>>>a<<<
>>><<<
>>>b<<<
>>><<<
both the space and the newline have been translated into an empty string, while in the command:
$ echo 'a b' | while IFS= read -N1 c; do echo ">>>$c<<<"; done
>>>a<<<
>>> <<<
>>>b<<<
>>>
<<<
space and newline have been stored correctly in the variable.
So it seems that delimiters still undergo some processing in the "read" or "while" command that I do not understand.
We can compare these results with those from "read -n", which the manual describes as:
-n nchars return after reading NCHARS characters rather than waiting for a newline, but honor a delimiter if fewer than NCHARS characters are read before the delimiter
$ echo 'a b' | while read -n1 c; do echo ">>>$c<<<"; done
>>>a<<<
>>><<<
>>>b<<<
>>><<<
$ echo 'a b' | while IFS= read -n1 c; do echo ">>>$c<<<"; done
>>>a<<<
>>> <<<
>>>b<<<
>>><<<
while IFS= read -r line; do printf '%s\n' "$line"; done < input_file
How does it work? input_file is the name of the file redirected into the while loop. The read command processes the file line by line, assigning each line to the variable line. Once all lines have been processed, the while loop terminates.
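A minimal sketch of that loop, assuming Bash. The temporary file and its three sample lines are made up for illustration and stand in for input_file; the lines array is only there so the result can be inspected afterwards.

```bash
#!/usr/bin/env bash
# Hypothetical sample input: a temporary file stands in for input_file.
input_file=$(mktemp)
printf '%s\n' '  indented line' 'back\slash' 'last line' > "$input_file"

lines=()
while IFS= read -r line; do
    lines+=("$line")      # IFS= keeps leading spaces, -r keeps backslashes
done < "$input_file"

# Print each stored line between markers so whitespace is visible.
printf '>>>%s<<<\n' "${lines[@]}"
rm -f "$input_file"
```

With this input, the leading spaces of the first line and the backslash in the second should both survive, which is the point of combining IFS= with -r.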
For many command line interpreters (“shell”) of Unix operating systems, the input field separators variable (abbreviated IFS, and often referred to as internal field separators) refers to a variable which defines the character or characters used to separate a pattern into tokens for some operations.
In this while loop, IFS sets the field separator (the default is white space). The -r option to the read command disables backslash escape processing, so sequences such as \n and \t are kept literally. This is a fail-safe while read loop for reading text files.
IFS is the "Internal Field Separator" variable, not a line separator. Writing IFS= in front of read empties the field separator for that read command only; afterwards IFS still holds its previous value (normally the default).
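The scope of the IFS= prefix can be checked with a short sketch, assuming Bash and an unmodified IFS at the start. The variable names padded, trimmed, saved_ifs and ifs_state are illustrative only.

```bash
#!/usr/bin/env bash
saved_ifs=$IFS                          # remember the current IFS

IFS= read -r padded <<< '  padded  '    # empty IFS for this one command only
read -r trimmed <<< '  padded  '        # the previous IFS applies again

if [[ $IFS == "$saved_ifs" ]]; then
    ifs_state=unchanged
else
    ifs_state=changed
fi

printf '>>>%s<<<\n' "$padded" "$trimmed" "$ifs_state"
```

The first read should keep the surrounding spaces, the second should strip them, and IFS itself should compare equal to the saved value, because the assignment was applied only to the environment of the first read.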
This is POSIX behaviour. When assigning to a variable, IFS characters should be stripped: the results shall be split into fields as in the shell for the results of parameter expansion (of course, -n and -N are not POSIX).
This is borne out by the comments in the read source code:
/* This code implements the Posix.2 spec for splitting the words
read and assigning them to variables. */
orig_input_string = input_string;
/* Remove IFS white space at the beginning of the input string. If
$IFS is null, no field splitting is performed. */
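The effect described by those comments can be reproduced with a single character, as in this sketch (assuming Bash; it uses -n rather than -N, and the variable names are illustrative). IFS is set explicitly to its default so that the comparison does not depend on the calling environment.

```bash
#!/usr/bin/env bash
IFS=$' \t\n'                            # make the default IFS explicit

# Default IFS: the single space is read, then removed as IFS white space.
read -r -n1 with_default <<< ' x'

# Null IFS: no field splitting is performed, so the space is assigned.
IFS= read -r -n1 with_null <<< ' x'

printf '>>>%s<<<\n' "$with_default" "$with_null"
```

In both calls read consumes the same one character; only the assignment step differs, which is why the first variable ends up empty and the second holds the space.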
In my opinion, with the -N option read behaves differently when reading a character and when assigning it. While reading, a delimiter is treated the same as a non-delimiter, and read counts it. But when read assigns the character, it checks whether the input is a delimiter, and if it is, it assigns a null string to the corresponding variable.
So IFS= changes how white space is assigned to a variable and causes a space to be assigned to c rather than a null string.
Using hexdump allows us to see exactly which characters make up the output, so it may be helpful to slightly change your queries:
(1) With normal IFS and using -N option
$ (echo 'a b' | while read -N1 c; do c="$c<"; echo -n "$c"; done | hexdump -C)
00000000 61 3c 3c 62 3c 3c |a<<b<<|
00000006
In this first case, the read builtin returns the empty string for both 0x0a and the space character, because both characters are in the default IFS and characters in IFS are dropped from the result, for the reason explained in cdarke's answer.
(2) With empty IFS and -N option
$ (IFS=""; echo 'a b' | while read -N1 c; do c="$c<"; echo -n "$c"; done | hexdump -C)
00000000 61 3c 20 3c 62 3c 0a 3c |a< <b<.<|
00000008
In this case, the read builtin matches each of the four characters that the echo command outputs, and both 0x0a and the space appear in the output, because with an empty IFS the characters read can be assigned to the variable c.
(3) With normal IFS and -n option
$ (echo 'a b' | while read -n1 c; do c="$c<"; echo -n "$c"; done | hexdump -C)
00000000 61 3c 3c 62 3c 3c |a<<b<<|
00000006
This gives exactly the same output as case (1), although the semantics are a bit different: the read builtin returns the empty string for both 0x0a and the space character, because (i) both of these characters are in the default IFS and (ii) the -n option to the read builtin does not in any case pass on the trailing 0x0a character.
(4) With empty IFS and -n option
$ (IFS=""; echo 'a b' | while read -n1 c; do c="$c<"; echo -n "$c"; done | hexdump -C)
00000000 61 3c 20 3c 62 3c 3c |a< <b<<|
00000007
Here we observe a difference between the -n and -N options to read: with the -n option, the newline is treated specially by the read builtin and dropped, so excluding 0x0a from IFS does not give it a chance to be passed to the variable c.
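As an alternative to hexdump, cases (2) and (4) can be compared by collecting the values in arrays and printing them with printf %q. This is a sketch assuming Bash; raw_n and raw_N are illustrative names. Only the empty-IFS cases are compared here, because the current Bash manual states that the result of -N is not split on IFS, so the default-IFS -N case (1) may differ between Bash versions.

```bash
#!/usr/bin/env bash
raw_n=()
raw_N=()

# Case (4): empty IFS with -n1; the newline still acts as the delimiter.
while IFS= read -r -n1 c; do raw_n+=("$c"); done <<< 'a b'

# Case (2): empty IFS with -N1; the newline is read like any other character.
while IFS= read -r -N1 c; do raw_N+=("$c"); done <<< 'a b'

# %q prints each value in a shell-quoted form so whitespace is visible.
printf -- '-n1:'; printf ' %q' "${raw_n[@]}"; echo
printf -- '-N1:'; printf ' %q' "${raw_N[@]}"; echo
```

Both loops should run four times; the arrays should differ only in the last element, which is empty for -n1 and a newline for -N1.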
read cannot decide whether a character is a delimiter (and so should be ignored) until it has already read the character, and read must assign some value to c, even if that value is the empty string. When a delimiter is read and subsequently discarded, c must still be set to something, so it is assigned the empty string.
This is consistent with read used without the -n/-N options; delimiters are discarded only after they are read, and only if they are not needed to set the value of the provided parameter(s). The simplest case is when you don't provide any arguments to read:
$ read <<< " a b c "
$ echo ">>>$REPLY<<<"
>>> a b c <<<
With a single explicit argument, leading and trailing delimiters are stripped:
$ read line <<< " a b c "
$ echo ">>>$line<<<"
>>>a b c<<<
With two arguments, the first delimiter is ignored once it has been read. The second is retained, because the string only needs to be split into two words to fill the provided parameters.
$ read field1 field2 <<< " a b c "
$ echo ">>>$field1<<<"
>>>a<<<
$ echo ">>>$field2<<<"
>>>b c<<<
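The same rule, that the last variable receives whatever remains, also holds for a non-whitespace separator. The following sketch assumes Bash; the colon-separated record and the names user and rest are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical colon-separated record, in the style of /etc/passwd.
record='alice:x:1000:1000'

# Two variables: the first colon splits the input; the later colons are
# retained because only two words are needed to fill the parameters.
IFS=: read -r user rest <<< "$record"

printf '>>>%s<<<\n' "$user" "$rest"
```

Here user should hold the first field and rest everything after the first colon, with the remaining colons intact.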