I'm learning bash and I saw this construction:
cat file | while IFS= read -r line; do ... done
Can anyone explain what IFS= does? I know it's the input field separator, but why is it being set to nothing?
In this while loop, IFS sets the field separator that read uses (the default is whitespace). The -r option to read disables backslash processing, so sequences such as \n and \t in the input are kept exactly as written. Together they give a robust while-read loop for reading text files line by line.
The IFS variable is used in shells (Bourne, POSIX, ksh, bash) as the input field separator (or internal field separator). Essentially, it is a string of special characters which are to be treated as delimiters between words/fields when splitting a line of input. The default value of IFS is space, tab, newline.
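As a rough illustration of how IFS controls field splitting in read, here is a minimal sketch. The sample record and the variable names are made up for the example and are not from the original post.

```shell
#!/usr/bin/env bash
# Hypothetical colon-separated record, invented for illustration.
record='alice:x:1000:/home/alice'

# With IFS set to a colon for this one read, the record is split into fields.
IFS=: read -r user pass uid home <<< "$record"
echo "user=$user uid=$uid home=$home"

# With the default IFS (space, tab, newline) the record contains no
# delimiter, so everything lands in the first variable.
read -r first rest <<< "$record"
echo "first=$first rest=$rest"
```

The first echo should show the three named fields, and the second should show the whole record in first with rest left empty.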
For many command line interpreters (“shell”) of Unix operating systems, the input field separators variable (abbreviated IFS, and often referred to as internal field separators) refers to a variable which defines the character or characters used to separate a pattern into tokens for some operations.
$(...) is command substitution: it runs the enclosed commands in a subshell and expands to the standard output they produce. This is similar to another command/expression pair in bash: ((...)) is an arithmetic statement, while $((...)) is an arithmetic expression.
IFS does many things, but you are asking about that particular loop.
The effect in that loop is to preserve leading and trailing white space in line. To illustrate, first observe with IFS set to nothing:
$ echo " this is a test " | while IFS= read -r line; do echo "=$line=" ; done
= this is a test =
The line variable contains all the white space it received on its stdin. Now, consider the same statement with the default IFS:
$ echo " this is a test " | while read -r line; do echo "=$line=" ; done
=this is a test=
In this version, the white space inside the line is still preserved, but the leading and trailing white space has been removed.
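One detail worth checking for yourself: written as a prefix, IFS= applies only to that single read command and does not change IFS for the rest of the script. The sketch below assumes bash (it uses here-strings), and the sample string is invented for the example.

```shell
#!/usr/bin/env bash
before=$IFS                          # remember the current value of IFS

# IFS is empty only for this one read, so the padding survives.
IFS= read -r line <<< "   padded   "
echo "=$line="

# A plain read uses the default IFS and trims the padding.
read -r trimmed <<< "   padded   "
echo "=$trimmed="

# The shell's own IFS was never modified by the prefix assignment.
if [ "$IFS" = "$before" ]; then
    echo "IFS is unchanged after the loop-style read"
fi
```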
What does -r do in read -r?
The -r option prevents read from treating backslash as a special character.
To illustrate, we use two echo commands that supply two lines to the while loop. Observe what happens with -r:
$ { echo 'this \\ line is \' ; echo 'continued'; } | while IFS= read -r line; do echo "=$line=" ; done
=this \\ line is \=
=continued=
Now, observe what happens without -r:
$ { echo 'this \\ line is \' ; echo 'continued'; } | while IFS= read line; do echo "=$line=" ; done
=this \ line is continued=
Without -r, two changes happened. First, the double backslash was converted to a single backslash. Second, the backslash at the end of the first line was interpreted as a line-continuation character and the two lines were merged into one.
In sum, if you want backslashes in the input to have special meaning, don't use -r. If you want backslashes in the input to be taken as plain characters, then use -r.
Since read takes input one line at a time, IFS affects each line of multi-line input in the same way that it affects single-line input. -r behaves similarly, with the exception that, without -r, multiple lines can be combined into one line using a trailing backslash, as shown above.
The behavior with multi-line input, however, can be changed drastically using read's -d flag. -d changes the delimiter character that read uses to mark the end of an input line. For example, we can terminate lines with a tab character:
$ echo $'line one \n line\t two \n line three\t ends here'
line one 
 line    two 
 line three      ends here
$ echo $'line one \n line\t two \n line three\t ends here' | while IFS= read -r -d$'\t' line; do echo "=$line=" ; done
=line one 
 line=
= two 
 line three=
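Here, the $'...' construct was used to enter special characters like newline, \n, and tab, \t. Observe that with -d$'\t', read divides its input into "lines" based on tab characters. Anything after the final tab is not processed by the loop, because read returns a non-zero status when it reaches the end of input before finding the delimiter.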
The most important use of the features described above is to process difficult file names. Since the one character that cannot appear in path/filenames is the null character, the null character can be used to separate a list of file names. As an example:
while IFS= read -r -d $'\0' file; do
    # do something to each file; printing the name is a placeholder
    printf '%s\n' "$file"
done < <(find ~/music -type f -print0)
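To see why the null delimiter matters, here is a self-contained sketch that creates a few awkward file names in a temporary directory and reads them back. The file names and the files array are made up for the example, and -d '' is used as an equivalent spelling of -d $'\0' in bash.

```shell
#!/usr/bin/env bash
# Build a scratch directory with deliberately awkward file names.
tmpdir=$(mktemp -d)
touch "$tmpdir/plain.txt" \
      "$tmpdir/  leading spaces.txt" \
      "$tmpdir/back\\slash.txt" \
      "$tmpdir/"$'new\nline.txt'

# Collect every name intact: IFS= keeps the spaces, -r keeps the
# backslash, and the null delimiter keeps the embedded newline.
files=()
while IFS= read -r -d '' file; do
    files+=("$file")
done < <(find "$tmpdir" -type f -print0)

printf 'found %d files\n' "${#files[@]}"
rm -rf "$tmpdir"
```

Feeding the loop with process substitution rather than a pipe keeps the loop in the current shell, so the files array is still populated after the loop ends.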