What does IFS= do in this bash loop: `cat file | while IFS= read -r line; do ... done`

Tags: bash, ifs

I'm learning bash and I saw this construction:

    cat file | while IFS= read -r line; do
        ...
    done

Can anyone explain what IFS= does? I know it's input field separator, but why is it being set to nothing?

bodacydo asked Oct 21 '14 05:10


1 Answer

IFS does many things but you are asking about that particular loop.

The effect in that loop is to preserve leading and trailing white space in line. To illustrate, first observe with IFS set to nothing:

    $ echo " this   is a test " | while IFS= read -r line; do echo "=$line=" ; done
    = this   is a test =

The line variable contains all the white space it received on its stdin. Now, consider the same statement with the default IFS:

    $ echo " this   is a test " | while read -r line; do echo "=$line=" ; done
    =this   is a test=

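In this version, the white space internal to the line is still preserved, but the leading and trailing white space has been removed. With the default IFS (space, tab, newline), read strips those characters from both ends of the line before assigning it to the variable.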
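The two loops above can also be compared side by side by collecting the results in arrays. This is only a sketch: the sample text and the array names `stripped` and `preserved` are made up for illustration, and a here-string is used instead of a pipe so that the arrays are still visible after the loops finish.

```shell
#!/usr/bin/env bash
# Hypothetical sample input: one line padded with spaces, one starting with a tab.
input=$'  padded line  \n\tindented with a tab'

stripped=()
preserved=()

# Default IFS: read trims leading and trailing spaces and tabs.
while read -r line; do
    stripped+=("$line")
done <<< "$input"

# Empty IFS, set for the read command only: nothing is trimmed.
while IFS= read -r line; do
    preserved+=("$line")
done <<< "$input"

printf 'stripped:  [%s]\n' "${stripped[@]}"
printf 'preserved: [%s]\n' "${preserved[@]}"

# The assignment was scoped to each read call, so the shell's own IFS is unchanged.
printf 'IFS after the loops: %q\n' "$IFS"
```

Because `IFS=` is written as a prefix to `read`, it applies only to that one command; the rest of the script keeps the default value.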

What does -r do in read -r?

The -r option prevents read from treating backslash as a special character.

To illustrate, we use two echo commands that supply two lines to the while loop. Observe what happens with -r:

    $ { echo 'this \\ line is \' ; echo 'continued'; } | while IFS= read -r line; do echo "=$line=" ; done
    =this \\ line is \=
    =continued=

Now, observe what happens without -r:

    $ { echo 'this \\ line is \' ; echo 'continued'; } | while IFS= read line; do echo "=$line=" ; done
    =this \ line is continued=

Without -r, two changes happened. First, the double-backslash was converted to a single backslash. Second, the backslash on the end of the first line was interpreted as a line-continuation character and the two lines were merged into one.

In sum, if you want backslashes in the input to have special meaning, don't use -r. If you want backslashes in the input to be taken as plain characters, then use -r.
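A common place where this matters is input that contains literal backslashes, such as Windows-style paths. The following is a small sketch with a made-up path and made-up variable names (`with_r`, `without_r`), reading the same line both ways:

```shell
#!/usr/bin/env bash
# Hypothetical input line containing literal backslashes.
path_line='C:\new\table'

# With -r, backslashes are kept as ordinary characters.
IFS= read -r with_r <<< "$path_line"

# Without -r, each backslash escapes the next character and is then removed.
IFS= read without_r <<< "$path_line"

printf 'with -r:    %s\n' "$with_r"
printf 'without -r: %s\n' "$without_r"
```

Unless the input is known to use backslash continuation on purpose, `-r` is the safer default.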

Multiple lines of input

Since read takes input one line at a time, IFS affects each line of multi-line input in the same way that it affects single-line input. -r behaves similarly, with the exception that, without -r, multiple lines can be combined into one line using a trailing backslash, as shown above.

The behavior with multiple line input, however, can be changed drastically using read's -d flag. -d changes the delimiter character that read uses to mark the end of an input line. For example, we can terminate lines with a tab character:

    $ echo $'line one \n line\t two \n line three\t ends here'
    line one 
     line    two 
     line three      ends here
    $ echo $'line one \n line\t two \n line three\t ends here' | while IFS= read -r -d$'\t' line; do echo "=$line=" ; done
    =line one 
     line=
    = two 
     line three=

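Here, the $'...' construct was used to enter special characters such as newline (\n) and tab (\t). Observe that with -d$'\t', read divides its input into "lines" based on tab characters. Anything after the final tab is not processed by the loop, because read returns a nonzero status when it reaches the end of the input without finding the delimiter.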
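If the text after the last delimiter matters, one widely used idiom is to keep looping while the variable is non-empty even though read reported failure. The sketch below uses made-up input and hypothetical array names (`plain`, `with_tail`) to show the difference:

```shell
#!/usr/bin/env bash
plain=()
with_tail=()

# Plain loop: the chunk after the final tab is never handed to the loop body.
while IFS= read -r -d $'\t' chunk; do
    plain+=("$chunk")
done < <(printf 'a\tb\tc')

# read still stores the unterminated remainder in chunk before returning
# nonzero, so the extra -n test lets the body run one more time for it.
while IFS= read -r -d $'\t' chunk || [[ -n $chunk ]]; do
    with_tail+=("$chunk")
done < <(printf 'a\tb\tc')

printf 'plain loop:      %d chunks\n' "${#plain[@]}"
printf 'with -n test:    %d chunks\n' "${#with_tail[@]}"
```

The same `|| [[ -n $line ]]` test is often added to ordinary line-by-line loops to cope with files whose last line has no trailing newline.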

How to handle the most difficult file names

The most important use of the features described above is to process difficult file names. Since the one character that cannot appear in path/filenames is the null character, the null character can be used to separate a list of file names. As an example:

    while IFS= read -r -d $'\0' file; do
        # do something to each file
    done < <(find ~/music -type f -print0)
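To see the null-delimited loop cope with awkward names, here is a self-contained sketch that builds a throwaway directory instead of using ~/music. The file names and the `found` array are invented for the demonstration, and `-d ''` is used in place of `-d $'\0'`; in bash both make read stop at the null character.

```shell
#!/usr/bin/env bash
# Scratch directory with deliberately awkward (hypothetical) file names.
tmpdir=$(mktemp -d)
touch "$tmpdir/plain.txt" \
      "$tmpdir/  leading spaces.txt" \
      "$tmpdir/"$'new\nline.txt' \
      "$tmpdir/back\\slash.txt"

found=()
# IFS= keeps leading/trailing blanks, -r keeps backslashes,
# -d '' splits on the null bytes emitted by find -print0.
while IFS= read -r -d '' file; do
    found+=("$file")
done < <(find "$tmpdir" -type f -print0)

printf 'found %d files\n' "${#found[@]}"
for f in "${found[@]}"; do
    printf '  %q\n' "${f##*/}"   # %q shows spaces, newlines and backslashes unambiguously
done

rm -rf -- "$tmpdir"
```

Each name arrives intact as a single array element, which would not be the case with a newline-delimited `find` piped into a plain `read` loop.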
John1024 answered Sep 22 '22 03:09