
Why does bash ignore newlines when doing a for loop over the contents of a C-style string?

Why does the following...

c=0; for i in $'1\n2\n3\n4'; do echo iteration $c :$i:; c=$[c+1]; done

print out...

iteration 0 :1 2 3 4:

and not

iteration 0 :1:
iteration 1 :2:
iteration 2 :3:
iteration 3 :4:

From what I understand, the $'STRING' syntax should allow me to specify a string with escape characters. Shouldn't "\n" be interpreted as a newline so that the for loop echoes four times, once for each line? Instead, it seems as if the newline is interpreted as a space character.

I took unwind's suggestion and tried setting $IFS. The results were the same.

IFS=$'\n'; c=0; for i in $'1\n2\n3\n4'; do echo iteration $c :$i:; c=$[c+1]; done; unset IFS;

iteration 0 :1 2 3 4:

William Purssel says in a comment that this did not work because IFS was being set to newline... but the following did not work either.

IFS=' '; c=0; for i in '1 2 3 4'; do echo iteration $c :$i:; c=$[c+1]; done; unset IFS;

iteration 0 :1 2 3 4:

Using IFS=' ' on a newline-separated string made an even bigger mess...

IFS=' '; c=0; for i in $'1\n2\n3\n4'; do echo iteration $c :$i:; c=$[c+1]; done; unset IFS;

iteration 0 :1
2
3
4:

Setting IFS to '\n' rather than $'\n' had the same effect as IFS=' '...

IFS='\n'; c=0; for i in $'1\n2\n3\n4'; do echo iteration $c :$i:; c=$[c+1]; done; unset IFS;

iteration 0 :1
2
3
4:

There's only one iteration, but the newline is visible in the echo for some reason.

What did work is first storing the string in a variable then looping over the contents of the variable (without having to set IFS):

c=0; v=$'1\n2\n3\n4'; for i in $v; do echo iteration $c :$i:; c=$[c+1]; done

iteration 0 :1:
iteration 1 :2:
iteration 2 :3:
iteration 3 :4:

That still does not explain why the problem occurs.

Is there a pattern here? Is this the expected behavior of IFS as defined in unwind's link?

unwind's link states... "The shell scans the results of parameter expansion, command substitution, and arithmetic expansion that did not occur within double quotes for word splitting."

I guess that explains why string literals are not split for the for loop no matter what escape characters they contain. Splitting happens only when the literal is first assigned to a variable and that variable is then expanded unquoted in the for loop. I guess the same holds for command substitution.

Examples:

The result of command substitution is split.

c=0; for i in `echo $'1\n2\n3\n4'`; do echo iteration $c :$i:; c=$[c+1]; done

iteration 0 :1:
iteration 1 :2:
iteration 2 :3:
iteration 3 :4:

The portion of the string that came from an expansion is split; the rest is not.

c=0; v=$'1 \n\t2\t3 4'; for i in $v$'\n5\n6'; do echo iteration $c :$i:; c=$[c+1]; done

iteration 0 :1:
iteration 1 :2:
iteration 2 :3:
iteration 3 :4 5 6:

When expansion happens in double quotes, no splitting occurs.

c=0; v=$'1\n2\n3 4'; for i in "$v"; do echo iteration $c :$i:; c=$[c+1]; done

iteration 0 :1 2 3 4:

With the default IFS, any sequence of SPACE, TAB, and NEWLINE is used as a delimiter for splitting.

c=0; v=$'1 2\t3 \t\n4'; for i in $v; do echo iteration $c :$i:; c=$[c+1]; done

iteration 0 :1:
iteration 1 :2:
iteration 2 :3:
iteration 3 :4:

I will accept unwind's answer as his link yields the answer to my question.

I still have no clue why the behavior of echo within the for loop changes with the value of IFS.

EDIT: extended to clarify.

EMPraptor asked Oct 30 '09 15:10


1 Answer

Bash doesn't do word splitting on quoted strings in this context. For example:

$ for i in "a b c d"; do echo $i; done
a b c d

$ for i in a b c d; do echo $i; done
a
b
c
d

$ var="a b c d"; for i in "$var"; do echo $i; done
a b c d

$ var="a b c d"; for i in $var; do echo $i; done
a
b
c
d
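
To make the number of words in each case explicit, here is a minimal sketch that counts the arguments instead of looping over them. The count_words function is a hypothetical helper introduced only for this illustration, and the sketch assumes the default IFS.

```shell
# count_words is a hypothetical helper: it prints how many arguments it received.
count_words() { echo "$#"; }

var="a b c d"

literal_quoted=$(count_words "a b c d")   # quoted literal: one word
literal_bare=$(count_words a b c d)       # four words typed out separately
expanded_quoted=$(count_words "$var")     # quoted expansion: not split
expanded_bare=$(count_words $var)         # unquoted expansion: split on $IFS

echo "$literal_quoted $literal_bare $expanded_quoted $expanded_bare"   # prints "1 4 1 4"
```

A for loop iterates once per word, so these counts match the number of iterations in the four loops above.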

In a comment, you stated "IFS='\n' also works. What doesn't work is IFS=$'\n'. I'm very very confused right now."

In IFS='\n', you're setting the separators (plural) to the two characters backslash and "n". If you insert an "X" in the middle of a "\n" in the string, you can see what happens: the unquoted $i in the echo command is split at the backslash and at the "n", while the real newlines produced by $'' are left alone:

$ IFS='\n'; for i in $'a\Xnb\nc\n'; do echo $i; done; unset IFS
a X b
c
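
A quick way to check what each form of quoting puts into a variable is to compare string lengths. This is only a sketch: the variable names and the count_words helper are made up for the illustration.

```shell
# count_words is a hypothetical helper that prints how many arguments it received.
count_words() { echo "$#"; }

literal='\n'      # single quotes: a backslash followed by the letter n
newline=$'\n'     # ANSI-C quoting: a single newline character
literal_len=${#literal}
newline_len=${#newline}
echo "lengths: $literal_len $newline_len"   # prints "lengths: 2 1"

word=banana
IFS='\n'                        # the delimiters are now backslash and "n"
pieces=$(count_words $word)     # unquoted, so split at every "n": ba, a, a
unset IFS                       # back to default splitting
echo "pieces: $pieces"          # prints "pieces: 3"
```

The second half shows the side effect of IFS='\n': any unquoted expansion containing the letter "n" gets broken apart.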

Edit 2 (in response to the comment):

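It sees '\n' as two characters (not a newline), so IFS holds a backslash and an "n". $'a\Xnb\nc\n' is a single quoted word of 8 characters: "\X" is not a recognized escape and stays as a backslash followed by "X", while each "\n" becomes a real newline. Because the word is quoted, the for loop sees it as one string rather than as words delimited by $IFS. The unquoted $i in the echo command is then split at the backslash and at the "n", and echo joins the pieces with spaces; the newlines are not in IFS, so they pass through unchanged.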

Try these for further comparison:

$ c=0; for i in "a\nb\nc\n"; do echo -e "iteration $c :$i:"; c=$[c+1]; done
iteration 0 :a
b
c
:

$ c=0; for i in "a\nb\nc\n"; do echo "iteration $c :$i:"; c=$[c+1]; done
iteration 0 :a\nb\nc\n:

$ c=0; for i in a\\nb\\nc\\n; do echo -e "iteration $c :$i:"; c=$[c+1]; done
iteration 0 :a
b
c
:

$ c=0; for i in a\\nb\\nc\\n; do echo "iteration $c :$i:"; c=$[c+1]; done
iteration 0 :a\nb\nc\n:

Setting IFS has no effect on the above.
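
The examples above keep $i inside double quotes in the echo argument, which is why IFS makes no difference there. In the question's loops $i is unquoted in the echo command, so it is split again on whatever IFS holds and echo rejoins the pieces with single spaces. A small sketch of that effect, with variable names chosen only for this illustration:

```shell
i=$'1\n2\n3'

IFS=$'\n'
unquoted_nl=$(echo :$i:)     # $i is split on newlines; echo rejoins with spaces
quoted_nl=$(echo ":$i:")     # quoted: the newlines reach echo untouched

IFS=' '
unquoted_sp=$(echo :$i:)     # nothing to split on a space, so the newlines survive
unset IFS

echo "$unquoted_nl"          # prints ":1 2 3:"
echo "$quoted_nl"            # prints ":1", "2", "3:" on three lines
```

This would account for the single-line output with IFS=$'\n' and the multi-line output with IFS=' ' or IFS='\n' in the question.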

This works (note that $var is unquoted in the for statement):

$ var=$'a\nb\nc\n'
$ saveIFS="$IFS"   # it's important to save and restore $IFS
$ IFS=$'\n'        # set $IFS to a newline using $'\n' (not '\n')
$ c=0; for i in $var; do echo -e "iteration $c :$i:"; c=$[c+1]; done
iteration 0 :a:
iteration 1 :b:
iteration 2 :c:
$ IFS="$saveIFS"
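
If you would rather not touch the shell-wide IFS at all, one possible alternative (not part of the approach above, just a sketch) is to read the string line by line. The names line and seen are hypothetical, and printf '%s' is used so that no extra trailing newline is added to the input.

```shell
var=$'a\nb\nc\n'
c=0
seen=()
# IFS= applies only to the read command, so the global IFS is never changed.
# Process substitution keeps the loop in the current shell, so c and seen persist.
while IFS= read -r line; do
    echo "iteration $c :$line:"
    seen+=("$line")
    c=$((c + 1))
done < <(printf '%s' "$var")
```

With this form there is nothing to save and restore, and each line is kept intact even if it contains spaces or tabs.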
Dennis Williamson answered Sep 28 '22 14:09