Why do these simple shell commands fail when used in sed's replacement part?

While trying to find an answer to this sed question, I came across a strange behavior that I couldn't understand.

Let's say I have a file called data:

$> cat data
foo.png
abCd.png
bar.png
baZ.png

The task is to use sed to edit the file in place, converting every line that contains uppercase ASCII characters to lowercase. So the output should be:

$> cat data
foo.png
abcd.png
bar.png
baz.png

The solution should also work with non-GNU sed, such as the sed on Mac.

I tried embedding awk in sed's replacement part:

sed -E 's/[^ ]*[A-Z][^ ]*.png/'$(echo \&|awk '{printf("<%s>[%s]",$0, tolower($0))}')'/' data

Strangely this outputs this:

foo.png
<abCd.png>[abCd.png]
bar.png
<baZ.png>[baZ.png]

As you can see, sed picks the right lines (those containing uppercase letters), and the matched text appears to reach awk as well, but awk's tolower() function seems to fail and produces the same text as its input.

Can a shell expert please explain this weird behavior?

anubhava asked May 21 '13 15:05


3 Answers

Your awk command runs before the sed command, not as a subprocess of it. The shell expands the command substitution first, so awk receives only a literal ampersand as its input and outputs

<&>[&]

This string is then embedded in the argument that sed receives. In a sed replacement, & stands for the matched text, which is why sed prints each match twice, unchanged.

The sequence of events is

  1. The shell sees this command line

    sed -E 's/[^ ]*[A-Z][^ ]*.png/'$(echo \&|awk '{printf("<%s>[%s]",$0, tolower($0))}')'/' data
    
  2. It processes the command substitution (in which awk turns & into <&>[&]), to produce the intermediate command line

    sed -E 's/[^ ]*[A-Z][^ ]*.png/'<&>[&]'/' data
    
  3. The shell then executes sed with the command s/[^ ]*[A-Z][^ ]*.png/<&>[&]/
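
The sequence above can be reproduced step by step. This is a minimal sketch that assumes a file named data with the contents from the question; the variable names replacement and script are introduced here only for illustration.

```shell
# Recreate the sample file from the question
printf '%s\n' foo.png abCd.png bar.png baZ.png > data

# Step 2: the command substitution runs on its own, before sed exists.
# awk sees only a literal "&", and tolower("&") is still "&".
replacement=$(echo \& | awk '{printf("<%s>[%s]", $0, tolower($0))}')
echo "$replacement"

# Step 3: the script sed actually receives
script="s/[^ ]*[A-Z][^ ]*.png/${replacement}/"
echo "$script"

# sed expands each & to the matched text, so the match appears twice
sed -E "$script" data
```

The first echo shows that awk has already finished its work, with no access to the text sed will later match.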

chepner answered Oct 11 '22 07:10


sed 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/'

Bruce Barnett answered Oct 11 '22 07:10


Perhaps tr is what you're really looking for?

tr A-Z a-z < file

The sed equivalent would be:

sed -e 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/'

Unfortunately, sed's y command does not accept character range notation (A-Z or [A-Z]), so every character has to be listed explicitly.
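
To meet the original requirements (change the file itself, and work with non-GNU sed), one option is to restrict y to lines that contain an uppercase letter and write the result through a temporary file. This is a sketch rather than a definitive solution: the name data.tmp is an assumption, and the temporary file is used because the syntax of the -i option differs between GNU and BSD sed.

```shell
# Recreate the sample file from the question
printf '%s\n' foo.png abCd.png bar.png baZ.png > data

# The /[A-Z]/ address limits y to lines containing an uppercase ASCII letter.
# Output goes to a temporary file (hypothetical name) that then replaces data.
sed '/[A-Z]/y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/' data > data.tmp &&
mv data.tmp data

cat data
```

Because y only maps uppercase letters, the address does not change the result here; it simply makes the intent explicit.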

twalberg answered Oct 11 '22 09:10