 

How to remove all lowercase characters from a string using AWK?

Tags: regex, linux, awk

Please note that I need this answer in AWK.

How can I remove all lowercase characters from some awk variable? I tried calling gsub:

gsub(/[a-z]+/,"",varName);

Unfortunately, that removes the whole string, as if awk cannot tell the difference between lower and upper case. Is there some regex-fu I can use that I'm not aware of?

EDIT: Confirmed, awk does not see the difference between lowercase and uppercase characters.

Example 1 (will use letter f here for better understanding of results):

varName="CHRFProtocol";
gsub(/[a-z]/,"f",varName);

Result: ffffffffffff

Example 2 (again, will use letter f here for better understanding of results):

varName="CHRFProtocol";
gsub(/[A-Z]/,"f",varName);

Result: ffffffffffff

Is this legitimate? What's going on?

IDDQD asked Apr 12 '26 11:04


1 Answer

Your locale settings are getting in the way. Try this:

LC_ALL=C awk 'BEGIN { 
varName="CHRFProtocol";
gsub(/[a-z]/,"f",varName);
print(varName); }'

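GNU awk honors locale settings. In many national locales on Linux, a range such as [a-z] is interpreted by the locale's collation order rather than by byte value, and that order interleaves uppercase and lowercase letters, so [a-z] also matches most uppercase letters and [A-Z] most lowercase ones. The regular expression is not case-insensitive as such; only the range is affected. Resetting the locale to C (=POSIX) for the duration of the awk command makes ranges follow ASCII order again, which restores the expected case-sensitive behaviour.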
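To actually remove the lowercase characters rather than replace them with f, something along these lines may work. This is a minimal sketch: the function names strip_lower_c and strip_lower_class are made up for illustration, and the second variant assumes an awk that understands POSIX character classes such as [[:lower:]] (gawk does; some very old mawk builds may not).

```shell
# Hypothetical helper 1: force the C locale so that [a-z] means ASCII a..z only.
strip_lower_c() {
    LC_ALL=C awk -v s="$1" 'BEGIN {
        gsub(/[a-z]/, "", s)        # delete every ASCII lowercase letter
        print s
    }'
}

# Hypothetical helper 2: use a POSIX character class instead of a range.
# Assumption: the installed awk supports [[:lower:]].
strip_lower_class() {
    awk -v s="$1" 'BEGIN {
        gsub(/[[:lower:]]/, "", s)  # a class is not subject to range collation
        print s
    }'
}

strip_lower_c "CHRFProtocol"        # prints CHRFP
strip_lower_class "CHRFProtocol"    # prints CHRFP
```

The first variant keeps the original pattern and only changes the environment of the one command; the second leaves the locale alone, which may be preferable if the rest of the script depends on locale-aware behaviour.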

Mark Reed answered Apr 15 '26 02:04


