
Replace all non-alphanumeric characters in a string with an underscore

Tags: regex, bash, sed, awk

I want to replace special characters (regex \W) with an underscore (_), but leave whitespace untouched. Runs of consecutive special characters should collapse into a single underscore.

Examples:

String: The/Sun is red@
Output: The_Sun is red_

String: .//hack Moon
Output: _hack Moon

I have tried echo 'string' | sed 's/\W/_/g', but it is not accurate: it also replaces whitespace and does not collapse consecutive matches.

Solaris asked Jun 15 '18 13:06


3 Answers

sed approach:

s="The/Sun is red@ .//hack Moon"

sed -E 's/[^[:alnum:][:space:]]+/_/g' <<<"$s"
The_Sun is red_ _hack Moon

  • [^[:alnum:][:space:]]+ - matches one or more consecutive characters that are neither alphanumeric nor whitespace; with the g flag, every such run is replaced by a single _
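
To see why the bracket expression behaves differently from the \W attempt in the question, the two commands can be run side by side. This is a minimal sketch assuming GNU sed (\W is a GNU extension); sanitize_sed is a hypothetical helper name, not part of the answer.

```bash
#!/usr/bin/env bash

# Hypothetical helper: wraps the sed command from this answer.
sanitize_sed() {
    sed -E 's/[^[:alnum:][:space:]]+/_/g' <<<"$1"
}

s='The/Sun is red@ .//hack Moon'

# Attempt from the question: \W also matches spaces, and without a
# + quantifier every matching character gets its own underscore.
sed 's/\W/_/g' <<<"$s"

# Bracket expression with +: whitespace is kept, runs are collapsed.
sanitize_sed "$s"
# The_Sun is red_ _hack Moon
```

Note that an underscore already present in the input is itself neither alphanumeric nor whitespace, so it is merged into any adjacent run of special characters.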
RomanPerekhrest answered Oct 13 '22 13:10


Use tr for that:

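echo "The/Sun is red@" | tr -s -c '[:alnum:][:space:]' _

'[:alnum:][:space:]' represents alphanumeric characters and whitespace, and -c means the complement of that set. [:space:] is used rather than [:blank:] (space and tab only) so that the trailing newline added by echo is not turned into an underscore. The quotes keep the shell from treating the brackets as a glob pattern.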

Added: -s squeezes each run of consecutive underscores in the output into a single one.
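
The effect of the character class on the trailing newline can be checked with both example strings from the question. This is a sketch, assuming a POSIX-style tr; sanitize_tr is a hypothetical helper name. It uses [:space:] so that newlines pass through unchanged, whereas with [:blank:] the newline would itself be translated to an underscore.

```bash
#!/usr/bin/env bash

# Hypothetical helper: translate every character that is NOT alphanumeric
# or whitespace to "_" (-c), then squeeze runs of "_" into one (-s).
sanitize_tr() {
    printf '%s\n' "$1" | tr -s -c '[:alnum:][:space:]' '_'
}

sanitize_tr 'The/Sun is red@'
# The_Sun is red_
sanitize_tr './/hack Moon'
# _hack Moon
```

Because tr works on a stream, the same command can also filter a whole file, for example tr -s -c '[:alnum:][:space:]' '_' < input.txt, where input.txt is a placeholder name.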

hek2mgl answered Oct 13 '22 14:10


Just with bash parameter expansion, similar pattern to other answers:

shopt -s extglob
for str in "The/Sun is red@" ".//hack Moon"; do 
    echo "${str//+([^[:alnum:][:blank:]])/_}"
    # .........^^........................^  replace all
    # ...........^^.....................^    one or more
    # .............^^^^^^^^^^^^^^^^^^^^^      non-alnum, non-space character
done
The_Sun is red_
_hack Moon
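
The same expansion can also be used as a line-by-line filter, so it can stand in for the sed and tr pipelines without starting an external process. This is a sketch, assuming bash with extglob available; sanitize_stream is a hypothetical name.

```bash
#!/usr/bin/env bash
shopt -s extglob   # +( ... ) is only recognized with extglob enabled

# Hypothetical helper: reads lines from stdin, writes sanitized lines to stdout.
sanitize_stream() {
    local line
    # "|| [[ -n $line ]]" also processes a final line that lacks a newline.
    while IFS= read -r line || [[ -n $line ]]; do
        printf '%s\n' "${line//+([^[:alnum:][:blank:]])/_}"
    done
}

printf '%s\n' 'The/Sun is red@' './/hack Moon' | sanitize_stream
# The_Sun is red_
# _hack Moon
```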
glenn jackman answered Oct 13 '22 12:10