Regex replace non-word except dash

Tags:

regex

My regex pattern (\W|_)[^-] doesn't work for h_e.l_l.o - w_o.r_d (the replacement string is " ").

It returns something like this:

h      w   

I hope to see at least something like this:

h e l l o - w o r d

How can I replace all non-word characters and _ excluding the - symbol?

user2648694 asked Mar 17 '15 09:03



1 Answer

To match any non-word char except dash (or hyphen) you may use

[^\w-]

However, this regular expression does not match _.
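A quick way to see that limitation, as a minimal sketch: the question does not name a language, so Python's re module and the variable names below are assumptions made for illustration only.

```python
import re

text = "h_e.l_l.o - w_o.r_d"

# [^\w-] skips word chars and hyphens; "_" counts as a word char,
# so the underscores survive the replacement.
kept_underscores = re.sub(r"[^\w-]", " ", text)
print(kept_underscores)  # h_e l_l o - w_o r_d
```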

You need a negated character class that matches any char other than letters, digits and hyphens:

/[^-a-zA-Z0-9]+/

or (with a case insensitive modifier):

/[^-a-z0-9]+/i


Note that the - is placed at the start of the character class, so it does not require escaping.

The plus at the end matches each run of unwanted characters as a single chunk, so the whole run is replaced in one go.
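Applied to the string from the question, the ASCII-only class could be used roughly like this. This is a sketch that assumes Python's re module; the names ascii_pattern, cleaned and collapsed are hypothetical.

```python
import re

text = "h_e.l_l.o - w_o.r_d"

# Anything that is not a hyphen, an ASCII letter, or a digit is unwanted;
# the trailing + turns each run of such chars into one match.
ascii_pattern = re.compile(r"[^-a-zA-Z0-9]+")

cleaned = ascii_pattern.sub(" ", text)
print(cleaned)  # h e l l o - w o r d

# With the +, a run of three unwanted chars becomes a single space.
collapsed = ascii_pattern.sub(" ", "a_._b")
print(collapsed)  # a b
```

Dropping the + would replace each unwanted character separately, so a_._b would come out with three spaces between a and b.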

If you want to make your pattern Unicode-aware (in some regex flavors, shorthand character classes such as \w also match their Unicode counterparts, sometimes only when a flag is set), you may use

/[^\w-]|_/

Use /(?:[^\w-]|_)+/ instead to grab the whole chunk of these chars at once.

Here, [^\w-] matches any char that is neither a word char (letter, digit, or underscore) nor a hyphen, and the second alternative, _, matches underscores.
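To illustrate the difference on non-ASCII input, here is a hedged sketch in Python 3, where \w in a str pattern is Unicode-aware by default (other flavors may need a flag). The Cyrillic sample string and the variable names are made up for the example.

```python
import re

# Hypothetical sample: the same shape as the question's input, in Cyrillic.
text = "п_р.и_в.е_т - м_и.р"

# Unicode-aware variant: non-word chars (except "-") or underscores, in runs.
unicode_pattern = re.compile(r"(?:[^\w-]|_)+")
unicode_cleaned = unicode_pattern.sub(" ", text)
print(unicode_cleaned)  # п р и в е т - м и р

# The ASCII-only class treats the Cyrillic letters as unwanted chars,
# so only the hyphen survives between two collapsed runs.
ascii_cleaned = re.sub(r"[^-a-zA-Z0-9]+", " ", text)
print(repr(ascii_cleaned))  # ' - '
```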

Wiktor Stribiżew answered Oct 02 '22 00:10
