
Regex to exclude non-word Characters but leave spaces

I am trying to write a regex to stop a user entering invalid characters into a postcode field.

From this link I managed to exclude all "non-word" characters like so:

Regex regex = new Regex(@"[\W_]+");
string cleanText = regex.Replace(messyText, "").ToUpper();

But this also removes the space characters, which I want to keep.

I am sure this is possible but I find regex very confusing!

Can someone help out with an explanation of the regex pattern used?

User1 asked Jun 13 '17 12:06


2 Answers

You may use character class subtraction:

[\W_-[\s]]+

It matches one or more characters that are either non-word characters or underscores, with the exception of any whitespace characters. The `-[\s]` part removes whitespace from the set defined by `\W_`.
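Applied to the code from the question, the subtraction pattern could be used roughly like this. It is only a minimal sketch: the class name `PostcodeCleaner`, the method name `Clean`, and the sample input are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PostcodeCleaner
{
    // [\W_-[\s]] = (non-word characters or underscore) minus whitespace.
    // Character class subtraction is a .NET regex feature.
    private static readonly Regex Invalid = new Regex(@"[\W_-[\s]]+");

    // Hypothetical helper: strips invalid characters, keeps spaces, upper-cases.
    public static string Clean(string messyText)
    {
        return Invalid.Replace(messyText, "").ToUpper();
    }

    public static void Main()
    {
        Console.WriteLine(Clean("sw1a 1aa!?")); // prints "SW1A 1AA"
    }
}
```

The space between the two halves of the postcode survives because `\s` has been subtracted from the class, while `!` and `?` are still removed.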

To exclude just horizontal whitespace characters use [\p{Zs}\t] in the subtraction part:

[\W_-[\p{Zs}\t]]+

To exclude just vertical whitespace characters (line break chars) use [\n\v\f\r\u0085\u2028\u2029] in the subtraction part:

[\W_-[\n\v\f\r\u0085\u2028\u2029]]+
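To see how the horizontal and vertical variants differ, here is a small sketch that runs both over the same input. The class and method names (`WhitespaceVariants`, `KeepHorizontal`, `KeepVertical`) and the sample string are assumptions chosen for the demonstration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceVariants
{
    // Removes non-word characters and underscores, but keeps spaces and tabs.
    public static string KeepHorizontal(string text)
    {
        return Regex.Replace(text, @"[\W_-[\p{Zs}\t]]+", "");
    }

    // Removes non-word characters and underscores, but keeps line break characters.
    public static string KeepVertical(string text)
    {
        return Regex.Replace(text, @"[\W_-[\n\v\f\r\u0085\u2028\u2029]]+", "");
    }

    public static void Main()
    {
        string input = "AB\t1!\r\nCD 2?";

        // Tab and space survive, the line break is removed.
        Console.WriteLine(KeepHorizontal(input) == "AB\t1CD 2");  // prints True

        // The line break survives, tab and space are removed.
        Console.WriteLine(KeepVertical(input) == "AB1\r\nCD2");   // prints True
    }
}
```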

Wiktor Stribiżew answered Oct 24 '22 09:10


You can invert your character class to make it a negated character class like this:

[^\sa-zA-Z0-9]+

This will match any character except whitespace or an ASCII alphanumeric character (a-z, A-Z, 0-9).

RegEx Demo (the pattern is not .NET-specific, so it also works in other regex flavors)
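A minimal sketch of the negated class in C#, using a made-up class name `NegatedClassCleaner` and made-up sample inputs. One difference worth knowing: `\W` is Unicode-aware in .NET by default, whereas `a-zA-Z0-9` covers ASCII only, so this pattern also strips accented letters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NegatedClassCleaner
{
    // Matches runs of anything that is NOT whitespace, an ASCII letter, or a digit.
    private static readonly Regex Invalid = new Regex(@"[^\sa-zA-Z0-9]+");

    // Hypothetical helper mirroring the code in the question.
    public static string Clean(string messyText)
    {
        return Invalid.Replace(messyText, "").ToUpper();
    }

    public static void Main()
    {
        Console.WriteLine(Clean("sw1a 1aa!_")); // prints "SW1A 1AA"

        // \u00E9 is "é": not an ASCII letter, so it is removed here,
        // while the \W-based patterns would keep it.
        Console.WriteLine(Invalid.Replace("caf\u00E9 1", "")); // prints "caf 1"
    }
}
```

For a postcode field that only accepts ASCII letters and digits, that stricter behavior is probably what you want anyway.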

anubhava answered Oct 24 '22 10:10