How to find out if a string contains non-alphanumeric characters in C#/.NET 2.0?

Tags: string, c#, regex

Allowed characters are (at least) A-Z, a-z, 0-9, ö, Ö, ä, Ä, å, Å, plus any German, Latvian or Estonian special characters. Is there a ready-made method, or do I have to build a blacklist of non-allowed characters and test it with a regular expression's IsMatch? If there is no ready-made method, how would I use a blacklist?

char m asked Jun 17 '10 12:06


3 Answers

I don't know how special characters from all those languages are categorised, but you could check if the Char.IsLetterOrDigit method matches what you want to do. It works at least for the digits and letters I tested:

// All() is a LINQ extension method: add "using System.Linq;" at the top of the file.
string test = "Aasdf345ÅÄÖåäöéÉóÓüÜïÏôÔ";
if (test.All(Char.IsLetterOrDigit)) { ... }

Char.IsLetterOrDigit returns true for characters whose Unicode category is UppercaseLetter, LowercaseLetter, TitlecaseLetter, ModifierLetter, OtherLetter, or DecimalDigitNumber.
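
LINQ ships with .NET 3.5 and later, so on .NET 2.0, which the question targets, the same check probably has to be written as a plain loop. A minimal sketch follows; the names AlnumCheck and IsAllLettersOrDigits are made up for illustration, and returning true for an empty string is an assumption that simply mirrors what All does.

```csharp
using System;

static class AlnumCheck
{
    // Hypothetical helper: true if every char is a Unicode letter or decimal digit.
    // Like Enumerable.All, it returns true for an empty string.
    public static bool IsAllLettersOrDigits(string s)
    {
        if (s == null) throw new ArgumentNullException("s");
        foreach (char c in s)
        {
            if (!Char.IsLetterOrDigit(c)) return false;
        }
        return true;
    }

    static void Main()
    {
        Console.WriteLine(IsAllLettersOrDigits("Aasdf345ÅÄÖåäö")); // True
        Console.WriteLine(IsAllLettersOrDigits("foo bar!"));       // False
    }
}
```

Because the loop looks at one UTF-16 code unit at a time, letters outside the BMP (stored as surrogate pairs) are rejected. That should not matter for the European characters in the question.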

Guffa answered Oct 15 '22 02:10


Investigate char.IsLetterOrDigit(char).

For example:

myString.All(c => char.IsLetterOrDigit(c));

Flynn1179 answered Oct 15 '22 03:10


A blacklist for characters is likely pretty large :-)

You can use the regular expression

^[\d\p{L}]+$

to match decimal digits and letters, regardless of script.

This regular expression consists of a character class containing two shorthands: \d, which matches every decimal digit (230 in total in the BMP), and \p{L}, which matches every Unicode character classified as a "letter" (46817 in the BMP). The character class is repeated at least once and placed between ^ and $, the string start and end anchors, so the pattern has to match the complete string.
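
As a rough sketch of how that pattern could be used from C#, something like the following should work on .NET 2.0. The class and method names (RegexCheck, IsAlphanumeric) are assumptions made for the example and do not come from any library.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexCheck
{
    // Constructed once and reused. In .NET, \d is any Unicode decimal digit
    // and \p{L} is any Unicode letter.
    static readonly Regex LettersAndDigits = new Regex(@"^[\d\p{L}]+$");

    // Hypothetical helper: true if s consists only of letters and digits.
    public static bool IsAlphanumeric(string s)
    {
        return LettersAndDigits.IsMatch(s);
    }

    static void Main()
    {
        Console.WriteLine(IsAlphanumeric("Äpfel123")); // True
        Console.WriteLine(IsAlphanumeric("a-b"));      // False (hyphen)
        Console.WriteLine(IsAlphanumeric(""));         // False (+ needs one char)
    }
}
```

One caveat: in .NET, $ also matches just before a final newline, so "abc\n" would pass. If that matters, \z in place of $ anchors strictly at the end of the string.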

For some regex engines, since you're apparently only interested in Latin letters, you could also use

^[\d\p{Latin}]+$

However, .NET doesn't support this. The first regex above actually matches everything that's a digit or a letter in any script, so it will dutifully match Indian or Arabic numerals as well as Hebrew, Cyrillic and other non-Latin scripts. Depending on what you want, this may not be appropriate.
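
To make that concrete, here is a small sketch showing the first pattern accepting non-Latin input. It also shows one possible way to at least restrict digits to ASCII, by writing 0-9 instead of \d. ScriptDemo and its method names are invented for the example.

```csharp
using System;
using System.Text.RegularExpressions;

static class ScriptDemo
{
    // \d: any Unicode decimal digit; \p{L}: any Unicode letter.
    public static bool MatchesAnyScript(string s)
    {
        return Regex.IsMatch(s, @"^[\d\p{L}]+$");
    }

    // Variant (an assumption, not from the answer): ASCII digits only.
    public static bool MatchesAsciiDigits(string s)
    {
        return Regex.IsMatch(s, @"^[0-9\p{L}]+$");
    }

    static void Main()
    {
        string arabicIndic345 = "\u0663\u0664\u0665";  // Arabic-Indic digits 3, 4, 5
        string hebrew = "\u05E9\u05DC\u05D5\u05DD";    // Hebrew letters

        Console.WriteLine(MatchesAnyScript("Привет"));         // True (Cyrillic)
        Console.WriteLine(MatchesAnyScript(hebrew));           // True
        Console.WriteLine(MatchesAnyScript(arabicIndic345));   // True
        Console.WriteLine(MatchesAsciiDigits(arabicIndic345)); // False
    }
}
```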

If that poses a problem, then I see no better option than to explicitly list the characters you want to allow. However, I consider it dangerous to assume that text in a certain language is always restricted to that language's script. If I were to write a Czech or Polish name in a German text, then I'd likely need more than just [a-zA-ZäöüÄÖÜß].
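
A minimal sketch of such an explicit list is below. The allowed set is an assumption pieced together from the characters named in the question and the German class above; it would need extending for Latvian and Estonian letters. WhitelistCheck and IsAllowed are hypothetical names.

```csharp
using System;
using System.Text.RegularExpressions;

static class WhitelistCheck
{
    // Assumed allow-list: ASCII letters and digits plus a few Nordic and
    // German characters. Extend the class for other languages as needed.
    static readonly Regex Allowed = new Regex("^[a-zA-Z0-9äöüÄÖÜßåÅ]+$");

    // Hypothetical helper: true if every character is on the allow-list.
    public static bool IsAllowed(string s)
    {
        return Allowed.IsMatch(s);
    }

    static void Main()
    {
        Console.WriteLine(IsAllowed("Müller")); // True
        Console.WriteLine(IsAllowed("Dvořák")); // False: ř and á are not listed
    }
}
```

The second call illustrates the risk described above: a Czech name in otherwise German text is rejected unless its letters are added to the list.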

Joey answered Oct 15 '22 02:10