
Regex for Hebrew, English, Symbols

Tags: c#, regex, detect

As part of a small program I'm writing, I need to filter a string input that might be "gibberish" (any character in UTF-8). Valid input can be Hebrew and/or English, and can also contain all the usual symbols such as ?%$!@'_' and so on.

A friend suggested using regex, but since I'm inexperienced with it, I've come to you for advice.

How can I create a C# function that checks an input text and returns false if it's not "right"?

My attempt so far is:

    public static bool shortTest(string input)
    {
        string pattern = @"^[אבגדהוזחטיכלמנסעפצקרשתץףןםa-zA-Z0-9\_]+$";
        Regex regex = new Regex(pattern);
        return regex.IsMatch(input);
    }

All the characters between "[" and "a" are Hebrew.

ian asked May 05 '13 22:05



2 Answers

For Hebrew letters, in C# you can do something like this:

return System.Text.RegularExpressions.Regex.IsMatch(value, @"^[א-ת]+$");
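A minimal sketch of how that range behaves, assuming a hypothetical helper class `HebrewRangeDemo` (the class and method names are not from the answer). The range א-ת spans U+05D0 to U+05EA, so it covers the final forms too, including ך, which the hand-written list in the question appears to leave out. It accepts nothing else, so spaces, English letters, digits and punctuation all make the match fail:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class HebrewRangeDemo
{
    // א-ת is the contiguous range U+05D0..U+05EA (27 letters, final forms included).
    private static readonly Regex HebrewOnly = new Regex(@"^[א-ת]+$");

    public static bool IsHebrewLetters(string value)
    {
        // Treat null as invalid instead of letting Regex throw.
        return value != null && HebrewOnly.IsMatch(value);
    }

    public static void Main()
    {
        Console.WriteLine(IsHebrewLetters("שלום"));   // True: Hebrew letters only
        Console.WriteLine(IsHebrewLetters("ך"));      // True: final kaf is inside the range
        Console.WriteLine(IsHebrewLetters("שלום!"));  // False: '!' is not in the class
        Console.WriteLine(IsHebrewLetters("hello"));  // False: Latin letters are not in the class
    }
}
```

To accept mixed Hebrew and English input, the class would need the extra ranges and symbols added to it.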

enjoy =)

oCcSking answered Nov 09 '22 22:11



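Instead of enumerating every Hebrew character, you can use the \p{IsHebrew} character class, which matches the whole Hebrew Unicode block (U+0590 to U+05FF, so vowel points and Hebrew punctuation as well as letters). Use \w for letters, digits and the underscore, and \s for spaces, tabs and newlines. Note that in .NET \w is Unicode-aware by default, so it matches letters from any script and is broader than [a-zA-Z0-9_]; write [a-zA-Z0-9_] explicitly if you only want ASCII. You can also add dots, commas and so on. For example: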

^[\p{IsHebrew}\w\s,.?!;:-]+$

or

^[\p{IsHebrew}\w\s\p{P}]+$

\p{P} stands for all punctuation characters (as far as I know: .,?!:;-_(){}[]\/'"&#@%*). Symbols such as $, +, <, =, > and ^ are not punctuation; they belong to \p{S}.
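Putting the pieces together for the question's case, here is a rough sketch rather than a definitive filter. It assumes the allowed set is the Hebrew block, ASCII letters, digits and underscore, whitespace, and any Unicode punctuation or symbol. The class name `InputFilter`, the method name `IsAllowed` and the exact whitelist are assumptions for illustration, and \p{S} is added so that signs like $ pass:

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class InputFilter
{
    // Assumed whitelist: Hebrew block, ASCII word characters, whitespace,
    // Unicode punctuation (\p{P}) and Unicode symbols (\p{S}).
    // ASCII is spelled out instead of \w, because \w in .NET matches letters of any script.
    private static readonly Regex Allowed = new Regex(
        @"^[\p{IsHebrew}a-zA-Z0-9_\s\p{P}\p{S}]+$");

    public static bool IsAllowed(string input)
    {
        // Null or empty input is rejected up front.
        return !string.IsNullOrEmpty(input) && Allowed.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(IsAllowed("שלום world, 100%?"));  // True: Hebrew, English, digits, punctuation
        Console.WriteLine(IsAllowed("price: $5 + tax"));     // True: '$' and '+' are matched by \p{S}
        Console.WriteLine(IsAllowed("привет"));              // False: Cyrillic letters are not whitelisted
    }
}
```

If \p{S} turns out to be too permissive (it also covers currency signs, math operators and pictographic symbols), the class can list the accepted signs explicitly instead.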

Casimir et Hippolyte answered Nov 09 '22 21:11
