
Regex help NOT a-z or 0-9

I need a regex to find all characters that are NOT a-z or 0-9.

I don't know the syntax for the NOT operator in regex.

I want the regex to be NOT [a-z, A-Z, 0-9].

Thanks in advance!

s15199d asked Jun 17 '11 13:06


2 Answers

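It's ^, placed as the first character inside the brackets. Your regex should use [^a-zA-Z0-9]. Beware: this character class may behave unexpectedly with non-ASCII input or locales. For instance, it would match é, because accented letters fall outside the ranges a-z and A-Z.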
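As a quick illustration, here is a minimal sketch in Python. The language is an assumption, since the question doesn't name a tool, and the sample string is made up. It shows the negated class picking up é along with the space and the punctuation:

```python
import re

# Negated character class: matches any single character that is NOT
# an ASCII letter or an ASCII digit.
NON_ALNUM = re.compile(r"[^a-zA-Z0-9]")

# Hypothetical sample input; \u00e9 is the precomposed character é.
sample = "caf\u00e9 42!"

matches = NON_ALNUM.findall(sample)
print(matches)  # ['é', ' ', '!']
```

The é shows up in the result because it lies outside the a-z and A-Z ranges, which is the caveat mentioned above.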

Edited

If the regexes are Perl-compatible (PCRE), you can use \s to match all whitespace: spaces, tabs, newlines, and other whitespace characters. If they're POSIX-compatible, use the [:space:] character class (like so: [^a-zA-Z0-9[:space:]]). I would recommend using [:alnum:] instead of a-zA-Z0-9, since it respects the current locale.
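A rough sketch of the whitespace-excluding variant, again assuming Python. Python's built-in re module supports \s but not POSIX bracket classes such as [:space:] or [:alnum:], so the second part uses str.isalnum() and str.isspace() as an approximate Unicode-aware stand-in. The helper name non_alnum_unicode and the sample string are hypothetical:

```python
import re

# \s inside the negated class means whitespace is no longer reported.
NON_ALNUM_NON_SPACE = re.compile(r"[^a-zA-Z0-9\s]")


def non_alnum_unicode(text):
    """Approximate [^[:alnum:][:space:]] using Unicode-aware str methods.

    Hypothetical helper: Python's re module has no POSIX bracket classes,
    so this filters characters one by one instead of using a pattern.
    """
    return [ch for ch in text if not ch.isalnum() and not ch.isspace()]


# Hypothetical sample input containing a tab, an accented letter, and a newline.
sample = "foo bar\tcaf\u00e9-42!\n"

ascii_symbols = NON_ALNUM_NON_SPACE.findall(sample)
unicode_symbols = non_alnum_unicode(sample)

print(ascii_symbols)    # ['é', '-', '!']
print(unicode_symbols)  # ['-', '!']
```

The ASCII-only pattern still flags é, while the Unicode-aware filter treats it as a letter. Which behavior is right depends on the input you expect.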

If you want to match at the end of a line, include a $ at the end of the pattern. Multiline mode is only needed when your match should extend across multiple lines; it can reduce performance on larger files, since more input must be read into memory.
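How $ and multiline mode interact depends on the tool. As one concrete data point, here is a small sketch assuming Python's re module, where the re.MULTILINE flag decides whether $ matches at the end of every line or only at the end of the whole input. The sample text is made up:

```python
import re

# Hypothetical three-line input with no trailing newline.
text = "abc!\ndef\nghi?"

# A non-alphanumeric character immediately followed by an end anchor.
pattern = r"[^a-zA-Z0-9]$"

# Default mode: $ only matches at the very end of the input here.
end_of_input = re.findall(pattern, text)

# re.MULTILINE: $ also matches just before each newline.
end_of_each_line = re.findall(pattern, text, flags=re.MULTILINE)

print(end_of_input)      # ['?']
print(end_of_each_line)  # ['!', '?']
```

Other engines and command-line tools may define multiline differently, so check the documentation for whichever one you are using.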

Why don't you include a copy of sample input, the text you want to match, and the program you are using to do so?

Michael Lowman answered Sep 20 '22 11:09


It's pretty simple; you just add ^ at the beginning of a character set to negate that character set.

For example, the following pattern will match every character that's not in that character set -- i.e., anything that is not a lowercase ASCII letter or a digit:

[^a-z0-9]
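One thing to watch with this pattern is that uppercase letters count as "not in the set". A minimal sketch, assuming Python and a made-up sample string:

```python
import re

# Only lowercase ASCII letters and digits are excluded from matching,
# so uppercase letters and spaces are reported as matches.
LOWER_ONLY = re.compile(r"[^a-z0-9]")

sample = "Hello World 42"  # hypothetical input

found = LOWER_ONLY.findall(sample)
stripped = LOWER_ONLY.sub("", sample)

print(found)     # ['H', ' ', 'W', ' ']
print(stripped)  # elloorld42
```

If uppercase letters should be allowed too, add A-Z to the class as in the question's [a-z, A-Z, 0-9] wording.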

As a side note, some of the more helpful Regular Expression resources I've found have been this site and this cheat sheet (C# specific).

Donut answered Sep 20 '22 11:09