
Regular Expression: Allow letters, numbers, and spaces (with at least one letter or number)

I'm currently using this regex ^[A-Z0-9 _]*$ to accept letters, numbers, spaces and underscores. I need to modify it to require at least one number or letter somewhere in the string. Any help would be appreciated!

This would be for validating usernames for my website. I'd actually like to support as many characters as I can, but just want to ensure that I prevent code injection and that characters will display fine for all users. So I'm definitely open to regex validation suggestions that would support a wider set of characters.

asked Feb 23 '09 00:02 by makeee




3 Answers

You simply need to specify your current RE, followed by a letter/number followed by your current RE again:

^[A-Z0-9 _]*[A-Z0-9][A-Z0-9 _]*$ 

Since you've now stated they're JavaScript REs, an online regex tester is a useful way to check the RE against input data.

If you want lowercase letters as well:

^[A-Za-z0-9 _]*[A-Za-z0-9][A-Za-z0-9 _]*$ 
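As a quick check, here is a minimal sketch of that pattern in JavaScript. The helper name `isValidUsername` and the sample inputs are made up for illustration; they are not from the question.

```javascript
// Any run of allowed characters, then one required letter or digit,
// then any run of allowed characters.
const USERNAME_RE = /^[A-Za-z0-9 _]*[A-Za-z0-9][A-Za-z0-9 _]*$/;

// Hypothetical helper name, used only for illustration.
function isValidUsername(name) {
  return USERNAME_RE.test(name);
}

console.log(isValidUsername("john_doe 42")); // true
console.log(isValidUsername("_ _"));         // false: no letter or digit
console.log(isValidUsername(""));            // false: empty string
console.log(isValidUsername("a<b"));         // false: "<" is not an allowed character
```

Note that this only restricts which characters are accepted; it is not a substitute for escaping output or using parameterized queries on the server.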
answered Sep 20 '22 21:09 by paxdiablo


Instead of repeating these two character classes:

[A-Za-z0-9 _]
[A-Za-z0-9]

I have two (hopefully better) replacements for them:

[\w ]
[^\W_]

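The first one matches any word character or a space. A word character is a letter, a digit, or _; in Unicode-aware flavors such as Perl, that includes non-ASCII letters and digits. The second is a double negative: it matches anything that is neither a non-word character nor an underscore, which leaves letters and digits only (again including Unicode ones where the flavor supports it).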

If you don't want Unicode matching, then stick with the other answers. But these just look easier on the eyes (in my opinion). Taking the "preferred" answer as of this writing and using the shorter regexes gives us:

^[\w ]*[^\W_][\w ]*$ 

Perhaps more readable, perhaps less. Certainly shorter. Your choice.
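In JavaScript specifically, \w covers only the ASCII characters [A-Za-z0-9_], so there the short form should behave exactly like the long one. A small sketch to compare them (the constant names and sample strings are illustrative assumptions):

```javascript
const LONG_RE = /^[A-Za-z0-9 _]*[A-Za-z0-9][A-Za-z0-9 _]*$/;
const SHORT_RE = /^[\w ]*[^\W_][\w ]*$/;

// Illustrative inputs: valid names, underscore/space-only strings,
// the empty string, and a string with a disallowed character.
const compareSamples = ["abc", "A b_1", "___", "   ", "", "a-b", "x"];

for (const s of compareSamples) {
  console.log(JSON.stringify(s), LONG_RE.test(s), SHORT_RE.test(s));
}

// Both patterns give the same verdict on every sample.
const allAgree = compareSamples.every((s) => LONG_RE.test(s) === SHORT_RE.test(s));
console.log(allAgree); // true
```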

EDIT:

Just as a note, I am assuming Perl-style regexes here. Your regex engine may or may not support things like \w and \W.

EDIT 2:

Tested mine with the JS regex tester that someone linked to and some basic examples worked fine. Didn't do anything extensive, just wanted to make sure that \w and \W worked fine in JS.

EDIT 3:

Having tried to test some Unicode with the JS regex tester site, my Japanese input didn't match. My first guess was the page encoding, since that page uses ISO instead of Unicode, which would be easy enough to fix:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

That alone won't make it match, though: in JavaScript, \w and \W treat only ASCII letters, digits, and _ as word characters, whatever the page encoding. Matching non-ASCII letters in JavaScript needs a different character class.
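One way to accept non-ASCII letters in JavaScript is Unicode property escapes, which need the `u` flag and an ES2018-capable engine. The sketch below is an assumption about how the pattern could be adapted, not something tested in the original answer; the constant name is illustrative.

```javascript
// \p{L} matches any Unicode letter and \p{N} any Unicode number.
// Property escapes are only recognized when the "u" flag is set.
const UNICODE_USERNAME_RE = /^[\p{L}\p{N} _]*[\p{L}\p{N}][\p{L}\p{N} _]*$/u;

console.log(UNICODE_USERNAME_RE.test("日本語"));       // true
console.log(UNICODE_USERNAME_RE.test("Jos\u00e9 42")); // true ("José 42", precomposed é)
console.log(UNICODE_USERNAME_RE.test("_ _"));          // false: no letter or number

// For comparison, the \w-based pattern rejects the same Japanese input,
// because \w is ASCII-only in JavaScript.
console.log(/^[\w ]*[^\W_][\w ]*$/.test("日本語"));    // false
```

Accented letters written with separate combining marks fall outside \p{L}, so normalizing input first (for example with `String.prototype.normalize("NFC")`) may be worth considering.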

answered Sep 18 '22 21:09 by Chris Lutz


^[ _]*[A-Z0-9][A-Z0-9 _]*$

You can optionally have some spaces or underscores up front, then you need one letter or number, and then an arbitrary number of numbers, letters, spaces or underscores after that.

Something that contains only spaces and underscores will fail the [A-Z0-9] portion.
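A rough JavaScript comparison of this form against the symmetric one from the first answer is sketched below. The constant names and sample strings are illustrative assumptions. Both patterns are uppercase-only as written, so lowercase input is rejected unless you add a-z to the classes or use the `i` flag.

```javascript
const SYMMETRIC_RE = /^[A-Z0-9 _]*[A-Z0-9][A-Z0-9 _]*$/; // first answer's form
const PREFIX_RE = /^[ _]*[A-Z0-9][A-Z0-9 _]*$/;          // this answer's form

// Illustrative inputs, including a spaces-and-underscores-only string.
const prefixSamples = ["ABC", " _A1 B", "_ _", "", "abc", "A"];

for (const s of prefixSamples) {
  console.log(JSON.stringify(s), SYMMETRIC_RE.test(s), PREFIX_RE.test(s));
}

// Both forms give the same verdict on every sample.
const sameVerdict = prefixSamples.every(
  (s) => SYMMETRIC_RE.test(s) === PREFIX_RE.test(s)
);
console.log(sameVerdict); // true
```

Because the prefix can contain only spaces and underscores, the required letter or number is pinned to the first such character in the string, which should leave a backtracking engine fewer ways to split the input than the symmetric form does.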

answered Sep 18 '22 21:09 by Daniel LeCheminant