I'm writing a simple regex, but I've never been very good at this.
What I'm trying to do is check a string (a filename) to make sure it only contains a-z, A-Z, 0-9, or the special characters underscore (_), period (.), or dash (-).
Here's what I have:
if(filename.length() < 1 || !filename.matches("^[a-zA-Z0-9[.][_][-]]+"))
return false;
else
return true;
This appears to work, but does not look very elegant to me. Is there a better / more readable way to write this?
Thanks in advance! Just trying to learn how to write these buggers better.
-Will
Most characters, including all letters (a-z and A-Z) and digits (0-9), match themselves. For example, the regex x matches the substring "x"; z matches "z"; and 9 matches "9". Non-alphanumeric characters without special meaning in regex also match themselves. For example, = matches "="; @ matches "@".
Using character sets: the regular expression "[A-Za-z]" matches any single uppercase or lowercase letter. Inside a character set, a hyphen between two characters indicates a range; for example, [A-Z] will match any one capital letter.
^.*$ means: match, from beginning to end, any character that appears zero or more times. In other words, match everything from the start to the end of the string. This pattern is not very useful on its own. Let's take a regex pattern that may be a bit more useful.
The character class [a-zA-Z] matches any character from a to z or A to Z.
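The points above (literal characters, character sets, and the match-everything pattern) can be tried out with a small Java sketch. The class name CharClassDemo and the helper fullMatch are made up for illustration; only Pattern.matches comes from the standard library.

```java
import java.util.regex.Pattern;

class CharClassDemo {
    // Hypothetical helper: true when the WHOLE input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("x", "x"));        // a letter matches itself
        System.out.println(fullMatch("=", "="));        // '=' has no special meaning
        System.out.println(fullMatch("[A-Za-z]", "q")); // one letter from the set
        System.out.println(fullMatch("[A-Za-z]", "7")); // a digit is outside the set
        System.out.println(fullMatch(".*", ""));        // zero or more of any character
    }
}
```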
You don't need to wrap each character in its own [] inside a character class.
So, you can write:
^[-a-zA-Z0-9._]+
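As a rough check, the sketch below compares the pattern from the question with the flattened one on a few sample strings. Java treats a bracket nested inside a character class as a union, which is why the original works at all; putting the dash first in the class keeps it from being read as a range. The class name FlatClassDemo, the helper sameVerdict, and the sample filenames are assumptions for illustration only.

```java
class FlatClassDemo {
    // Pattern from the question: nested brackets act as a union in Java.
    static final String NESTED = "^[a-zA-Z0-9[.][_][-]]+";
    // Flattened pattern: the leading '-' is a literal dash, not a range.
    static final String FLAT = "^[-a-zA-Z0-9._]+";

    // Hypothetical helper: true when both patterns agree about the string.
    static boolean sameVerdict(String s) {
        return s.matches(NESTED) == s.matches(FLAT);
    }

    public static void main(String[] args) {
        String[] samples = {"report-2011_v2.txt", "bad name", "a$b", ""};
        for (String s : samples) {
            System.out.println("\"" + s + "\" flat=" + s.matches(FLAT)
                    + " agrees=" + sameVerdict(s));
        }
    }
}
```

Because + requires at least one character, the empty string is rejected by both patterns, so the separate length check in the question should not be needed.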
Also, you can use \\w instead of a-zA-Z0-9_.
So, the regexp would be:
^[-\\w.]+
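A small sketch of the shorthand in Java source follows. The doubled backslash is only the string-literal spelling of the regex \w, and by default (without the UNICODE_CHARACTER_CLASS flag) \w covers just ASCII letters, digits, and underscore. The class name WordClassDemo, the helper accepts, and the sample names are illustrative assumptions.

```java
class WordClassDemo {
    // Explicit ranges and the \w shorthand; "\\w" in source is the regex \w.
    static final String EXPLICIT = "^[-a-zA-Z0-9._]+";
    static final String SHORTHAND = "^[-\\w.]+";

    // Hypothetical helper: true when the whole string matches the regex.
    static boolean accepts(String regex, String s) {
        return s.matches(regex);
    }

    public static void main(String[] args) {
        String ascii = "my_file-1.txt";
        String accented = "r\u00e9sum\u00e9.txt"; // contains e-acute characters

        System.out.println(accepts(EXPLICIT, ascii));
        System.out.println(accepts(SHORTHAND, ascii));
        // Default \w is ASCII-only, so the accented name is rejected by both.
        System.out.println(accepts(EXPLICIT, accented));
        System.out.println(accepts(SHORTHAND, accented));
    }
}
```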
Also, when used for searching (for example with Matcher.find()), this regexp will match a string like StackOverflow22.10$$2011 by consuming StackOverflow22.10. If you need your string to consist entirely of those characters, you should end the pattern with $, the end of the string:
^[-\\w.]+$
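To see the difference the trailing $ makes, here is a minimal sketch using the input StackOverflow22.10$$2011 (written without spaces, since a space is not in the character class). One thing worth knowing: String.matches() and Matcher.matches() already require the entire string to match, so with those methods the anchors are optional; the prefix-consuming behaviour only shows up with find(). The names FilenameValidator, isValidFilename, and consumedPrefix are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FilenameValidator {
    private static final Pattern UNANCHORED = Pattern.compile("^[-\\w.]+");
    private static final Pattern ANCHORED = Pattern.compile("^[-\\w.]+$");

    // Hypothetical helper: true when the name is non-null, non-empty,
    // and made up only of letters, digits, '_', '.', or '-'.
    static boolean isValidFilename(String filename) {
        return filename != null && ANCHORED.matcher(filename).matches();
    }

    // Hypothetical helper: the leading part that the unanchored pattern
    // consumes when searching with find(), or "" if nothing matches.
    static String consumedPrefix(String input) {
        Matcher m = UNANCHORED.matcher(input);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        String input = "StackOverflow22.10$$2011";
        System.out.println(consumedPrefix(input));              // prefix found by find()
        System.out.println(ANCHORED.matcher(input).find());     // anchored search result
        System.out.println(isValidFilename(input));
        System.out.println(isValidFilename("report-2011_v2.txt"));
    }
}
```

With a helper like this, the if/else from the question collapses to a single return of the match result.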