Regular expression for excluding special characters [closed]

I would just white list the characters.

^[a-zA-Z0-9äöüÄÖÜ]*$

Building a black list is equally simple with regex, but you might need to add many more characters - there are a lot of Chinese symbols in Unicode ... ;)

^[^<>%$]*$

The expression [^(many characters here)] just matches any character that is not listed.
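As a rough illustration of the two approaches in Java, the sketch below compiles both expressions exactly as written above and checks whole inputs against them. The class and method names (`CharacterFilters`, `passesWhitelist`, `passesBlacklist`) are made up for this example and are not part of any library.

```java
import java.util.regex.Pattern;

final class CharacterFilters {
    // White list from the answer: ASCII letters, digits and German umlauts only.
    static final Pattern WHITELIST = Pattern.compile("^[a-zA-Z0-9äöüÄÖÜ]*$");

    // Black list from the answer: any characters except <, >, % and $.
    static final Pattern BLACKLIST = Pattern.compile("^[^<>%$]*$");

    // Hypothetical helper: true if every character is on the white list.
    static boolean passesWhitelist(String input) {
        return WHITELIST.matcher(input).matches();
    }

    // Hypothetical helper: true if no character is on the black list.
    static boolean passesBlacklist(String input) {
        return BLACKLIST.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(passesWhitelist("abc123")); // true
        System.out.println(passesWhitelist("a<b"));    // false
        System.out.println(passesBlacklist("a<b"));    // false
        // Chinese text is rejected by the white list but accepted by the black list.
        System.out.println(passesWhitelist("\u5317\u4eac")); // false
        System.out.println(passesBlacklist("\u5317\u4eac")); // true
    }
}
```

Because both patterns use `*`, an empty string passes either check; use `+` instead if empty input should be rejected.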

To exclude certain characters (<, >, %, and $), you can make a regular expression like this:

[<>%\$]

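This regular expression will match any input that contains a blacklisted character, provided it is used as a search rather than as a whole-string match. The brackets define a character class. Outside a character class the dollar sign has a special meaning (end of input) and must be escaped with \ to be matched literally; inside a character class it is already literal, so the \ here is harmless but optional.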

To add more characters to the black list, just insert them between the brackets; order does not matter.

According to some Java documentation for regular expressions, you could use the expression like this:

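import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The regex backslash must itself be escaped inside a Java string literal.
Pattern p = Pattern.compile("[<>%\\$]");
Matcher m = p.matcher(unsafeInputString);
// find() succeeds if a blacklisted character occurs anywhere in the input;
// matches() would only succeed if the whole input were a single such character.
if (m.find())
{
    // Invalid input: reject it, or remove/change the offending characters.
}
else
{
    // Valid input.
}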
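For the "remove the offending characters" branch, one possible approach is to strip every blacklisted character with `Matcher.replaceAll`. This is only a sketch: the class and method names (`BlacklistSanitizer`, `containsBlacklisted`, `stripBlacklisted`) are invented for the example, and silently removing characters may not be appropriate for every application.

```java
import java.util.regex.Pattern;

final class BlacklistSanitizer {
    // The blacklist character class from the answer, compiled once for reuse.
    private static final Pattern BLACKLISTED = Pattern.compile("[<>%$]");

    // Hypothetical helper: true if the input contains any blacklisted character.
    static boolean containsBlacklisted(String input) {
        return BLACKLISTED.matcher(input).find();
    }

    // Hypothetical helper: returns the input with all blacklisted characters removed.
    static String stripBlacklisted(String input) {
        return BLACKLISTED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(containsBlacklisted("a<b>c")); // true
        System.out.println(stripBlacklisted("a<b>c"));    // abc
        System.out.println(containsBlacklisted("abc"));   // false
    }
}
```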

Even in 2009, it seems too many had a very limited idea of what designing for the WORLDWIDE web involved. In 2015, unless designing for a specific country, a blacklist is the only way to accommodate the vast number of characters that may be valid.

The characters to blacklist then need to be chosen according to what is illegal for the purpose for which the data is required.

However, sometimes it pays to break down the requirements and handle each separately. Here look-ahead is your friend. Look-ahead blocks are written (?=...) for positive and (?!...) for negative, and they effectively act as AND conditions: a look-ahead consumes no characters, so once a block has been checked without failing, the regex processor evaluates the next block from the same position, which here is the start of the text. In effect, each look-ahead block is preceded by the ^ and, if its pattern is greedy, extends up to the $. Even the ancient VB6/VBA (Office) 5.5 regex engine supports look-ahead.

So, to build up a full regular expression, start with the look-ahead blocks, then add the blacklisted character block before the final $.

For example, to limit the total number of characters, say to between 3 and 15 inclusive, start with the positive look-ahead block (?=^.{3,15}$). Note that this needs its own ^ and $ to ensure that it covers all the text.

Now, while you might want to allow _ and -, you may not want to start or end with them, so add the two negative look-ahead blocks, (?!^[_-].+) for starts, and (?!.+[_-]$) for ends.

If you don't want multiple _ and -, add a negative look-ahead block of (?!.*[_-]{2,}). This will also exclude _- and -_ sequences.
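To see how each look-ahead block behaves as an independent AND condition, the sketch below compiles the four blocks described so far separately and reports the first one that fails. The class name `LookaheadRules`, the method `firstFailure`, and the rule labels are invented for this illustration; `lookingAt()` is used because it anchors each attempt at the start of the input, which is where the combined expression would evaluate the block.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class LookaheadRules {
    // Each entry is one look-ahead block from the text, compiled on its own.
    // The labels are hypothetical and exist only to make failures readable.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("length 3-15", Pattern.compile("(?=^.{3,15}$)"));
        RULES.put("no leading _ or -", Pattern.compile("(?!^[_-].+)"));
        RULES.put("no trailing _ or -", Pattern.compile("(?!.+[_-]$)"));
        RULES.put("no run of _ or -", Pattern.compile("(?!.*[_-]{2,})"));
    }

    // Returns the label of the first rule that fails, or null if all rules pass.
    static String firstFailure(String input) {
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            // A look-ahead is zero-width, so a successful lookingAt() consumes nothing.
            if (!rule.getValue().matcher(input).lookingAt()) {
                return rule.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(firstFailure("a_b-c")); // null
        System.out.println(firstFailure("ab"));    // length 3-15
        System.out.println(firstFailure("_abc"));  // no leading _ or -
        System.out.println(firstFailure("abc-"));  // no trailing _ or -
        System.out.println(firstFailure("a__b"));  // no run of _ or -
    }
}
```

Checking the blocks one at a time like this is mainly useful for error messages; concatenating them into one expression gives the same pass/fail answer in a single match.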

If there are no more look-ahead blocks, add the blacklist block before the $, such as [^<>[\]{}|\\\/^~%# :;,$%?\0-\cZ]+, where the \0-\cZ excludes the null character and the control characters up to Ctrl-Z (0x1A), including NL (\n) and CR (\r). The final + ensures that all the text is greedily consumed.

Within the Unicode domain, there may well be other code-points or blocks that need to be excluded as well, but certainly a lot less than all the blocks that would have to be included in a whitelist.

The whole regex of all of the above would then be

(?=^.{3,15}$)(?!^[_-].+)(?!.+[_-]$)(?!.*[_-]{2,})[^<>[\]{}|\\\/^~%# :;,$%?\0-\cZ]+$

which you can check out live on https://regex101.com/ for the PCRE (PHP), JavaScript and Python regex engines. I don't know where the Java regex engine fits among those, but you may need to modify the regex to cater for its idiosyncrasies.
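For Java specifically, a possible adaptation is sketched below. It rests on two assumptions about `java.util.regex` worth verifying against your JDK: an unescaped `[` inside a character class starts a nested class, so it is escaped here, and `\0` must be followed by octal digits, so the control range is written as `\x00-\x1A` instead of `\0-\cZ`. The duplicate `%` is dropped and the class name `UsernameValidator` is hypothetical.

```java
import java.util.regex.Pattern;

final class UsernameValidator {
    // Look-ahead blocks unchanged; only the blacklist class is adapted for Java:
    // "[" and the braces are escaped, "/" needs no escape, \0-\cZ becomes \x00-\x1A.
    static final Pattern VALID = Pattern.compile(
            "(?=^.{3,15}$)"        // 3 to 15 characters in total
          + "(?!^[_-].+)"          // must not start with _ or -
          + "(?!.+[_-]$)"          // must not end with _ or -
          + "(?!.*[_-]{2,})"       // no two or more _ or - in a row
          + "[^<>\\[\\]\\{\\}|\\\\/^~%# :;,$?\\x00-\\x1A]+$"); // blacklist

    // Hypothetical helper: true if the whole input satisfies every rule.
    static boolean isValid(String input) {
        return VALID.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("user_name")); // true
        System.out.println(isValid("user name")); // false (space is blacklisted)
        System.out.println(isValid("us--er"));    // false (run of -)
        System.out.println(isValid("user<1>"));   // false (< and > are blacklisted)
        System.out.println(isValid("ab"));        // false (too short)
    }
}
```

`matches()` is used so that the whole input has to be consumed by the blacklist block; with `find()` the leading `^` inside the first look-ahead would still pin the match to the start of the text.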

If you want to include spaces, but not _, just swap them everywhere in the regex.

The most useful application of this technique is the pattern attribute of HTML input fields, where a single expression is required. A failed match makes the field invalid, which allows input:invalid CSS to highlight it and stops the form from being submitted.

The negated set of everything that is not an ASCII alphanumeric character or underscore, which is equivalent to \w and so matches exactly those characters:

/[^\W]/g

For email or username validation I've used the following expression, which allows 4 standard special characters: - _ . @

/^[-.@_a-z0-9]+$/gi

For a strict alphanumeric only expression use:

/^[a-z0-9]+$/gi
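Since the question is tagged Java, here is a hedged translation of the two JavaScript literals above. Java has no `g` flag, and the `i` flag corresponds to `Pattern.CASE_INSENSITIVE`; the class and method names (`SimpleValidators`, `isUsernameLike`, `isAlphanumeric`) are invented for the example.

```java
import java.util.regex.Pattern;

final class SimpleValidators {
    // Equivalent of /^[-.@_a-z0-9]+$/i : letters, digits and the four characters - _ . @
    static final Pattern USERNAME_LIKE =
            Pattern.compile("^[-.@_a-z0-9]+$", Pattern.CASE_INSENSITIVE);

    // Equivalent of /^[a-z0-9]+$/i : strictly ASCII letters and digits.
    static final Pattern ALPHANUMERIC =
            Pattern.compile("^[a-z0-9]+$", Pattern.CASE_INSENSITIVE);

    // Hypothetical helper for the email/username expression.
    static boolean isUsernameLike(String input) {
        return USERNAME_LIKE.matcher(input).matches();
    }

    // Hypothetical helper for the strict alphanumeric expression.
    static boolean isAlphanumeric(String input) {
        return ALPHANUMERIC.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isUsernameLike("john.doe@example.com")); // true
        System.out.println(isUsernameLike("john doe"));             // false
        System.out.println(isAlphanumeric("ABC123"));               // true
        System.out.println(isAlphanumeric("abc_123"));              // false
    }
}
```

Note that the first expression only restricts the character set; it does not check that the input has the structure of an email address.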

Test @ RegExr.com

Tags:

java

regex
