
Javascript string replace with regex to strip off illegal characters

I need a function to strip a set of illegal characters from a string in JavaScript: |&;$%@"<>()+,

This is a classic problem to be solved with regexes, which means now I have 2 problems.

This is what I've got so far:

var cleanString = dirtyString.replace(/\|&;\$%@"<>\(\)\+,/g, ""); 

I am escaping the regex special characters with a backslash, but I am having a hard time understanding what's going on.

If I try single literals in isolation, most of them seem to work, but once I put them together in the same regex, the replace breaks depending on their order.

For example, this won't work: dirtyString.replace(/\|<>/g, "")

Help appreciated!

asked Sep 23 '10 16:09 by JohnIdol



2 Answers

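What you need is a character class. Without brackets, your pattern matches only the entire sequence of characters in that exact order, whereas a character class matches any single one of the listed characters. Inside a class, the only characters you have to worry about escaping are ], \ and - (and ^ if you place it immediately after the opening "[").

Syntax: [characters], where characters is the list of characters to match.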

Example:

var cleanString = dirtyString.replace(/[|&;$%@"<>()+,]/g, ""); 
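To make the difference concrete, here is a minimal sketch comparing the pattern from the question with the character class. The helper name stripIllegal and the sample strings are made up for illustration.

```javascript
// Hypothetical helper; the character set is the one listed in the question.
function stripIllegal(dirtyString) {
  // Inside [...] each listed character is matched on its own, in any order.
  return dirtyString.replace(/[|&;$%@"<>()+,]/g, "");
}

// The pattern from the question: it only matches the exact sequence |&;$%@"<>()+,
const sequencePattern = /\|&;\$%@"<>\(\)\+,/g;

const sample = 'a|b<c>d&e';
console.log(sample.replace(sequencePattern, "")); // "a|b<c>d&e" (the full sequence never occurs)
console.log(stripIllegal(sample));                // "abcde"

// Escaping inside a class: ], \ and - are escaped; ^ is safe when it is not first.
const escaped = "a]b\\c-d^e".replace(/[\]\\\-^]/g, "");
console.log(escaped);                             // "abcde"
```

The sequence pattern does remove something only when the input contains all thirteen characters back to back, which is why single literals appeared to work in isolation.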
answered Sep 28 '22 06:09 by Lekensteyn


I tend to look at it from the inverse perspective, which may be what you intended:

What characters do I want to allow?

This is because lots of unexpected characters could make it into a string somehow and blow things up.

For example, this one allows only letters and numbers, replacing each group of invalid characters with a hyphen:

"This¢£«±Ÿ÷could&*()\/<>be!@#$%^bad".replace(/([^a-z0-9]+)/gi, '-'); //Result: "This-could-be-bad" 
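The same allowlist idea could be wrapped in a small reusable function. This is only a sketch: the name toSafeString and the trimming of leading and trailing hyphens are assumptions added here, not part of the one-liner above.

```javascript
// Hypothetical helper: keeps ASCII letters and digits only.
function toSafeString(input) {
  return input
    .replace(/[^a-z0-9]+/gi, "-") // each run of disallowed characters becomes one hyphen
    .replace(/^-|-$/g, "");       // assumption: drop a hyphen left at either end
}

console.log(toSafeString("This¢£«±Ÿ÷could&*()\\/<>be!@#$%^bad")); // "This-could-be-bad"
console.log(toSafeString("  hello, world!  "));                    // "hello-world"
```

Because the first replace collapses every run into a single hyphen, at most one hyphen can remain at each end, so the second pattern does not need a quantifier.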
answered Sep 28 '22 04:09 by John Culviner