
How to remove duplicate character at the end of a string in Regex

Can anyone help me with the following regex?

<script type="text/javascript">
        function quoteWords() {
            var search = document.getElementById("search_box");
            search.value = search.value.replace(/^\s*|\s*$/g, ""); // trim leading and trailing whitespace
            if(search.value.indexOf(" ") != -1){ // if more than one word
                search.value = search.value.replace(/^"*|"*$/g, "\"");
            }
        }
  </script>

<input type="text" name="keywords" value="" id="search_box" size="17">
<input onClick="quoteWords()" type="submit" value="Go">

Issue: it breaks when the user has already typed the double quotes manually and presses submit. One extra double quote is added at the end. If the double quotes already exist, the regex should not add anything.

So it turns "long enough" into "long enough"" <- an extra double quote is added at the end

Can anyone check the regex code to see how to solve this issue?

I only want the double quotes to be inserted once.

Ibn Saeed asked Dec 08 '25 20:12


1 Answer

The error is definitely happening in this line:

search.value = search.value.replace(/^"*|"*$/g, "\"");

It happens because "* matches 0 or more quotes, so it also matches the empty string. However, you presumably wouldn't want to just replace it with "+, since that would no longer match a string that has no quotes yet, and so wouldn't do the job you wanted of double-quoting strings with spaces in them.

You probably just want to do something like this, in two statements:

search.value = search.value.replace(/^"*|"*$/g, '')
search.value = '"' + search.value + '"'
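As a rough sketch of how those two statements could fit into the original handler, here is a version that works on a plain string so it can be tried outside the page. The function name quotePhrase is hypothetical and not from the question; the "only quote when there is a space" check is carried over from the original quoteWords().

```javascript
// Hypothetical helper: returns the search text, wrapped in exactly one
// pair of double quotes when it contains more than one word.
function quotePhrase(value) {
  // Trim leading and trailing whitespace, as the original code does.
  const trimmed = value.replace(/^\s*|\s*$/g, "");

  // Single words are left alone, matching the original behaviour.
  if (trimmed.indexOf(" ") === -1) {
    return trimmed;
  }

  // Step 1: strip any quotes already present at either end.
  // Empty matches are harmless here because they are replaced with ''.
  const unquoted = trimmed.replace(/^"*|"*$/g, "");

  // Step 2: add exactly one quote on each side.
  return '"' + unquoted + '"';
}

console.log(quotePhrase('long enough'));   // "long enough"
console.log(quotePhrase('"long enough"')); // "long enough"
```

In the page itself, the handler would then presumably just assign the result back, e.g. search.value = quotePhrase(search.value).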

Part of the key is that there is no 'end of string' character to consume - the regex engine 'just knows' when it is at the end of the string. So after matching the quote at the end of the string, the cursor sits at the end of the string, where "*$ matches one more time, this time against the empty string, before the search runs off the end. Thus, the quote at the end of the string is replaced by a quote, and the 'nothing' after it is also replaced by a quote.
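To check this yourself, the following sketch lists every match the pattern produces. The helper name listMatches is made up for illustration; it uses replace() with a callback purely to record each match and its position, returning the match unchanged.

```javascript
// Hypothetical helper: collect every match of a global regex together
// with the offset at which it was found.
function listMatches(text, pattern) {
  const matches = [];
  text.replace(pattern, function (match, offset) {
    matches.push({ offset: offset, match: match });
    return match; // leave the text unchanged
  });
  return matches;
}

const quoted = '"long enough"'; // 13 characters, quotes at 0 and 12

console.log(listMatches(quoted, /^"*|"*$/g));
// three matches: '"' at offset 0, '"' at offset 12, '' at offset 13

console.log(quoted.replace(/^"*|"*$/g, '"'));
// "long enough""
```

The third, empty match at offset 13 is the one that produces the extra quote.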

I recommend taking a look at the ECMAScript spec at http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-262.pdf sections 15.5.4.10 and 15.5.4.11 yourself. However, I've also provided an intuitive illustration of how this works at this gist.

EDIT:

Since people seem confused as to why this would happen, here's something that might help:

http://www.grymoire.com/Unix/Sed.html#uh-6

That's from the documentation for sed, but it explains why combining * and /g is a bad idea. The fact that JS doesn't just explode when you do that is a mark in its favor. Note that there are an infinite number of '0 characters' at every position in the string.

Tamzin Blake answered Dec 11 '25 08:12


