I'm using JavaScript to do some regular expression work. Assuming the source is well-formed, I want to remove any space before [,.] and keep exactly one space after [,.], except when the [,.] is part of a number. So I use:
text = text.replace(/ *(,|\.) *([^ 0-9])/g, '$1 $2');
The problem is that this also replaces text inside HTML tag attributes. For example, my text is (always wrapped in a tag):
<p>Test,and test . Again <img src="xyz.jpg"> ...</p>
Now it adds a space inside the attribute, like this: src="xyz. jpg"
which is not what I want. How can I rewrite my regular expression? What I want is:
<p>Test, and test. Again <img src="xyz.jpg"> ...</p>
Thanks!
How to use RegEx with .replace() in JavaScript: pass a regular expression, written as a literal such as /regex/, as the first argument of replace(). It serves as a pattern, and the parts of the string that match it are replaced with the new substring (only the first match unless the g flag is set).
For example, the replacement pattern $1 indicates that the matched substring is to be replaced by the first captured group. For more information about numbered capturing groups, see Grouping Constructs.
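As a minimal sketch of the two points above, the snippet below passes a regex literal to replace() and uses $1 in the replacement string to put the captured comma back. The sample string and variable names are made up for illustration.

```javascript
// Hypothetical sample string, used only for illustration.
const sample = 'one ,two ,three';

// ' *' matches any spaces around the comma; the parentheses capture the comma itself.
// The 'g' flag makes replace() act on every match, not just the first.
const tidied = sample.replace(/ *(,) */g, '$1 ');

console.log(tidied); // one, two, three
```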
You can use a lookahead to make sure the match isn't occurring inside a tag:
text = text.replace(/(?![^<>]*>) *([.,]) *([^ \d])/g, '$1 $2');
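A quick way to compare the two patterns is to run both on the sample from the question. This is only a sketch: the trailing " ..." from the sample is left out so the output is easy to read, and the variable names are made up for illustration.

```javascript
// Sample from the question, without the trailing " ..." (dropped for readability).
const html = '<p>Test,and test . Again <img src="xyz.jpg"></p>';

// Pattern from the question: it also matches the dot inside src="...".
const original = / *(,|\.) *([^ 0-9])/g;

// Pattern from the answer: the negative lookahead rejects any position
// from which a '>' can be reached without crossing another '<' or '>',
// which is taken to mean "inside a tag".
const tagAware = /(?![^<>]*>) *([.,]) *([^ \d])/g;

const before = html.replace(original, '$1 $2');
const after = html.replace(tagAware, '$1 $2');

console.log(before); // <p>Test, and test. Again <img src="xyz. jpg"></p>
console.log(after);  // <p>Test, and test. Again <img src="xyz.jpg"></p>

// Numbers are left alone because the last group refuses a digit.
const numeric = 'Pi is 3.14 ,roughly'.replace(tagAware, '$1 $2');
console.log(numeric); // Pi is 3.14, roughly
```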
The usual warnings apply regarding CDATA sections, SGML comments, SCRIPT elements, and angle brackets in attribute values. But I suspect your real problems will arise from the vagaries of "plain" text; HTML's not even in the same league. :D
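To make the angle-bracket warning concrete, here is a small sketch with a made-up attribute value that contains a '<'. The lookahead can no longer see the closing '>' of the tag, so the comma inside the attribute gets rewritten anyway.

```javascript
// Same pattern as in the answer.
const tagAwarePattern = /(?![^<>]*>) *([.,]) *([^ \d])/g;

// Hypothetical input: the '<' inside the attribute value stops [^<>]*
// before it reaches the tag's closing '>', so the lookahead does not
// recognise the comma as being inside a tag.
const tricky = '<a title="x,y<z">link</a>';

const mangled = tricky.replace(tagAwarePattern, '$1 $2');

console.log(mangled); // <a title="x, y<z">link</a>
```

If input like this is possible, parsing the markup and rewriting only the text nodes is likely a safer route than a single regex.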