Regex replace text outside html tags

I have this HTML:

"This is simple html text <span class='simple'>simple simple text text</span> text"

I need to match only words that are outside any HTML tag. For example, if I want to match “simple” and “text”, I should get results only from “This is simple html text” and the trailing “text”: 1 match for “simple” and 2 matches for “text”. Could anyone help me with this? I’m using jQuery.

var pattern = new RegExp("(\\b" + value + "\\b)", 'gi');

if (pattern.test(text)) {
    text = text.replace(pattern, "<span class='notranslate'>$1</span>");
}

value is the word I want to match (in this case “simple”)
text is "This is simple html text <span class='simple'>simple simple text text</span> text"

I need to wrap all selected words (“simple” and “text” in this example) with <span class='notranslate'></span>, but only where they appear outside any HTML tags. The result for this example should be:

This is <span class='notranslate'>simple</span> html <span class='notranslate'>text</span> <span class='simple'>simple simple text text</span> <span class='notranslate'>text</span>

I do not want to replace any text inside

<span class='simple'>simple simple text text</span>

It should be the same as before replacement.
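
To make the problem concrete, here is a minimal sketch that runs the pattern from the question against the sample string. The helper name `wrapNaively` is made up for illustration; the pattern and replacement string are the ones shown above.

```javascript
// Sample input from the question.
const text =
  "This is simple html text <span class='simple'>simple simple text text</span> text";

// Hypothetical helper wrapping the question's own pattern and replacement.
function wrapNaively(input, value) {
  const pattern = new RegExp("(\\b" + value + "\\b)", "gi");
  return input.replace(pattern, "<span class='notranslate'>$1</span>");
}

const naive = wrapNaively(text, "simple");

// Count how many times the word was wrapped.
const wrapped = naive.match(/<span class='notranslate'>/g).length;
console.log(wrapped); // 4, where the question wants 1
```

The three unwanted replacements are the two words inside the existing span and the `simple` inside the `class` attribute, which also breaks the markup.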

Okay, try using this regex:

(text|simple)(?![^<]*>|[^<>]*</)

Example worked on regex101.

Breakdown:

(         # Open capture group
  text    # Match 'text'
|         # Or
  simple  # Match 'simple'
)         # End capture group
(?!       # Negative lookahead start (will cause match to fail if contents match)
  [^<]*   # Any number of non-'<' characters
  >       # A > character
|         # Or
  [^<>]*  # Any number of non-'<' and non-'>' characters
  </      # The characters < and /
)         # End negative lookahead.

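The negative lookahead prevents a match when text or simple appears inside a tag itself (first alternative: a > is reached before any <) or between an opening and a closing tag (second alternative: the next tag to follow starts with </).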
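
As a sketch of how this regex could be plugged into the replace call from the question: the function name `wrapOutsideTags` is hypothetical, and the `\b` word boundaries and the escaping of the search words are additions taken from or added around the question's code, not part of the regex above.

```javascript
// Hypothetical helper: wraps each listed word unless the lookahead
// decides it sits inside a tag or inside an element's content.
function wrapOutsideTags(input, words) {
  // Escape regex metacharacters so each word is matched literally.
  const escaped = words.map((w) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  const pattern = new RegExp(
    "\\b(" + escaped.join("|") + ")\\b(?![^<]*>|[^<>]*</)",
    "gi"
  );
  return input.replace(pattern, "<span class='notranslate'>$1</span>");
}

const sample =
  "This is simple html text <span class='simple'>simple simple text text</span> text";
console.log(wrapOutsideTags(sample, ["text", "simple"]));
// This is <span class='notranslate'>simple</span> html <span class='notranslate'>text</span> <span class='simple'>simple simple text text</span> <span class='notranslate'>text</span>
```

Note that the lookahead only checks whether the next tag is a closing tag, so it suits flat markup like the sample. With nested markup such as `<div>simple <b>x</b></div>`, the first `simple` would still be wrapped.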

^([^<]*)<\w+.*/\w+>([^<]*)$

This captures the text before the first tag and the text after the last tag in groups 1 and 2. However, it is a very naive expression. It would be better to use a DOM parser.
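
A rough sketch of what the DOM parser route might look like in a browser, assuming only top-level text should be touched. Both function names are made up for illustration. `splitByWords` is plain string handling; `wrapTopLevelText` relies on the browser's `DOMParser` and will not run in an environment without a DOM.

```javascript
// Split a string into plain and matched segments.
function splitByWords(text, words) {
  if (words.length === 0) return [{ text: text, match: false }];
  // Escape regex metacharacters so each word is matched literally.
  const escaped = words.map((w) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
  const pattern = new RegExp("\\b(" + escaped.join("|") + ")\\b", "gi");
  // With one capture group, split() keeps the matched words at odd indexes.
  return text
    .split(pattern)
    .map((part, i) => ({ text: part, match: i % 2 === 1 }));
}

// Browser-only: wrap matches found in top-level text nodes and leave
// every element (and everything inside it) untouched.
function wrapTopLevelText(html, words) {
  const doc = new DOMParser().parseFromString(html, "text/html");
  for (const node of Array.from(doc.body.childNodes)) {
    if (node.nodeType !== Node.TEXT_NODE) continue; // skip elements
    const pieces = splitByWords(node.nodeValue, words).map((seg) => {
      if (!seg.match) return doc.createTextNode(seg.text);
      const span = doc.createElement("span");
      span.className = "notranslate";
      span.textContent = seg.text;
      return span;
    });
    node.replaceWith(...pieces);
  }
  return doc.body.innerHTML;
}
```

Because the result is serialized by the browser, attribute quoting in the output may differ from the input (for example double quotes instead of single quotes).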

Tags: html, regex, replace

Sanya530

2 Answers

Jerry

Explosion Pills
