
Regular expression: select a portion of text inside another

I am a bit clueless about the following task. I want to select text between double quotes (") only when it is inside a tag, not outside of it, i.e. a selection inside another selection.

The tag is delimited by <| and |>, and I want to select text only if it is both between quotes and between the tag delimiters.

<| blah blah blah "should be selected" not selected "select it too" |> "not selected too"

I tried something like

(\<\|)(\").*?(\")(\|\>)   

But it doesn't work.

magallanes asked Dec 20 '15 13:12


2 Answers

I've got it to match correctly using two regexes.

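var input = '<|a "b"|>c "d"ef<|"g"h "i"|>"j"k l';
// Step 1: collect every <|...|> block (fall back to [] if there are none).
var output = (input.match(/<\|(.*?)\|>/g) || [])
   // Step 2: inside each block, collect every "quoted" run.
   .map(function (x) { return x.match(/"(.*?)"/g) || []; });
alert(output);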

As you can see, it correctly matches "b", "g" and "i".

The principle:

  1. find all the matches of text between <| and |>
  2. for every match from the first step, find matches of text between two quotes.

(used the regex from the second answer from the linked question)
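
The two steps above can also be wrapped in a small function that returns one flat list instead of a nested array. This is only a sketch: the name quotedInsideTags is hypothetical, and the `|| []` guards are an addition so that input with no tags, or a tag with no quoted text, does not throw (match returns null in those cases).

```javascript
// Hypothetical helper, not part of the original answer.
function quotedInsideTags(input) {
  // Step 1: every <|...|> block, or an empty list if there is none.
  var blocks = input.match(/<\|(.*?)\|>/g) || [];
  var result = [];
  blocks.forEach(function (block) {
    // Step 2: every "quoted" run inside this block.
    var quoted = block.match(/"(.*?)"/g) || [];
    result = result.concat(quoted);
  });
  return result;
}

var found = quotedInsideTags('<|a "b"|>c "d"ef<|"g"h "i"|>"j"k l');
console.log(found); // [ '"b"', '"g"', '"i"' ]
```

The matches keep their surrounding quotes; strip them afterwards if only the inner text is needed.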

nicael answered Oct 19 '22 15:10


This will do the job in a single regex, provided the engine supports variable-length lookbehind (.NET does, for example):

(?<=<\|[^>]*)"[^"]*"

Following up on a comment by nicael: the input string might not be tagged correctly. This will help:

(?<=<\|((?!\|>).)*)"[^"]*"
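
To see where the two lookbehind patterns differ, here is a rough comparison. It assumes a JavaScript engine with lookbehind support (ES2018 or later, such as a current Node.js); the variable names and the second sample string are made up for illustration. The first pattern gives up as soon as any > appears between <| and the quote, while the second only stops at a real |> closer.

```javascript
// Assumption: the engine supports lookbehind (ES2018+).
var simple = /(?<=<\|[^>]*)"[^"]*"/g;
var tempered = /(?<=<\|((?!\|>).)*)"[^"]*"/g;

// Sample from the first answer: both patterns agree here.
var wellFormed = '<|a "b"|>c "d"ef<|"g"h "i"|>"j"k l';
// Made-up sample with a lone > inside the tag.
var loneAngle = '<|1 > 0 "x"|> "y"';

var simpleWell = wellFormed.match(simple) || [];
var temperedWell = wellFormed.match(tempered) || [];
var simpleLone = loneAngle.match(simple) || [];
var temperedLone = loneAngle.match(tempered) || [];

console.log(simpleWell);   // [ '"b"', '"g"', '"i"' ]
console.log(temperedWell); // [ '"b"', '"g"', '"i"' ]
console.log(simpleLone);   // []
console.log(temperedLone); // [ '"x"' ]
```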

If you need to use it in a JavaScript engine without lookbehind support, this version relies on lookaheads only:

(?=("[^"]*"[^"]*)*$)"[^"]*"(?=((?!<\|).)*\|>)
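
A quick way to try the lookahead-only pattern is to run it with the g flag against the sample string from the first answer. This is a sketch under that assumption; the variable names are illustrative. The first lookahead checks that the rest of the string consists of complete quoted pairs, so a match can only start at an opening quote; the second checks that a |> follows before any new <|, so the quote sits inside a tag.

```javascript
// Pattern copied from the answer, with the g flag added to get all matches.
var pattern = /(?=("[^"]*"[^"]*)*$)"[^"]*"(?=((?!<\|).)*\|>)/g;

var sample = '<|a "b"|>c "d"ef<|"g"h "i"|>"j"k l';
// match() returns null when nothing matches, hence the fallback.
var insideTags = sample.match(pattern) || [];

console.log(insideTags); // [ '"b"', '"g"', '"i"' ]
```

Note that this approach assumes the quotes in the input are balanced, and that . does not need to cross line breaks.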

Sebastian Schumann answered Oct 19 '22 16:10