
Exclude the last character of a regex match

Tags:

regex

I have the following regex:

%(?:\\.|[^%\\ ])*%([,;\\\s]) 

That works great, but it also matches the character that follows the closing %.

How could I exclude that character from the match?

For instance, if I have:

The files under users\%username%\desktop\ are:

It will highlight %username%\ but I just want %username%. On the other hand, if I leave the regex like this:

%(?:\\.|[^%\\ ])*%

...then it will match this pattern, which I don't want it to match:

%example1%example2%example3

Any idea how to exclude the last character in the match through a regex?

user3587624 asked Nov 12 '15 21:11



2 Answers

%(?:\\.|[^%\\ ])*%(?=[,;\\\s])

                   ^^

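Use a lookahead. What you need here is a zero-width assertion: it checks that one of those characters follows the closing %, but it does not consume that character, so it is not part of the match.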
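A minimal sketch of the difference, assuming Python's re module (the question does not say which regex engine is used). The variable names are only for illustration; the two patterns and the sample strings are taken from the question and the answer above.

```python
import re

# Pattern from the question: the trailing delimiter is consumed by a capturing group.
CONSUMING = re.compile(r'%(?:\\.|[^%\\ ])*%([,;\\\s])')
# Same pattern with the delimiter moved into a zero-width lookahead.
LOOKAHEAD = re.compile(r'%(?:\\.|[^%\\ ])*%(?=[,;\\\s])')

text = r'The files under users\%username%\desktop\ are:'

# The whole match of the original pattern includes the backslash after the closing %.
consumed = CONSUMING.search(text).group(0)
# The lookahead version stops at the closing %.
trimmed = LOOKAHEAD.search(text).group(0)
# The chained string from the question is still rejected.
chained = LOOKAHEAD.findall('%example1%example2%example3')

print(consumed)  # %username%\
print(trimmed)   # %username%
print(chained)   # []
```

Note that the lookahead still requires one of the listed characters to follow, so a %variable% at the very end of the input would not match. If that case matters, one option is to allow end of input as well, for example (?=[,;\\\s]|$).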

vks answered Sep 23 '22 17:09


You can use a more efficient regex than the one you are currently using. When alternation is used together with a quantifier, there is unnecessary backtracking involved.

If the strings you have are short, the alternation is OK to use. However, if they can be a bit longer, you may need to "unroll" the expression.

Here is how it is done:

%[^"\\%]*(?:\\.[^"\\%]*)*%

Regex breakdown:

  • % - initial percentage sign
  • [^"\\%]* - start of the unrolled pattern: 0 or more characters other than a double quote, backslash and percentage sign
  • (?:\\.[^"\\%]*)* - 0 or more sequences of...
    • \\. - a literal backslash followed by any character other than a newline
    • [^"\\%]* - 0 or more characters other than a double quote, backslash and percentage sign
  • % - trailing percentage sign

See this demo - 6 steps vs. 30 steps with your %(?:\\.|[^" %\d\\])*%.
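As a rough check that unrolling changes only how the engine works and not what it matches, here is a sketch in Python (an assumed engine, not one named in the thread). It compares the unrolled pattern with an alternation form over the same character class; the alternation form, the helper name matches and the sample strings other than the path are made up for illustration. Step counts are engine-specific and are not measured here.

```python
import re

# Alternation form over the same character class as the unrolled pattern (illustrative).
ALTERNATION = re.compile(r'%(?:\\.|[^"\\%])*%')
# Unrolled form from the answer above.
UNROLLED = re.compile(r'%[^"\\%]*(?:\\.[^"\\%]*)*%')

def matches(pattern, text):
    """Return every full match of pattern in text."""
    return pattern.findall(text)

samples = [
    r'The files under users\%username%\desktop\ are:',
    r'%a\%b% and %c%',  # contains an escaped percent sign
    'no percent here',
]

for sample in samples:
    # Both forms are expected to agree on every sample.
    print(matches(UNROLLED, sample), matches(ALTERNATION, sample))
```

Neither of these two patterns includes the lookahead from the other answer; appending (?=[,;\\\s]) to the unrolled form should combine both ideas.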

Wiktor Stribiżew answered Sep 23 '22 17:09