
What does (?ms) in Regex mean?

I have the following regex in PowerShell:

[regex]$regex = 
@'
(?ms).*?<DIV class=row>.*?
'@

What does (?ms) mean here?

asked Dec 28 '14 at 20:12 by Powershel




1 Answer

(?m) is the modifier for multi-line mode. It makes ^ and $ match the beginning and end of a line, respectively, instead of matching the beginning and end of the input.

For example, given the input:

ABC DEF
GHI

The regex ^[A-Z]{3} will match:

  1. "ABC"

Meanwhile, the regex (?m)^[A-Z]{3} will match:

  1. "ABC"
  2. "GHI"
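
If you want to check this yourself, here is a minimal sketch. It uses Python's re module, which is my assumption for convenience (the question is about PowerShell, i.e. .NET regex); on this input the (?m) modifier behaves the same way in both.

```python
import re

# Sample input from the example above: two lines, no trailing newline.
text = "ABC DEF\nGHI"

# Without (?m), ^ matches only at the very start of the input.
default_matches = re.findall(r"^[A-Z]{3}", text)

# With (?m), ^ also matches immediately after each newline.
multiline_matches = re.findall(r"(?m)^[A-Z]{3}", text)

print(default_matches)    # ['ABC']
print(multiline_matches)  # ['ABC', 'GHI']
```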

(?s) is the modifier for single-line mode. It lets . match newline characters, which it does not match by default.

Given the same input as before, the regex [A-Z]{3}. will match (note the inclusion of the space character):

  1. "ABC "

While the regex (?s)[A-Z]{3}. will match:

  1. "ABC "
  2. "DEF\n"
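
A quick way to confirm the difference, sketched in Python's re module (an assumption on my part; the question uses .NET regex through PowerShell, where (?s) has the same effect on this input):

```python
import re

# Same two-line input as before, with no trailing newline.
text = "ABC DEF\nGHI"

# By default, . refuses to match "\n", so "DEF" + newline is not a match.
default_matches = re.findall(r"[A-Z]{3}.", text)

# With (?s), . matches any character, including "\n".
dotall_matches = re.findall(r"(?s)[A-Z]{3}.", text)

print(default_matches)  # ['ABC ']
print(dotall_matches)   # ['ABC ', 'DEF\n']
```

"GHI" never matches in either case because no character follows it.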

Despite their names, the two modes are not mutually exclusive: (?m) only changes how ^ and $ behave, and (?s) only changes how . behaves. Flag names and support vary between regex flavors, but in most of them, including .NET (which PowerShell uses), the two can be combined. You can use both at once by writing (?m)(?s) or, in shorter form, (?ms).
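
To illustrate that the two spellings are interchangeable, here is a small sketch in Python's re module (my assumption for a runnable check, not the .NET engine from the question). The input string and the pattern ^b.+ are made up for the demonstration, chosen so that neither modifier is enough on its own.

```python
import re

# Hypothetical three-line input, not taken from the question.
text = "a1\nb2\nc3"

# Three ways of enabling both modes at once.
combined = re.findall(r"(?ms)^b.+", text)
separate = re.findall(r"(?m)(?s)^b.+", text)
via_flags = re.findall(r"^b.+", text, re.MULTILINE | re.DOTALL)

# Each mode on its own, for comparison.
only_m = re.findall(r"(?m)^b.+", text)  # ^ finds line 2, but . stops at "\n"
only_s = re.findall(r"(?s)^b.+", text)  # ^ is stuck at the start, which is "a"

print(combined)                           # ['b2\nc3']
print(combined == separate == via_flags)  # True
print(only_m)                             # ['b2']
print(only_s)                             # []
```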

EDIT:

There are certain situations where you might want to use (?ms). The following examples are a bit contrived, but I think they serve our purpose. Given the input (note the space after "ABC"):

ABC
DEF
GHI

The regex (?ms)^[A-Z]{3}. matches:

  1. "ABC "
  2. "DEF\n"

While both (?m)^[A-Z]{3}. and (?s)^[A-Z]{3}. match:

  1. "ABC "
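
The three results above can be reproduced with a short sketch. Python's re module is used here as an assumption for easy testing; the modifiers behave the same way in .NET for this input.

```python
import re

# Three lines, with a space after "ABC" and no trailing newline.
text = "ABC \nDEF\nGHI"
pattern = r"^[A-Z]{3}."

both = re.findall("(?ms)" + pattern, text)    # ^ per line, . matches "\n"
only_m = re.findall("(?m)" + pattern, text)   # ^ per line, . stops at "\n"
only_s = re.findall("(?s)" + pattern, text)   # ^ only at the start of the input

print(both)    # ['ABC ', 'DEF\n']
print(only_m)  # ['ABC ']
print(only_s)  # ['ABC ']
```

"DEF\n" needs both modes: (?m) so that ^ can match at the start of the second line, and (?s) so that . can consume the newline after "DEF".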

answered Oct 22 '22 at 05:10 by Tutleman