I've got a little trouble using regex in PowerShell. It seems like there is an implementation error or something.
The text I want to work with is an HTML file, which looks like this (Example1):
<span>[Mobile: %mobile% |] Phone: %telephone% [| Fax: %faxNumber%]</span>
<Span>
The problem is that, because of HTML editors, I may also get something like this (Example2):
<span>[Mobile:
%mobile% |] Phone: %telephone% [| Fax: %faxNumber%]</span>
So as you see, we get line breaks and HTML-escaped, fixed (non-breaking) whitespace.
My PowerShell regex looks like this:
$x = $x -ireplace '(?ms)\[(.?){7}Fax(.*?)\]', 'MyReplacement1'
and this
$x = $x -ireplace '(?ms)\[(.?){7}Mobile(.*?)\]', 'MyReplacement2'
Basically, the [ marks the beginning of a variable and the ] marks the end of it. Two problems arise from this:
I use (.?){7} to allow SOME (here up to 7) characters and avoid matching the whole part between the first [ near Mobile and the last ] near Fax (which would happen if I used (.*?) instead of (.?){7}). I'm not sure if there are alternatives that would let me allow ANY number (and not just 7) of characters between the starting [ and the variable keyword, "Fax" for example. This would be useful to avoid mismatches when more stuff gets added (where only 7 characters would not be enough and, like I said, (.*?) will fail).
Hope I was able to explain it (kinda hard); if not, please feel free to ask! I'm grateful for any help, and even regex recommendations from the pros, to avoid any further problems I'm not thinking about right now...
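To make the over-matching concrete, here is a minimal sketch run against Example1. The variable names $tooWide and $limited are only illustrative, and the patterns are the ones quoted in the question with (.*?) swapped in for comparison.

```powershell
$x = '<span>[Mobile: %mobile% |] Phone: %telephone% [| Fax: %faxNumber%]</span>'

# Lazy (.*?) still starts at the FIRST [ and grows until it reaches "Fax",
# so the Mobile block and the Phone text are swallowed as well.
$tooWide = $x -ireplace '(?s)\[(.*?)Fax(.*?)\]', 'MyReplacement1'
$tooWide   # <span>MyReplacement1</span>

# (.?){7} allows at most 7 characters before "Fax", so the first [ cannot
# start a match and only the Fax block is replaced.
$limited = $x -ireplace '(?s)\[(.?){7}Fax(.*?)\]', 'MyReplacement1'
$limited   # <span>[Mobile: %mobile% |] Phone: %telephone% MyReplacement1</span>
```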
EDIT: (Example3):
<span>[Mobile:
%mobile% |] Phone: %telephone% [| Fax:
%faxNumber%]</span>
The trick around DotAll mode is to use [\s\S] instead of the dot (.). This character class matches any character (because it matches space and non-space characters). (As does [\w\W] or [\d\D], but the spaces seem to be kind of a convention.)
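A small sketch of the difference, assuming a made-up input string ($text) with a line break inside the brackets. The short patterns here are only for demonstration and are not the final recommendation.

```powershell
# Hypothetical sample: a bracketed variable that spans two lines.
$text = "[| Fax:`n%faxNumber%]"

# Without (?s), the dot stops at the newline, so nothing matches
# and the string comes back unchanged.
$withDot = $text -replace '\[.*\]', 'X'

# [\s\S] matches every character, newline included, without any modifier.
$withClass = $text -replace '\[[\s\S]*?\]', 'X'

$withDot -ceq $text   # True
$withClass            # X
```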
To get around the 7, you can simply disallow a closing ] before the one you actually want to match (which, by the way, also makes DotAll unnecessary). So something like this should work fine for you:
\[([^\]:]*)Fax([^\]]*)\]
It looks a bit ugly, but it simply means this:
\[ # literal [
( # capturing group 1
[^\]:]* # match as many non-:, non-] characters as possible
) # end of group 1
Fax # literal Fax
( # capturing group 2
[^\]]* # match as many non-] characters as possible
) # end of group 2
\] # literal ]
Further reading on character classes.
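If you want to keep the commented layout in a script, .NET's free-spacing mode (?x) should let you use it more or less as written. This is only a sketch: the $pattern and $result names are mine, and it relies on a single-quoted here-string so PowerShell does not touch the backslashes.

```powershell
# (?x) ignores whitespace outside character classes and treats # as a comment start.
$pattern = @'
(?x)
\[          # literal [
(           # capturing group 1
  [^\]:]*   # as many non-:, non-] characters as possible
)           # end of group 1
Fax         # literal Fax
(           # capturing group 2
  [^\]]*    # as many non-] characters as possible
)           # end of group 2
\]          # literal ]
'@

$x = '<span>[Mobile: %mobile% |] Phone: %telephone% [| Fax: %faxNumber%]</span>'
$result = $x -ireplace $pattern, 'MyReplacement2'
$result   # <span>[Mobile: %mobile% |] Phone: %telephone% MyReplacement2</span>
```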
Note that none of these patterns needs multiline mode m (neither yours nor mine), because all it does is make ^ and $ match line beginnings and endings, respectively. But none of the patterns contains these meta-characters, so the modifier does not do anything.
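A quick way to convince yourself, as a sketch with illustrative variable names: the pattern gives the same result with and without (?m), while a pattern that does use ^ changes behaviour.

```powershell
$x = '<span>[Mobile: %mobile% |] Phone: %telephone% [| Fax: %faxNumber%]</span>'

# No ^ or $ in the pattern, so (?m) has nothing to act on.
$plain = $x -ireplace '\[([^\]:]*)Fax([^\]]*)\]', 'MyReplacement2'
$multi = $x -ireplace '(?m)\[([^\]:]*)Fax([^\]]*)\]', 'MyReplacement2'
$plain -ceq $multi   # True

# With an anchor, the modifier does matter.
$startOnly = "a`nb" -replace '^', '> '      # prefix added once, at the start of the string
$everyLine = "a`nb" -replace '(?m)^', '> '  # prefix added at the start of each line
```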
My console output:
PS> $x = "<span>[Mobile: %mobile% |] Phone: %telephone% [| Fax: %faxNumber%]</span>"
PS> $x -ireplace '\[([^\]:]*)Mobile([^\]]*)\]', 'MyReplacement1'
<span>MyReplacement1 Phone: %telephone% [| Fax: %faxNumber%]</span>
PS> $x -ireplace '\[([^\]:]*)Fax([^\]]*)\]', 'MyReplacement2'
<span>[Mobile: %mobile% |] Phone: %telephone% MyReplacement2</span>
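The same two replacements should also cope with input shaped like Example3. The sketch below assumes the editor inserts &nbsp; followed by a line break after the colon; that exact input is my guess at what such a file looks like, not something taken from your data.

```powershell
# Assumed input: &nbsp; plus a newline inside both bracketed variables.
$x = "<span>[Mobile:&nbsp;`n%mobile% |] Phone: %telephone% [| Fax:&nbsp;`n%faxNumber%]</span>"

# Negated character classes match newlines too, so no (?s) is needed.
$x = $x -ireplace '\[([^\]:]*)Mobile([^\]]*)\]', 'MyReplacement1'
$x = $x -ireplace '\[([^\]:]*)Fax([^\]]*)\]', 'MyReplacement2'

$x   # <span>MyReplacement1 Phone: %telephone% MyReplacement2</span>
```

Because [^\]]* has no length limit, the amount of text between the keyword and the closing ] does not matter, as long as it contains no ] itself.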