
Regex to match all https URLs except a certain path

I need a regex that will match all https URLs except for a certain path.

e.g.

Match

https://www.domain.com/blog
https://www.domain.com

Do Not Match

https://www.domain.com/forms/*

This is what I have so far:

<rule name="Redirect from HTTPS to HTTP excluding /forms" enabled="true" stopProcessing="true">
    <match url=".*" />
    <conditions>
        <add input="{URL}" pattern="^https://[^/]+(/(?!(forms/|forms$)).*)?$" />
    </conditions>
    <action type="Redirect" url="http://{HTTP_HOST}/{R:0}" redirectType="Permanent" />
</rule>

But it doesn't work.

Burt asked Aug 05 '13 20:08


4 Answers

Because the URL Rewrite module tests the match url pattern against the request path only (no scheme, no host, no leading slash), you should simply use:

<rule name="Redirect from HTTPS to HTTP excluding /forms" stopProcessing="true">
    <match url="^forms/?" negate="true" />
    <conditions>
        <add input="{HTTPS}" pattern="^ON$" />
    </conditions>
    <action type="Redirect" url="http://{HTTP_HOST}/{R:0}" />
</rule>

The rule triggers the redirect to HTTP only if the request was made over HTTPS and the path does not start with forms or forms/ (the negate="true" option inverts the match).
You could also add a condition requiring the host to match www.example.com, as follows:

<rule name="Redirect from HTTPS to HTTP excluding /forms" stopProcessing="true">
    <match url="^forms/?" negate="true" />
    <conditions>
        <add input="{HTTPS}" pattern="^ON$" />
        <add input="{HTTP_HOST}" pattern="^www\.example\.com$" />
    </conditions>
    <action type="Redirect" url="http://{HTTP_HOST}/{R:0}" />
</rule>
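
To make the rule's decision logic concrete, here is a minimal Python sketch of it. It assumes IIS hands the match pattern the path without its leading slash and compares case-insensitively by default; the function name and its arguments are hypothetical, and Python's `re` is only a stand-in for the regex engine IIS uses.

```python
import re

# Stand-in for <match url="^forms/?" negate="true" />.
# Assumption: the pattern sees the path without a leading slash,
# and matching is case-insensitive (the IIS default).
EXCLUDED_PATH = re.compile(r"^forms/?", re.IGNORECASE)

def should_redirect_to_http(path: str, is_https: bool) -> bool:
    """Return True when the sketched rule would fire for this request."""
    path_is_excluded = EXCLUDED_PATH.search(path) is not None
    # negate="true" means the rule applies only when the pattern does NOT match.
    return is_https and not path_is_excluded

for path in ["", "blog", "forms", "forms/contact", "forms_instructions"]:
    print(repr(path), should_redirect_to_http(path, is_https=True))
```

In this sketch forms_instructions is excluded too, because ^forms/? only requires the path to begin with forms. If that is not wanted, a pattern such as ^forms(/|$) would limit the exclusion to the forms segment itself.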

cheesemacfly answered Nov 03 '22 08:11


Does this give you the behavior you're looking for?

https?://[^/]+($|/(?!forms)/?.*$)

After the www.domain.com bit, it's looking for either the end of the string, or for a slash and then something that ISN'T forms.
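
One way to check this pattern against the URLs from the question is a quick Python sketch. Python's `re` is used here only as a stand-in for whatever engine the rewrite rule runs on; lookaheads behave the same way for a pattern this simple. The helper name is hypothetical.

```python
import re

# The pattern from this answer, unchanged.
PATTERN = re.compile(r"https?://[^/]+($|/(?!forms)/?.*$)")

def matches(url: str) -> bool:
    """Return True if the pattern matches the URL from its start."""
    return PATTERN.match(url) is not None

for url in [
    "https://www.domain.com",
    "https://www.domain.com/blog",
    "https://www.domain.com/forms/contact",
    "https://www.domain.com/forms_instructions",
]:
    print(url, matches(url))
```

The last URL is rejected as well: (?!forms) rules out every path that begins with forms, not just /forms/.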

BlairHippo answered Nov 03 '22 09:11


I came up with the following pattern: ^https://[^/]+(/(?!forms/|forms$).*)?$

Explanation:

  • ^ : match the beginning of the string
  • https:// : match https://
  • [^/]+ : match one or more characters that are not a forward slash
  • ( : start matching group 1
    • / : match /
    • (?! : negative lookahead
      • forms/ : check that forms/ does not follow
      • | : or
      • forms$ : check that forms does not follow at the end of the string
    • ) : end negative lookahead
    • .* : match any character zero or more times
  • ) : end matching group 1
  • ? : make group 1 optional
  • $ : match the end of the string
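
The breakdown above can be checked with a short Python sketch. It writes the pattern with the question's forms path segment and uses Python's `re` as a stand-in for the rewrite engine; the helper name is hypothetical.

```python
import re

# Pattern as explained above, using the "forms" segment from the question.
PATTERN = re.compile(r"^https://[^/]+(/(?!forms/|forms$).*)?$")

def matches(url: str) -> bool:
    """Return True if the whole URL matches the pattern."""
    return PATTERN.match(url) is not None

for url in [
    "https://www.domain.com",
    "https://www.domain.com/blog",
    "https://www.domain.com/forms",
    "https://www.domain.com/forms/contact",
    "https://www.domain.com/forms_instructions",
    "http://www.domain.com/blog",
]:
    print(url, matches(url))
```

Because the lookahead only rejects forms/ and a trailing forms, a path such as /forms_instructions still matches.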

HamZa answered Nov 03 '22 10:11


I see two issues in the posted pattern http://[^/]+($|/(?!forms)/?.*$)

  • It fails to redirect URLs such as https://domain.com/forms_instructions, because the (?!forms) lookahead rejects any path that begins with forms, not just /forms/.

  • I believe you have http and https reversed between the pattern and the URL. The pattern should have https and the URL http.

Perhaps this will work as you intend:

<rule name="Redirect from HTTPS to HTTP excluding /forms" enabled="true" stopProcessing="true">
    <match url="^https://[^/]+(/(?!(forms/|forms$)).*)?$" />
    <action type="Redirect" url="http://{HTTP_HOST}{R:1}" redirectType="Permanent" />
</rule>

Edit: I've moved the pattern into the match tag itself, since matching everything with .* and then filtering with an additional condition seems unnecessary. I've also changed the redirect URL to use the part of the input URL captured by the first pair of parentheses in the match ({R:1}).
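
As a rough illustration of how the captured group feeds the redirect URL, here is a Python sketch. It models only the regex and the {R:1} back-reference, not what input IIS actually passes to the pattern; the function name and the host argument (standing in for {HTTP_HOST}) are hypothetical.

```python
import re
from typing import Optional

# Same pattern as in the rule; group 1 is the path, including its leading slash.
PATTERN = re.compile(r"^https://[^/]+(/(?!(forms/|forms$)).*)?$")

def redirect_target(url: str, host: str) -> Optional[str]:
    """Return the HTTP redirect URL, or None when the pattern does not match."""
    m = PATTERN.match(url)
    if m is None:
        return None
    # Group 1 is optional, so it is None for a bare host with no path.
    path = m.group(1) or ""
    return f"http://{host}{path}"

print(redirect_target("https://www.domain.com/blog/post", "www.domain.com"))
print(redirect_target("https://www.domain.com", "www.domain.com"))
print(redirect_target("https://www.domain.com/forms/contact", "www.domain.com"))
```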

Sundar R answered Nov 03 '22 09:11