I am new to regex and am studying it on regularexperssion.com. What is the purpose of a colon (:) in regular expressions?
For example:
$pattern = '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?$/';
which matches:
$url1 = "http://www.somewebsite.com";
$url2 = "https://www.somewebsite.com";
$url3 = "https://somewebsite.com";
$url4 = "www.somewebsite.com";
$url5 = "somewebsite.com";
Yeah, any help would be greatly appreciated.
Colon does not have special meaning in a character class and does not need to be escaped.
$ means "Match the end of the string" (the position after the last character in the string).
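A quick way to check both points, sketched in Python's re module (my choice for illustration; the question uses PHP, but these two constructs should behave the same way there). The sample strings are made up.

```python
import re

# A colon inside a character class is an ordinary character; no backslash needed.
time_sep = re.compile(r"[:.]")
parts = time_sep.split("12:30.45")

# $ anchors the match at the end of the string.
ends_with_com = re.compile(r"com$")
hit = ends_with_com.search("somewebsite.com")
miss = ends_with_com.search("somewebsite.com/index")

print(parts)                             # ['12', '30', '45']
print(hit is not None, miss is not None)  # True False
```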
If you use a semicolon (;) in a keyword expression, it splits the keywords into multiple parts. The semicolon is not one of the standard regex metacharacters and can normally be used as a literal in regular expressions, but it has a different function in HES, so it cannot be used in expressions there.
A colon : is simply a colon. It has no special meaning on its own; it only shows up as part of a few special constructs, for example clustering without capturing (also known as a non-capturing group):
(?:pattern)
It is also part of the syntax of POSIX character classes, for example:
[[:upper:]]
However, in your case the colon is just a colon.
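Here is a small sketch of the difference, using Python's re module as a stand-in for PHP (an assumption; the group syntax is the same in both). The "key:value" string is a made-up sample. The POSIX class form is left out because Python's re does not support it.

```python
import re

text = "key:value"

# Here the colon is a plain literal separating two captured words.
literal = re.match(r"(\w+):(\w+)", text)

# (?:...) groups "key:" so that ? can make it optional, without capturing it.
non_capturing = re.match(r"(?:\w+:)?(\w+)", text)

# The same group written with plain parentheses does capture.
capturing = re.match(r"(\w+:)?(\w+)", text)

print(literal.groups())        # ('key', 'value')
print(non_capturing.groups())  # ('value',)
print(capturing.groups())      # ('key:', 'value')
```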
Special characters used in your regex:
In character class [-+_~.\d\w]:
- means -
+ means +
_ means _
~ means ~
. means .
\d means any digit
\w means any word character
These symbols have this meaning because they are used inside a character class []. Outside a character class, + and . have special meaning.
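To see the inside/outside difference for + and . in practice, here is a minimal sketch in Python's re module (used for illustration only; the behaviour of these characters should be the same in PHP). The sample string is made up.

```python
import re

sample = "a+b.c"

# Inside [...] the + and . are literals, so only those two characters match.
inside = re.findall(r"[+.]", sample)

# Outside a class, . matches any character (except newline) and + repeats it.
outside = re.findall(r".+", sample)

# Escaping restores the literal meaning outside a class.
escaped = re.findall(r"\+|\.", sample)

print(inside)   # ['+', '.']
print(outside)  # ['a+b.c']
print(escaped)  # ['+', '.']
```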
Other elements:
=? means = that can occur 0 or 1 times; in other words, an optional =.
I've decided to go you one better and explain the entire regex:
^                       # anchor to start of line
(                       # start grouping
  (                     # start grouping
    [\w]+               # at least one of 0-9a-zA-Z_
    :                   # a literal colon
  )                     # end grouping
  ?                     # this grouping is optional
  \/\/                  # two literal slashes
)                       # end capture
?                       # this grouping is optional
(
  (
    [\d\w]              # exactly one of 0-9a-zA-Z_
                        # having \d is redundant
    |                   # alternation
    %                   # literal % sign
    [a-fA-f\d]{2,2}     # exactly 2 hexadecimal digits
                        # should probably be A-F
                        # using {2} would have sufficed
  )+                    # at least one of these groups
  (                     # start grouping
    :                   # literal colon
    (
      [\d\w]
      |
      %
      [a-fA-f\d]{2,2}
    )+
  )?                    # Same grouping, but it is optional
                        # and there can be only one
  @                     # literal @ sign
)?                      # this group is optional
(
  [\d\w]                # same as [\w], explained above
  [-\d\w]{0,253}        # includes a dash (-) as a valid character
                        # between 0 and 253 of these characters
  [\d\w]                # end with \w. They want at most 255
                        # total and - cannot be at the start
                        # or end
  \.                    # literal period
)+                      # at least one of these groups
[\w]{2,4}               # two to four \w characters
(
  :                     # literal colon
  [\d]+                 # at least one digit
)?
(
  \/                    # literal slash
  (
    [-+_~.\d\w]         # one of these characters
    |                   # *or*
    %                   # % with two hex digit combo
    [a-fA-f\d]{2,2}
  )*                    # zero or more of these groups
)*                      # zero or more of these groups
(
  \?                    # literal question mark
  (
    &?                  # optional literal &
    (
      [-+_~.\d\w]
      |
      %
      [a-fA-f\d]{2,2}
    )
    =?                  # optional literal =
  )*                    # zero or more of this group
)?                      # this group is optional
(
  #                     # literal #
  (
    [-+_~.\d\w]
    |
    %
    [a-fA-f\d]{2,2}
  )*
)?
$                       # anchor to end of line
It's important to understand what the metacharacters/sequences are. Some sequences are not meta when used in certain contexts (especially a character class). I've cataloged them for you:
^ -- zero-width start of line
() -- grouping/capture
? -- zero or one of the preceding sequence
+ -- one or more of the preceding sequence
* -- zero or more of the preceding sequence
[] -- character class
\w -- alphanumeric characters and _. Opposite of \W
| -- alternation
{} -- repetition count for the preceding sequence
$ -- zero-width end of line
This leaves :, @, and % without any special/meta meaning in the raw context.
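To watch the literal colons do their job, here is a sketch that runs the question's pattern through Python's re module instead of PHP (an assumption on my part; the pattern uses no PHP-specific syntax once the / delimiters are dropped). The URL with a port is a hypothetical example, and the group numbers come from counting opening parentheses in the pattern.

```python
import re

# The pattern from the question, minus PHP's / delimiters, split for readability.
URL_PATTERN = re.compile(
    r"^(([\w]+:)?\/\/)?"
    r"(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+)?@)?"
    r"([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}"
    r"(:[\d]+)?"
    r"(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*"
    r"(\?(&?([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?"
    r"(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?$"
)

# The five URLs from the question.
urls = [
    "http://www.somewebsite.com",
    "https://www.somewebsite.com",
    "https://somewebsite.com",
    "www.somewebsite.com",
    "somewebsite.com",
]
all_match = all(URL_PATTERN.match(u) for u in urls)

# Hypothetical URL with a scheme and a port, to show where the literal colons land.
m = URL_PATTERN.match("https://www.somewebsite.com:8080/index")
scheme, port = m.group(2), m.group(8)

print(all_match)  # True
print(scheme)     # https:
print(port)       # :8080
```

Group 2 is the ([\w]+:) scheme part and group 8 is the (:[\d]+) port part; in both, the colon matches itself and nothing more.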
Inside a character class, ] ends the class, and - creates a range of characters unless it is at the start or the end of the character class or escaped with a backslash.
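The three cases for - can be checked with a short sketch (Python's re module, chosen for illustration; the sample string is made up):

```python
import re

sample = "a-c b d"

# Between two characters, - forms a range: a, b, c and d are matched.
as_range = re.findall(r"[a-d]", sample)

# At the start (or end) of the class, - is a literal dash.
leading_dash = re.findall(r"[-ad]", sample)

# Escaped with a backslash, - is also a literal dash.
escaped_dash = re.findall(r"[a\-d]", sample)

print(as_range)      # ['a', 'c', 'b', 'd']
print(leading_dash)  # ['a', '-', 'd']
print(escaped_dash)  # ['a', '-', 'd']
```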
A (? combination starts a grouping assertion. For example, (?: means group but do not capture. This means that the regex /(?:a)/ matches the string "a", but a is not captured for use in replacement or match groups as it would be with /(a)/.
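(? can also be used for lookahead/lookbehind assertions with ?=, ?!, ?<=, ?<!. Other (? sequences exist as well (for example inline modifiers and named groups); a (? followed by a sequence the engine does not recognize is generally reported as a syntax error rather than treated as a literal ?.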
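A minimal sketch of the four lookaround forms, again in Python's re module as an illustration (the syntax is shared with PHP's PCRE for these simple cases). The sample text is invented; note that none of the matches include the colon itself.

```python
import re

text = "port:8080 user:admin"

# (?<=...) lookbehind: digits right after "port:"; the prefix is not part of the match.
after_port = re.findall(r"(?<=port:)\d+", text)

# (?=...) lookahead: a word followed by a colon; the colon is not consumed.
before_colon = re.findall(r"\w+(?=:)", text)

# (?!...) negative lookahead: a whole word that is NOT followed by a colon.
not_before_colon = re.findall(r"\b\w+\b(?!:)", text)

# (?<!...) negative lookbehind: a whole word that is NOT preceded by a colon.
not_after_colon = re.findall(r"(?<!:)\b\w+\b", text)

print(after_port)        # ['8080']
print(before_colon)      # ['port', 'user']
print(not_before_colon)  # ['8080', 'admin']
print(not_after_colon)   # ['port', 'user']
```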