
Regex to match alphanumeric characters, underscore, periods and dash, allowing dot and dash only in the middle

Presently, I am using this:

if (preg_match('/^[a-zA-Z0-9_]+([a-zA-Z0-9_]*[.-]?[a-zA-Z0-9_]*)*[a-zA-Z0-9_]+$/', $product)) {
    return true;
} else {
    return false;
}

For example, I want to match:

  1. pro.duct-name_
  2. _pro.duct.name
  3. p.r.o.d_u_c_t.n-a-m-e

But I don't want to match:

  1. pro..ductname
  2. .productname-
  3. -productname.
  4. -productname
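
The cases above can be written down as a small table-driven check, which makes it easier to see where the current pattern goes wrong. This is only a sketch, in Python rather than PHP; the pattern is the one from the question minus the `/` delimiters, and the names `CURRENT`, `SHOULD_MATCH`, `SHOULD_NOT_MATCH` and `failures` are made up for the illustration.

```python
import re

# The pattern from the question, without PHP's / delimiters.
CURRENT = re.compile(
    r'^[a-zA-Z0-9_]+([a-zA-Z0-9_]*[.-]?[a-zA-Z0-9_]*)*[a-zA-Z0-9_]+$'
)

# Example strings taken from the two lists in the question.
SHOULD_MATCH = ["pro.duct-name_", "_pro.duct.name", "p.r.o.d_u_c_t.n-a-m-e"]
SHOULD_NOT_MATCH = ["pro..ductname", ".productname-", "-productname.", "-productname"]


def failures(pattern):
    """Return the example strings on which `pattern` gives the wrong answer."""
    wrong = [s for s in SHOULD_MATCH if not pattern.search(s)]
    wrong += [s for s in SHOULD_NOT_MATCH if pattern.search(s)]
    return wrong


print(failures(CURRENT))
```

With these examples the only string reported is `pro..ductname`: the middle group can repeat with nothing but `[.-]?` consumed on each pass, so consecutive dots or dashes slip through.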
banskt asked May 26 '12 06:05



3 Answers

The answer would be

/^[a-zA-Z0-9_]+([-.][a-zA-Z0-9_]+)*$/

if strings containing .- or -. were NOT supposed to match. Why would you allow them to match, anyway? But if you really need these strings to match too, a possible solution is

/^[a-zA-Z0-9_]+((\.(-\.)*-?|-(\.-)*\.?)[a-zA-Z0-9_]+)*$/

The single . or - of the first regex is replaced by a sequence of alternating . and - characters. The sequence starts with either . or -, is optionally followed by -. or .- pairs respectively, and optionally ends with a single - or . respectively, which allows for an even number of alternating characters. This complexity is probably overkill, but the current requirements appear to call for it. If at most 2 alternating . and - characters are allowed, the regex becomes

/^[a-zA-Z0-9_]+((\.-?|-\.?)[a-zA-Z0-9_]+)*$/
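
To compare the three patterns side by side, here is a minimal sketch in Python rather than PHP. The patterns are copied from above without the `/` delimiters; the names `STRICT`, `ALTERNATING`, `MAX_TWO`, `SAMPLES` and `verdicts`, and the sample strings other than the ones from the question, are invented for the illustration.

```python
import re

# The three patterns from this answer, without PHP's / delimiters.
STRICT = re.compile(r'^[a-zA-Z0-9_]+([-.][a-zA-Z0-9_]+)*$')
ALTERNATING = re.compile(
    r'^[a-zA-Z0-9_]+((\.(-\.)*-?|-(\.-)*\.?)[a-zA-Z0-9_]+)*$'
)
MAX_TWO = re.compile(r'^[a-zA-Z0-9_]+((\.-?|-\.?)[a-zA-Z0-9_]+)*$')

SAMPLES = [
    "pro.duct-name_",  # single separators only
    "pro..ductname",   # the same separator twice in a row
    "pro.-duct",       # two alternating separators
    "pro.-.duct",      # three alternating separators
    "-productname",    # leading separator
]


def verdicts(pattern):
    """Map each sample to True/False depending on whether `pattern` accepts it."""
    return {s: bool(pattern.search(s)) for s in SAMPLES}


for label, pattern in [("strict", STRICT),
                       ("alternating", ALTERNATING),
                       ("max two", MAX_TWO)]:
    print(label, verdicts(pattern))
```

All three accept `pro.duct-name_` and reject `pro..ductname` and `-productname`. They differ only on the mixed runs: the first rejects both `pro.-duct` and `pro.-.duct`, the second accepts both, and the third accepts `pro.-duct` but not `pro.-.duct`.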


Walter Tross answered Sep 28 '22 08:09


Try this

(?im)^([a-z_][\w\.\-]+)(?![\.\-])\b

UPDATE 1

(?im)^([a-z_](?:[\.\-]\w|\w)+(?![\.\-]))$

UPDATE 2

(?im)^([a-z_](?:\.\-\w|\-\.\w|\-\w|\.\w|\w)+)$

Explanation

<!--
(?im)^([a-z_](?:\.\-\w|\-\.\w|\-\w|\.\w|\w)+)$

Match the remainder of the regex with the options: case insensitive (i); ^ and $ match at line breaks (m) «(?im)»
Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»
Match the regular expression below and capture its match into backreference number 1 «([a-z_](?:\.\-\w|\-\.\w|\-\w|\.\w|\w)+)»
   Match a single character present in the list below «[a-z_]»
      A character in the range between “a” and “z” «a-z»
      The character “_” «_»
   Match the regular expression below «(?:\.\-\w|\-\.\w|\-\w|\.\w|\w)+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
      Match either the regular expression below (attempting the next alternative only if this one fails) «\.\-\w»
         Match the character “.” literally «\.»
         Match the character “-” literally «\-»
         Match a single character that is a “word character” (letters, digits, and underscores) «\w»
      Or match regular expression number 2 below (attempting the next alternative only if this one fails) «\-\.\w»
         Match the character “-” literally «\-»
         Match the character “.” literally «\.»
         Match a single character that is a “word character” (letters, digits, and underscores) «\w»
      Or match regular expression number 3 below (attempting the next alternative only if this one fails) «\-\w»
         Match the character “-” literally «\-»
         Match a single character that is a “word character” (letters, digits, and underscores) «\w»
      Or match regular expression number 4 below (attempting the next alternative only if this one fails) «\.\w»
         Match the character “.” literally «\.»
         Match a single character that is a “word character” (letters, digits, and underscores) «\w»
      Or match regular expression number 5 below (the entire group fails if this one fails to match) «\w»
         Match a single character that is a “word character” (letters, digits, and underscores) «\w»
Assert position at the end of a line (at the end of the string or before a line break character) «$»
-->
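
The UPDATE 2 pattern can be exercised with a short script. This is a sketch in Python rather than PHP, with the pattern copied verbatim; `UPDATE_2` and `accepts` are hypothetical names, and `re.ASCII` is added on the assumption that `\w` should mean `[a-zA-Z0-9_]` only, as described in the explanation above.

```python
import re

# UPDATE 2 pattern, verbatim. re.ASCII restricts \w to [a-zA-Z0-9_];
# without it Python's \w would also accept non-ASCII letters and digits.
UPDATE_2 = re.compile(
    r'(?im)^([a-z_](?:\.\-\w|\-\.\w|\-\w|\.\w|\w)+)$',
    re.ASCII,
)


def accepts(name):
    """Return True if the pattern matches somewhere in `name`."""
    return UPDATE_2.search(name) is not None


for candidate in ["pro.duct-name_", "pro..ductname", ".productname-",
                  "9lives", "..\nok"]:
    print(repr(candidate), accepts(candidate))
```

Two consequences of the pattern as written show up here. The leading `[a-z_]` rejects a name that starts with a digit (`9lives`), which is stricter than the question's wording. And because of the `m` option, a multi-line string is accepted as soon as any one line is valid (`"..\nok"`), so for validating a single value it may be safer to drop `m`.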


Cylian answered Sep 28 '22 10:09


This should do:

/^[A-Za-z0-9_]([.-]?[A-Za-z0-9_]+)*[.-]?[A-Za-z0-9_]$/

It makes sure that the name begins and ends with an alphanumeric or underscore character. The group in the middle makes sure that there is at most one period or dash in a row, followed by at least one alphanumeric or underscore character.
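
A quick check of this pattern against a few inputs, as a Python sketch rather than PHP. The character classes are written out as `[A-Za-z0-9_]` throughout: a range written `A-z` would also admit the ASCII punctuation that sits between `Z` and `a`, and a middle class without `a-z` would reject lowercase names. `DOMVOYT` and `RESULTS` are names made up for the example, as are the inputs `ab` and `a`.

```python
import re

# Pattern with every class spelled out as A-Za-z0-9_, without PHP's / delimiters.
DOMVOYT = re.compile(
    r'^[A-Za-z0-9_]([.-]?[A-Za-z0-9_]+)*[.-]?[A-Za-z0-9_]$'
)

CASES = ["pro.duct-name_", "pro..ductname", ".productname-", "ab", "a"]

# True where the pattern accepts the string, False where it rejects it.
RESULTS = {s: bool(DOMVOYT.search(s)) for s in CASES}
print(RESULTS)
```

The pattern needs one character at the start and one at the end, so a single-character name such as `a` is rejected. Also, since the separator inside the repeated group is optional, a run of word characters can be split in many ways, and a long input that fails to match may backtrack heavily; making the separator mandatory inside the group avoids that.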

domvoyt answered Sep 28 '22 10:09