
Can we use regex to resolve multiple rules at once?

I need to restrict some user input, and regex is the first thing that comes to mind.

I'll first describe, as best I can, what I need to accomplish, and then post the code I wrote, which covers most of it but is still not perfect...

  • Take the string
  • Do not allow it to start with a space
  • Do not allow multiple consecutive spaces
  • Do not allow multiple consecutive dots
  • Do not allow multiple apostrophes
  • Allow only the following characters: [a-z], space, ' and .
  • Do not allow the string to exceed 14 characters

Now, here's my code:

this.name = this.name
  .replace(/^[\s]+/, '')     // prevent starting with space
  .replace(/\s\s+/, ' ')     // prevent multiple spaces
  .replace(/\.\./, '.')      // prevent multiple dots
  .replace(/''/, '\'')       // prevent multiple apostrophes
  .replace(/[^ a-z'.]/i, '') // allowed
  .toUpperCase()             // transform
  .substring(0, 14)          // do not allow more than 14 characters

Questions:

  • Can we accomplish all, or at least most, of these regex rules in a single replace, and if so, how?

  • How can I fix/improve my regex rules to allow only a single . in the whole string? Right now I only block consecutive dots, so a user can still enter two dots, as in M.G.K.

  • Same as above, but for ' (apostrophe)?

dvlden asked Sep 01 '17 15:09


1 Answer

The first requirement, to "cram" everything into one regex replace operation, is fairly easy to implement, because every replacement you need reuses chars that you match. You may use .replace(/^\s+|(\s)\s+|(\.)\.|(')'|[^ a-z'.]+/ig, '$1$2$3').

Details

  • ^\s+ - start of string (^) and then 1+ whitespaces
  • | - or
  • (\s)\s+ - a single whitespace captured into Group 1 and then 1+ whitespace chars
  • | - or
  • (\.)\. - a . captured into Group 2, and then a .
  • | - or
  • (')' - a ' captured into Group 3 and then a '
  • | - or
  • [^ a-z'.]+ - 1 or more chars other than a space, an ASCII letter, ' or .

The /i modifier makes a-z match in a case insensitive way, g enables multiple matching. The $1 refers to the value in Group 1, $2 references the value in Group 2 and the $3 refers to the Group 3 value. Note that if they are not matched, these values in groups are empty strings, thus we may use three of them together in a single string replacement pattern.
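As a quick check of the combined pattern, here is a minimal sketch. The helper name cleanOnce and the sample inputs are made up for illustration; only the regex and the '$1$2$3' replacement come from the answer above.

```javascript
// The combined pattern from the answer; cleanOnce is a hypothetical helper name.
const CLEANUP = /^\s+|(\s)\s+|(\.)\.|(')'|[^ a-z'.]+/ig;

function cleanOnce(input) {
  // Groups that did not take part in a match expand to an empty string,
  // so at most one of $1, $2, $3 contributes to each replacement.
  return input.replace(CLEANUP, '$1$2$3');
}

console.log(cleanOnce('  John'));     // "John"     (leading whitespace dropped)
console.log(cleanOnce('John   Doe')); // "John Doe" (run of spaces collapsed to one)
console.log(cleanOnce('M..G'));       // "M.G"      (double dot collapsed)
console.log(cleanOnce("O''Neil"));    // "O'Neil"   (double apostrophe collapsed)
console.log(cleanOnce('J0hn!'));      // "Jhn"      (disallowed chars removed)
```

Each alternative handles one rule, and since the alternatives are tried left to right at every position, only one of them can produce a given match.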

The second and third requirements (allowing only a single . and a single ' in the whole string) need two more separate regex replace operations. The point is to match and capture all chars up to the second occurrence of the character, match that second occurrence itself, and then replace the whole match with a backreference to the first group: 1) .replace(/^([^']*'[^']*)'/, '$1') and 2) .replace(/^([^.]*\.[^.]*)\./, '$1').

Details

  • ^ - start of string anchor
  • ([^']*'[^']*) - Group 1:
    • [^']* - any 0+ chars other than ' (a [^...] is a negated character class that matches any chars other than defined inside the class)
    • ' - a single quote
    • [^']* - any 0+ chars other than '
  • ' - a single quote.

This match is replaced with $1, the contents of the first capturing group.

The third pattern is analogous to the second pattern, with ' replaced by . / \. (note that inside a character class, . is a literal dot; it does not act as the wildcard that matches any char except line break chars).
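Putting the three replacements together with the asker's toUpperCase() and substring(0, 14) steps might look like the sketch below. The function names keepFirstOnly and sanitizeName are hypothetical, and the sketch assumes the handler runs on every input change, so removing one extra ' or . per call is enough.

```javascript
// Hypothetical helper: drops the second ' and the second . (one of each per call).
function keepFirstOnly(input) {
  return input
    .replace(/^([^']*'[^']*)'/, '$1')    // keep text up to the 2nd ', drop that '
    .replace(/^([^.]*\.[^.]*)\./, '$1'); // keep text up to the 2nd ., drop that .
}

// Hypothetical wrapper combining the answer's patterns with the asker's last two steps.
function sanitizeName(input) {
  const cleaned = input.replace(/^\s+|(\s)\s+|(\.)\.|(')'|[^ a-z'.]+/ig, '$1$2$3');
  return keepFirstOnly(cleaned).toUpperCase().substring(0, 14);
}

console.log(keepFirstOnly('M.G.K'));                // "M.GK"
console.log(keepFirstOnly("O'Ne'il"));              // "O'Neil"
console.log(sanitizeName('  m.g.k'));               // "M.GK"
console.log(sanitizeName('abcdefghijklmnopqrst'));  // "ABCDEFGHIJKLMN"
```

Because the two anchored patterns have no g flag, a pasted value with three or more dots or apostrophes would need the call repeated until the string stops changing.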

Wiktor Stribiżew answered Sep 19 '22 13:09
