
regex - efficiently capitalize all shortcuts from given list in a text

I have a list of shortcuts:

var shortcuts = ["efa","ame","ict","del","aps","lfb","bis","bbc"...

and a body of text with mixed capitalisation:

var myText = "Lorem ipsum... Efa, efa, EFA ...";

Is it possible to replace all the words in the text that match the shortcut list with a capitalised version of the shortcut using regex? Is it possible to do that without a loop only using String.prototype.replace()?

The desired outcome in my example would be:

myText = "Lorem ipsum... EFA, EFA, EFA ...";
asked Jan 20 '17 at 15:01 by daniel.sedlacek




2 Answers

Build a single regex from the array of strings, then replace every match using the String#replace method with a callback.

var shortcuts = ["efa", "ame", "ict", "del", "aps", "lfb", "bis", "bbc"];

var myText = "Lorem ipsum... Efa, efa, EFA ...";

// build a single regex from the shortcut list
var regex = new RegExp(
  shortcuts
  // escape any character that has a special meaning in a regex;
  // this step is optional and only needed if a shortcut contains such a character
  .map(function(v) {
    // wrap each shortcut in word boundaries so that only whole words match,
    // not substrings inside longer words
    return '\\b' + v.replace(/[|\\{}()[\]^$+*?.]/g, '\\$&') + '\\b';
  })

  // alternatively, apply the word boundaries once around a non-capturing group:
  // '\\b(?:' + shortcuts.map(...).join('|') + ')\\b'

  // join the alternatives with the pipe symbol (OR) and add the
  // global (g) and ignore-case (i) flags
  .join('|'), 'gi');

console.log(
  // replace the string with capitalized text
  myText.replace(regex, function(m) {
    // capitalize the string
    return m.toUpperCase();
  })
  // or with ES6 arrow function
  // .replace(regex, m => m.toUpperCase())
);

See also: Converting user input string to regular expression
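
The grouped variant mentioned in the comments above could be wrapped up roughly like this. This is only a sketch: the helper names escapeRegExp and capitalizeShortcuts are made up for illustration and do not come from the answer itself.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex.
function escapeRegExp(s) {
  return s.replace(/[|\\{}()[\]^$+*?.]/g, '\\$&');
}

// Hypothetical helper: upper-case every whole-word occurrence of a shortcut.
function capitalizeShortcuts(text, shortcuts) {
  // Nothing to replace when the list is empty.
  if (shortcuts.length === 0) return text;

  // One pair of word boundaries around a non-capturing group of alternatives.
  var pattern = '\\b(?:' + shortcuts.map(escapeRegExp).join('|') + ')\\b';
  var regex = new RegExp(pattern, 'gi');

  return text.replace(regex, function (m) {
    return m.toUpperCase();
  });
}

var shortcuts = ["efa", "ame", "ict", "del", "aps", "lfb", "bis", "bbc"];
console.log(capitalizeShortcuts("Lorem ipsum... Efa, efa, EFA ...", shortcuts));
// logs: Lorem ipsum... EFA, EFA, EFA ...
```

Grouping keeps the pattern shorter than repeating \b around every alternative, and the matching behaviour should be the same for shortcuts made of plain letters.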

answered Oct 24 '22 at 09:10 by Pranav C Balan



Assuming you control the initial shortcuts array and you know that it only contains plain letters (nothing that has a special meaning in a regex):

const shortcuts = ["efa","ame","ict","del","aps","lfb","bis","bbc"]

var text = "Lorem ipsum... Efa, efa, EFA, ame, America, enamel, name ..."

var regex = new RegExp("\\b(" + shortcuts.join('|') + ")\\b", 'gi')

console.log(text.replace(regex, s => s.toUpperCase()));

The \b word boundaries prevent the shortcuts from being replaced inside longer words, so "America", "enamel" and "name" are left unchanged.
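
To see what the boundaries buy you, here is a small side-by-side comparison. It is just an illustration: the shortened list and the variable names (demoShortcuts, demoText and so on) are made up for this example.

```javascript
// Shortened, hypothetical list and text, for illustration only.
const demoShortcuts = ["efa", "ame"];
const demoText = "ame, America, enamel, name";

// Same alternation, once without and once with \b word boundaries.
const unbounded = new RegExp("(" + demoShortcuts.join("|") + ")", "gi");
const bounded = new RegExp("\\b(" + demoShortcuts.join("|") + ")\\b", "gi");

const withoutBoundaries = demoText.replace(unbounded, s => s.toUpperCase());
const withBoundaries = demoText.replace(bounded, s => s.toUpperCase());

console.log(withoutBoundaries); // AME, AMErica, enAMEl, nAME
console.log(withBoundaries);    // AME, America, enamel, name
```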

answered Oct 24 '22 at 11:10 by Jozef Legény
