
Regular expression to match all occurrences except the first

I need a regular expression to match all occurrences of a dot (.) except the first one.

For example, if the source is: aaa.bbb.ccc..ddd

the expression should match the dot after bbb and both dots after ccc, but not the dot after aaa. In other words, it should match all dots except the first one.

I need it for javascript regex.

asked Feb 15 '16 at 16:02 by gabrielgeo



2 Answers

With PCRE (PHP, R) you can use:

\G(?:\A[^.]*\.)?+[^.]*\K\.

details:

\G               # start of the string, or the position right after the previous match
(?:\A[^.]*\.)?+  # only at the start of the string: optionally consume everything up to and including the first dot (possessive, so it is never given back)
[^.]*            # any run of characters that are not a dot
\K               # drop everything matched so far from the reported match
\.               # the literal dot that ends up in the match

With .NET (easy, since it supports variable-length lookbehind):

(?<!^[^.]*)\.


With JavaScript there is no way to do it with a single pattern unless the engine supports lookbehind (added in ES2018). Workarounds:

using a placeholder:

var result = s.replace('.', 'PLACEHOLDER')
              .replace(/\./g, '|')
              .replace('PLACEHOLDER', '.');

(or replace all dots with | and then replace the first occurrence of | with a dot).
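
The parenthetical variant above can be sketched as follows. This is a minimal illustration, not a hardened implementation: the function name restoreFirstDot is hypothetical, and it assumes the input does not already contain a | character.

```javascript
// Hypothetical helper illustrating "replace all, then restore the first".
// Assumption: `s` contains no '|' of its own; otherwise the wrong
// character could be turned back into a dot.
function restoreFirstDot(s) {
  return s
    .replace(/\./g, '|') // every dot becomes '|'
    .replace('|', '.');  // a string pattern only replaces the first occurrence
}

console.log(restoreFirstDot('aaa.bbb.ccc..ddd')); // aaa.bbb|ccc||ddd
```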

using split:

var parts = s.split('.');
var result = parts.shift() + (parts.length ? '.': '') + parts.join('|');

with a counter:

var counter = 0;
var result = s.replace(/\./g, (_) => counter++ ? '|' : '.');
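
If the goal is to locate the dots rather than replace them, the same "skip the first one" idea can be sketched with matchAll. The function name dotIndexesAfterFirst is hypothetical, and the sketch assumes an engine that provides String.prototype.matchAll (ES2020).

```javascript
// Hypothetical helper: returns the index of every dot except the first.
// Assumes String.prototype.matchAll is available (ES2020).
function dotIndexesAfterFirst(s) {
  // Collect the index of every dot, then drop the first entry.
  return Array.from(s.matchAll(/\./g), (m) => m.index).slice(1);
}

console.log(dotIndexesAfterFirst('aaa.bbb.ccc..ddd')); // [ 7, 11, 12 ]
```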

With Node.js (or any other engine that supports lookbehind):

var result = s.replace(/((?:^[^.]*\.)?(?<=\.)[^.]*)\./g, "$1|");
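
Since ES2018 lookbehind in JavaScript may be variable-length, so the .NET pattern shown earlier should also work as a single pattern in such engines. A minimal sketch, assuming lookbehind support (recent Node.js or a current browser); the function name replaceDotsAfterFirst is hypothetical.

```javascript
// Hypothetical helper: replaces every dot except the first with `replacement`.
// Assumes the engine supports ES2018 lookbehind assertions.
function replaceDotsAfterFirst(s, replacement) {
  // (?<!^[^.]*) rejects a dot when everything before it, back to the start
  // of the string, is dot-free, i.e. when it is the first dot.
  // A function is used as the replacement so that '$' in `replacement`
  // is inserted literally.
  return s.replace(/(?<!^[^.]*)\./g, () => replacement);
}

console.log(replaceDotsAfterFirst('aaa.bbb.ccc..ddd', '|')); // aaa.bbb|ccc||ddd
```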
answered Oct 22 '22 at 06:10 by Casimir et Hippolyte


One-line solution for JavaScript using an arrow function (ES6):

'aaa.bbb.ccc..ddd'
   .replace(/\./g, (c, i, text) => text.indexOf(c) === i ? c : '|')

-> 'aaa.bbb|ccc||ddd'
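
The callback above calls indexOf once per dot. A variant that looks up the first dot only once and avoids regular expressions altogether might look like this; the function name keepFirstDot is hypothetical, and this is a sketch rather than a drop-in replacement.

```javascript
// Hypothetical helper: keeps the first dot, replaces all later ones.
function keepFirstDot(text, replacement) {
  const first = text.indexOf('.');
  if (first === -1) return text; // no dot at all: nothing to do

  const head = text.slice(0, first + 1); // up to and including the first dot
  const tail = text.slice(first + 1);    // everything after it
  // split/join replaces every remaining dot without any regex or
  // '$' substitution rules coming into play.
  return head + tail.split('.').join(replacement);
}

console.log(keepFirstDot('aaa.bbb.ccc..ddd', '|')); // aaa.bbb|ccc||ddd
```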
answered Oct 22 '22 at 05:10 by Maxime Lechevallier