
Split a string on a sequence, with regex

Tags: regex, bash

I need a regular expression that matches letters and digits, but doesn't match the sequence "00".

E.g. "hello00world00number001" should match: "hello", "world", "number" and "1".

I tried the following, without success:

(?:[\w](?<!00))+

Edit: "hello000world0000number000001" must be separated into: "hello0" "world" "number0" and "1"
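
For reference, here is a quick check of what the attempted pattern actually returns. It is sketched with Python's re module, which is an assumption: the question is tagged bash, where the same pattern would need a PCRE-capable tool. The lookbehind only rejects the second 0 of each pair, so the first 0 stays glued to the preceding word.

```python
import re

# The pattern from the question: a word character that must not
# complete the sequence "00" (checked by the lookbehind), repeated.
attempt = re.compile(r"(?:[\w](?<!00))+")

found = attempt.findall("hello00world00number001")
print(found)  # ['hello0', 'world0', 'number0', '1']
```

So the pattern never matches the second 0, but it does not treat "00" as a single separator either.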

acs asked Mar 25 '23 01:03



2 Answers

Input string: hello000world0000number00000100test00test20

Split

  1. Splitting on 00 alone produces empty strings whenever a run such as 0000 occurs:
    Output: hello/0world//number//01/test/test20

  2. To work around this, put the two zeroes in a group and repeat the group:
    RegEx: (00)+ (in an odd-length run of zeroes, the leftover 0 goes to the next piece)
    Output: hello/0world/number/01/test/test20

  3. Add a negative lookahead:
    RegEx: (00)+(?!0) (in an odd-length run of zeroes, the first 0 stays with the preceding piece)
    Output: hello0/world/number0/1/test/test20
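
The three split variants above can be reproduced with a short sketch. It assumes Python's re module, which the answer does not name. One adjustment is needed there: re.split also returns the text of capturing groups, so the group is written as non-capturing, (?:00)+, to keep the output clean.

```python
import re

s = "hello000world0000number00000100test00test20"

# 1. Plain "00" delimiter: a run like 0000 leaves empty strings behind.
plain = re.split(r"00", s)

# 2. One or more "00" pairs as a single delimiter; in an odd-length run
#    the leftover 0 ends up at the start of the next piece.
pairs = re.split(r"(?:00)+", s)

# 3. Same, but the delimiter must not be followed by another 0, so the
#    extra 0 of an odd-length run stays with the preceding piece.
lookahead = re.split(r"(?:00)+(?!0)", s)

for pieces in (plain, pairs, lookahead):
    print("/".join(pieces))
```

The three printed lines should correspond to the three Output lines listed above.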

Match

  1. Matching with 00 alone as the delimiter gives incorrect results, as in split 1
  2. /([a-z0-9]+?)(?:(?:00)+|$)/gi
  3. /([a-z0-9]+?)(?:(?:00)+(?!0)|$)/gi
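
The match-based patterns can be tried the same way. This is a minimal sketch assuming Python, where the /gi flags translate to findall (global) plus re.IGNORECASE. The names MATCH_PAIRS and MATCH_LOOKAHEAD are illustrative only. With a single capturing group, findall returns just the captured pieces.

```python
import re

# Pattern 2: lazily capture letters/digits up to a run of "00" pairs or end of input.
MATCH_PAIRS = re.compile(r"([a-z0-9]+?)(?:(?:00)+|$)", re.IGNORECASE)

# Pattern 3: the delimiter must additionally not be followed by another 0.
MATCH_LOOKAHEAD = re.compile(r"([a-z0-9]+?)(?:(?:00)+(?!0)|$)", re.IGNORECASE)

simple = MATCH_PAIRS.findall("hello00world00number001")
edited = MATCH_LOOKAHEAD.findall("hello000world0000number000001")

print(simple)  # ['hello', 'world', 'number', '1']
print(edited)  # ['hello0', 'world', 'number0', '1']
```

Pattern 3 gives the result requested in the question's edit, because the lazy capture is forced to absorb the extra 0 of an odd-length run before the delimiter can match.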
CSᵠ answered Apr 05 '23 13:04



str = "hello00world00number001"
str.split("00")

Why would this not work?
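
A quick check, assuming the snippet above is Python (it would read much the same in Ruby or JavaScript). The variable names here are illustrative. A plain split handles the original example but not the input from the question's edit, where runs of zeroes have odd length:

```python
# Original example: a plain split on "00" is enough.
text = "hello00world00number001"
simple = text.split("00")
print(simple)  # ['hello', 'world', 'number', '1']

# Input from the question's edit: odd-length runs of zeroes leave
# stray 0s and empty strings in the result.
edited_text = "hello000world0000number000001"
edited = edited_text.split("00")
print(edited)  # ['hello', '0world', '', 'number', '', '01']
```

The edit asks for "hello0", "world", "number0" and "1", which a fixed-string split cannot produce.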

aaronman answered Apr 05 '23 13:04
