
Regex of three optional parts with separators

I am trying to parse the components of a string that can appear in any of these combinations (the second and third components are optional):

12345,abcd@ABCD   -> $1=12345 $2=abcd $3=ABCD
12345,abcd        -> $1=12345 $2=abcd $3=empty
12345@ABCD        -> $1=12345 $2=empty $3=ABCD
12345             -> $1=12345 $2=empty $3=empty

Is it possible with a single regexp? I have made several attempts. When the string is complete there is no problem, but the forms with missing components elude me:

(.+),(.+)@(.+)           // works when the string is complete
                         //  but how do you express the optionality?
(.+),?(.+)@?(.+)         // nope
(.*)[$,](.*)[$@](.*)     // neither

(Another option would be to split the string into its components, which looks quite trivial, but I am curious about the regexp way.)
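For comparison, the splitting alternative could look roughly like the sketch below. It assumes JavaScript (the question does not name a language) and that the first `@` and the first `,` are the separators; the function name `splitParts` is hypothetical.

```javascript
// Hypothetical helper: splits "first,second@third" without a regex.
// Missing components come back as empty strings.
function splitParts(input) {
  // Everything after the first "@" is the third component.
  const at = input.indexOf("@");
  const head = at === -1 ? input : input.slice(0, at);
  const third = at === -1 ? "" : input.slice(at + 1);

  // Within the head, everything after the first "," is the second component.
  const comma = head.indexOf(",");
  const first = comma === -1 ? head : head.slice(0, comma);
  const second = comma === -1 ? "" : head.slice(comma + 1);

  return [first, second, third];
}

for (const s of ["12345,abcd@ABCD", "12345,abcd", "12345@ABCD", "12345"]) {
  console.log(s, "->", splitParts(s));
}
```

Unlike the regex attempts, this sketch does no validation: it does not check that the first component is numeric or non-empty.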

tru7 asked Mar 02 '23 22:03


2 Answers

12345,abcd@ABCD   -> $1=12345 $2=abcd $3=ABCD
12345,abcd        -> $1=12345 $2=abcd $3=empty
12345@ABCD        -> $1=12345 $2=empty $3=ABCD
12345             -> $1=12345 $2=empty $3=empty

From your expected output it appears that you want empty groups in your matches while matching your inputs. You may use this regex:

/^(\d+),?([^@\n]*)@?(.*)$/g


Note that this regex always produces 3 capture groups in every match result; a missing component is captured as an empty string.

RegEx Details:

  • ^: Start
  • (\d+): Match 1+ digits and capture in group #1
  • ,?: Match an optional comma
  • ([^@\n]*): Match 0+ characters that are neither @ nor a newline and capture in group #2
  • @?: Match an optional @
  • (.*): Match 0+ of any character except line breaks and capture in group #3
  • $: End
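
As a quick check of the breakdown above, here is a minimal sketch in JavaScript. The language is an assumption (the question does not name one, though the `/…/g` literal suggests it), and the helper name `parseAlwaysThree` is hypothetical. The `g` flag is dropped because each call tests a single string.

```javascript
// Same pattern as in the answer, minus the g flag (one string per call).
const ALWAYS_THREE = /^(\d+),?([^@\n]*)@?(.*)$/;

// Hypothetical helper: returns [part1, part2, part3], or null if no match.
// All three groups always participate, so missing parts are "" here.
function parseAlwaysThree(input) {
  const m = input.match(ALWAYS_THREE);
  return m ? [m[1], m[2], m[3]] : null;
}

for (const s of ["12345,abcd@ABCD", "12345,abcd", "12345@ABCD", "12345"]) {
  console.log(s, "->", parseAlwaysThree(s));
}
```

One thing to be aware of: because each separator is optional independently of the group that follows it, this pattern also accepts input such as `12345abcd`, capturing `abcd` in group 2 even though no comma is present.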
anubhava answered Mar 05 '23 15:03


You can use

^([^,@]+)(?:,([^@]+))?(?:@(.+))?$

See the regex demo. Note that the demo pattern has newlines added, since the test there is performed against a single multiline string. In the real world, the strings to test won't contain newlines, so they are not in the pattern here.

Details

  • ^ - start of string
  • ([^,@]+) - Group 1: one or more chars other than a comma and @
  • (?:,([^@]+))? - an optional non-capturing group matching 1 or 0 occurrences of a comma and then (capturing into Group 2) one or more chars other than @
  • (?:@(.+))? - an optional non-capturing group matching 1 or 0 occurrences of an @ char and then (capturing into Group 3) one or more chars other than line break chars, as many as possible
  • $ - end of string.
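
The details above can be tried out with a short sketch, assuming JavaScript (the question does not name a language). The helper name `parseOptionalParts` is hypothetical. In JavaScript, a capture group inside an optional group that did not participate in the match is `undefined`, so the sketch defaults those to empty strings to match the output the question asks for.

```javascript
// Pattern from the answer: each separator lives inside an optional group.
const OPTIONAL_PARTS = /^([^,@]+)(?:,([^@]+))?(?:@(.+))?$/;

// Hypothetical helper: returns [part1, part2, part3], or null if no match.
// Groups that did not participate are undefined, so default them to "".
function parseOptionalParts(input) {
  const m = OPTIONAL_PARTS.exec(input);
  if (m === null) return null;
  const [, first, second = "", third = ""] = m;
  return [first, second, third];
}

for (const s of ["12345,abcd@ABCD", "12345,abcd", "12345@ABCD", "12345"]) {
  console.log(s, "->", parseOptionalParts(s));
}
```

Since a separator only matches together with at least one following character, a dangling separator such as `12345,` does not match this pattern at all.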
Wiktor Stribiżew answered Mar 05 '23 17:03