Complex string splitting

I have a string like the following:

[Testing.User]|Info:([Testing.Info]|Name:([System.String]|Matt)|Age:([System.Int32]|21))|Description:([System.String]|This is some description)

You can look at it as this tree:

- [Testing.User]
- Info
        - [Testing.Info]
        - Name
                - [System.String]
                - Matt
        - Age
                - [System.Int32]
                - 21
- Description
        - [System.String]
        - This is some description

As you can see, it's a string serialization of an instance of the class Testing.User.

I want to be able to do a split and get the following elements in the resulting array:

 [0] = [Testing.User]
 [1] = Info:([Testing.Info]|Name:([System.String]|Matt)|Age:([System.Int32]|21))
 [2] = Description:([System.String]|This is some description)

I can't split by | because that would result in:

 [0] = [Testing.User]
 [1] = Info:([Testing.Info]
 [2] = Name:([System.String]
 [3] = Matt)
 [4] = Age:([System.Int32]
 [5] = 21))
 [6] = Description:([System.String]
 [7] = This is some description)

How can I get my expected result?

I'm not very good with regular expressions, but I'm aware they are a likely solution for this case.

Matias Cicero asked Jun 04 '15 00:06


1 Answer

Using regex lookahead

You can use a regex like this:

(\[.*?])|(\w+:.*?)\|(?=Description:)|(Description:.*)


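The idea behind this regex is to capture each piece you want in group 1, 2 or 3; every match fills exactly one of those groups. Note that it relies on the literal key Description: to find where the second element ends, so it is tied to this particular string layout.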

You can see it easily with this diagram:

[Image: regular expression visualization]

Match information

MATCH 1
1.  [0-14]   `[Testing.User]`
MATCH 2
2.  [15-88]  `Info:([Testing.Info]|Name:([System.String]|Matt)|Age:([System.Int32]|21))`
MATCH 3
3.  [89-143] `Description:([System.String]|This is some description)`
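
To make the match information above reproducible, here is a minimal sketch in Python. Python is an assumption made for illustration (the type names in the question suggest .NET), and the variable names `text`, `pattern` and `parts` are hypothetical. It iterates over the matches and keeps whichever of the three groups took part in each one.

```python
import re

# The sample string from the question.
text = (
    "[Testing.User]"
    "|Info:([Testing.Info]|Name:([System.String]|Matt)|Age:([System.Int32]|21))"
    "|Description:([System.String]|This is some description)"
)

# The lookahead regex from the answer; each match fills exactly one group.
pattern = re.compile(r"(\[.*?])|(\w+:.*?)\|(?=Description:)|(Description:.*)")

parts = []
for match in pattern.finditer(text):
    # Keep the one group (1, 2 or 3) that participated in this match.
    parts.append(next(g for g in match.groups() if g is not None))

for index, part in enumerate(parts):
    print(f"[{index}] = {part}")
```

Because the second alternative consumes the `|` before `Description:` outside its capturing group, the separator does not end up in any of the collected pieces.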

Regular regex

On the other hand, if you don't like the regex above, you can use a simpler one like this:

(\[.*?])\|(.*)\|(Description:.*)
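
As a rough illustration of how this single-match variant could be used, the Python sketch below (Python and the names `text`, `pattern`, `match` and `parts` are assumptions, not part of the original answer) matches the whole string once and reads the three pieces straight from the capture groups. The greedy `(.*)` backtracks to the last `|` that is followed by `Description:`.

```python
import re

# The sample string from the question.
text = (
    "[Testing.User]"
    "|Info:([Testing.Info]|Name:([System.String]|Matt)|Age:([System.Int32]|21))"
    "|Description:([System.String]|This is some description)"
)

# One match for the whole string; groups 1-3 hold the three elements.
pattern = re.compile(r"(\[.*?])\|(.*)\|(Description:.*)")

match = pattern.fullmatch(text)
# Fall back to an empty list if the string does not have the expected shape.
parts = list(match.groups()) if match else []

for index, part in enumerate(parts):
    print(f"[{index}] = {part}")
```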



Or, to require at least one character in each part:

(\[.+?])\|(.+)\|(Description:.+)
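
All of the regexes above hard-code the Description: key. If the keys are not known in advance, one alternative, which is not part of the original answer, is to skip regex and split only on the `|` characters that sit outside any parentheses. The Python sketch below assumes that parentheses are always balanced and never appear inside the values themselves; `split_top_level` is a hypothetical helper name.

```python
def split_top_level(text, sep="|", open_ch="(", close_ch=")"):
    """Split text on sep, ignoring separators nested inside parentheses."""
    parts = []
    depth = 0   # current parenthesis nesting level
    start = 0   # index where the current piece begins
    for i, ch in enumerate(text):
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
        elif ch == sep and depth == 0:
            # A separator at nesting level 0 ends the current piece.
            parts.append(text[start:i])
            start = i + 1
    parts.append(text[start:])  # the final piece has no trailing separator
    return parts


text = (
    "[Testing.User]"
    "|Info:([Testing.Info]|Name:([System.String]|Matt)|Age:([System.Int32]|21))"
    "|Description:([System.String]|This is some description)"
)

parts = split_top_level(text)
for index, part in enumerate(parts):
    print(f"[{index}] = {part}")
```

The same helper can be applied again to the text inside a value's outer parentheses to walk the nested levels, for example `split_top_level("[Testing.Info]|Name:([System.String]|Matt)|Age:([System.Int32]|21)")`.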


Federico Piazza answered Sep 30 '22 05:09
