 

Split a text file line that has multiple delimiters

Tags:

c#

regex

I am trying to read a text file whose fields are delimited by spaces, but some fields are also wrapped in double quotes and may themselves contain spaces. I have not found an easy way to handle this scenario. Can it be done with a regular expression, or do I need to write a custom split routine?

Here is the string

"myfile-one two" "1" 3 1453454.00 -134557.63 585.0 24444.8 -999 "NULL" "" 45.60 "" 67°32'5.23455"N 54°56'65.3454"W "NULL" 6.00

The output should be

myfile-one two
1
3
1453454.00
-134557.63
585.0
24444.8
-999
NULL
45.60

67°32'5.23455"N
54°56'65.3454"W
NULL
6.00

The code below splits on the space delimiter first, but it also splits inside the double quotes, so a quoted value that contains a space ends up as separate entries:

char[] space = new Char[] { ' ' };

string[] data = comp.Split(space, StringSplitOptions.RemoveEmptyEntries);
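To make the problem concrete, here is a minimal sketch of what the space split produces on a shortened version of the sample line. The class name NaiveSplitDemo and the helper SplitOnSpaces are hypothetical names used only for illustration.

```csharp
using System;

public static class NaiveSplitDemo
{
    // Hypothetical helper: the same split as above, wrapped in a method.
    public static string[] SplitOnSpaces(string line)
    {
        return line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        // Shortened sample: one quoted field containing a space, then two more fields.
        string comp = "\"myfile-one two\" \"1\" 3";

        // The quoted field is broken in two and the quotes are kept.
        foreach (string part in SplitOnSpaces(comp))
        {
            Console.WriteLine(part);
        }
    }
}
```

The first field comes back as two entries, "myfile-one and two", each still carrying one of the quotes, which is not the desired result.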
Jey asked May 16 '26 21:05



1 Answer

You can match either a double-quoted substring that has whitespace (or the start or end of the string) on both sides, capturing what is inside the quotes into a named group, or any run of one or more non-whitespace characters, captured into an identically named group:

// Requires: using System.Linq; using System.Text.RegularExpressions;
var results = Regex.Matches(str, @"(?<!\S)""(?<o>.*?)""(?!\S)|(?<o>\S+)")
                .Cast<Match>()
                .Select(m => m.Groups["o"].Value)
                .ToList();

See the regex demo.

Pattern details

  • (?<!\S) - whitespace or the start of the string is required immediately to the left of the current position
  • " - a double quotation mark (written as "" inside the C# verbatim string literal)
  • (?<o>.*?) - Group "o": zero or more characters other than a newline, as few as possible
  • " - a double quotation mark
  • (?!\S) - whitespace or the end of the string is required immediately to the right of the current position
  • | - or
  • (?<o>\S+) - Group "o": one or more non-whitespace characters.

.NET allows identically named groups within a single regex pattern. Whichever alternative matches stores its capture in the same group, so you can read the value uniformly via .Select(m => m.Groups["o"].Value).
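As a rough end-to-end illustration, the pattern can be wrapped in a small helper and run against part of the sample line from the question. The class name QuotedSplitter and the method Tokenize are hypothetical names chosen for this sketch; only the pattern itself comes from the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedSplitter
{
    // The pattern from the answer: a whitespace-delimited quoted field,
    // or any run of non-whitespace characters, both captured into group "o".
    private static readonly Regex TokenPattern =
        new Regex(@"(?<!\S)""(?<o>.*?)""(?!\S)|(?<o>\S+)");

    // Hypothetical helper: returns the fields of one line, with the outer quotes removed.
    public static List<string> Tokenize(string line)
    {
        return TokenPattern.Matches(line)
            .Cast<Match>()
            .Select(m => m.Groups["o"].Value)
            .ToList();
    }

    public static void Main()
    {
        // Shortened version of the sample line from the question.
        string line = "\"myfile-one two\" \"1\" 3 -999 \"NULL\" \"\" 45.60 67°32'5.23455\"N";

        // Brackets make the empty field visible in the console output.
        foreach (string token in Tokenize(line))
        {
            Console.WriteLine("[" + token + "]");
        }
    }
}
```

With this input, the quoted field keeps its inner space, the empty "" field is returned as an empty string, and 67°32'5.23455"N stays in one piece because its quote is not preceded by whitespace.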

Wiktor Stribiżew answered May 18 '26 11:05
