
Parse string with whitespace and quotation mark (with quotation mark retained)

Tags: string, c#, regex

If I have a string like this

create myclass "56, 'for the better or worse', 54.781"

How can I parse it such that the result would be three string "words" which have the following content:

[0] create
[1] myclass
[2] "56, 'for the better or worse', 54.781"

Edit 2: note that the quotation marks are to be retained

At first, I tried string.Split(' '), but I noticed that it breaks the third string into several smaller strings.

To solve this, I tried limiting the Split result by passing 3 as its count argument. That works for this case, but not when the given string is

create myclass false "56, 'for the better or worse', 54.781" //or
create myclass "56, 'for the better or worse', 54.781" false

Then the Split fails because the last two words will be combined.
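The two failures described above can be reproduced with a small sketch. The class and method names here (SplitDemo, NaiveSplit, LimitedSplit) are hypothetical and only wrap the two string.Split calls mentioned in the question.

```csharp
using System;

public static class SplitDemo
{
    // Splits on every space, so the quoted argument is broken into pieces.
    public static string[] NaiveSplit(string input) => input.Split(' ');

    // Returns at most `count` parts; everything after the first (count - 1)
    // separators stays joined in the last part, including any trailing word.
    public static string[] LimitedSplit(string input, int count) =>
        input.Split(new[] { ' ' }, count);
}
```

With a count of 3, the last element for the second variant would be the whole quoted text followed by " false", which is the combined result the question describes.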

I also created a method, ReadInBetweenSameDepth, to get the string in between the quotation marks.

Here is my ReadInBetweenSameDepth method

// Examples:
// [1] (2 + 1) * (5 + 6) will return 2 + 1
// [2] (2 * (5 + 6) + 1) will return 2 * (5 + 6) + 1
public static string ReadInBetweenSameDepth(string str, char delimiterStart, char delimiterEnd) {
  if (delimiterStart == delimiterEnd || string.IsNullOrWhiteSpace(str) || str.Length <= 2)
    return null;
  int delimiterStartFound = 0;
  int delimiterEndFound = 0;
  int posStart = -1;
  for (int i = 0; i < str.Length; ++i) {
    if (str[i] == delimiterStart) {
      if (i >= str.Length - 2) //delimiter start is found in any of the last two characters
        return null; //it means, there isn't anything in between the two
      if (delimiterStartFound == 0) //first time
        posStart = i + 1; //assign the starting position only the first time...
      delimiterStartFound++; //increase the number of delimiter start count to get the same depth
    }
    if (str[i] == delimiterEnd) {
      delimiterEndFound++;
      if (delimiterStartFound == delimiterEndFound && i - posStart > 0)
        return str.Substring(posStart, i - posStart); //only successful if both delimiters are found in the same depth
    }
  }
  return null;
}

Although this function works, I found it hard to combine its result with string.Split to get the parsing I want.

Edit 2: In my poor solution, I need to re-add the quotation marks later on

Is there any better way to do this? If we use Regex, how do we do this?

Edit:

I honestly was unaware that this problem could be solved the same way as CSV-formatted text. Nor did I know that it does not necessarily have to be solved with Regex (which is why I tagged it as such). My sincere apologies to those who see this as a duplicate post.

Edit 2:

After working more on my project, I realized that there was something wrong with my question (that is, I did not say that the quotation marks must be retained). My apologies to the author of the previously best answer, Mr. Tim Schmelter. After looking at the duplicate link, I noticed that it does not answer this either.

Ian asked Jan 05 '16 08:01


1 Answer

You can split on this pattern:

\s(?=(?:[^"]*"[^"]*")*[^"]*$)

See demo.

https://regex101.com/r/fM9lY3/60

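// Requires: using System.Text.RegularExpressions;
// Matches a whitespace character only when an even number of double quotes
// follows it, i.e. when the whitespace is outside a quoted section.
string strRegex = @"\s(?=(?:[^""]*""[^""]*"")*[^""]*$)";
Regex myRegex = new Regex(strRegex, RegexOptions.Multiline);
string strTargetString = @"create myclass ""56, 'for the better or worse', 54.781""";

// The quoted part is returned as one element, quotation marks included.
return myRegex.Split(strTargetString);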
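The same pattern can be wrapped in a helper and tried on the other two inputs from the question. This is a minimal sketch, not part of the original answer: the names QuoteAwareSplitter and Split are hypothetical, the input is assumed to be a single line with balanced double quotes, and RegexOptions.Multiline is left out because it makes no difference under that assumption.

```csharp
using System.Text.RegularExpressions;

public static class QuoteAwareSplitter
{
    // Whitespace that is followed, up to the end of the input, by an even
    // number of double quotes. Whitespace inside a quoted section is followed
    // by an odd number of quotes, so the lookahead rejects it.
    private static readonly Regex Separator =
        new Regex(@"\s(?=(?:[^""]*""[^""]*"")*[^""]*$)");

    // Splits on each separator; the double quotes stay in the resulting element.
    public static string[] Split(string input) => Separator.Split(input);
}
```

Because every single whitespace character outside quotes is a separator, two consecutive spaces would produce an empty element. Single quotes are ignored by the pattern, which is why the apostrophes inside the quoted text do not affect the result.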
vks answered Sep 28 '22 06:09
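As the question's first edit notes, this problem does not have to be solved with Regex. A minimal non-regex sketch is a single pass over the characters that tracks whether the current position is inside double quotes. The names QuoteAwareTokenizer and Tokenize are hypothetical, and the sketch assumes there are no escaped quotes inside a quoted section; an unclosed quote simply makes the rest of the input one token.

```csharp
using System.Collections.Generic;
using System.Text;

public static class QuoteAwareTokenizer
{
    // Splits on whitespace outside double quotes and keeps the quotes in the token.
    public static List<string> Tokenize(string input)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        foreach (char c in input)
        {
            if (c == '"')
            {
                inQuotes = !inQuotes;      // toggle state; the quote itself is retained
                current.Append(c);
            }
            else if (char.IsWhiteSpace(c) && !inQuotes)
            {
                if (current.Length > 0)    // runs of whitespace produce no empty tokens
                {
                    tokens.Add(current.ToString());
                    current.Clear();
                }
            }
            else
            {
                current.Append(c);
            }
        }

        if (current.Length > 0)
            tokens.Add(current.ToString());

        return tokens;
    }
}
```

Unlike the regex split, this version skips repeated whitespace between words, and it scans the input once instead of running a lookahead to the end of the string for every whitespace character.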