
Remove white spaces unless within quotes, ignoring escaped quotes

Tags:

c#

regex

I have a JSON string from which I would like to remove all whitespace that is not within quotes. I searched online and found the following solution:

aidstring = Regex.Replace(aidstring, "\\s+(?=([^\"]*\"[^\"]*\")*[^\"]*$)", "");

However, I am now dealing with a string that contains escaped quotes:

"boolean": "k near/3 \"funds private\""

and the above regular expression solution turns it into:

"boolean":"k near/3 \"fundsprivate\""

This happens because escaped quotes are treated as normal quotes.

Could anyone post a regex in which escaped quotes are ignored?

Giovanni Borghi asked Sep 17 '25 11:09


1 Answer

I'd suggest using

aidstring = Regex.Replace(aidstring, @"(""[^""\\]*(?:\\.[^""\\]*)*"")|\s+", "$1");

See regex demo

The regex matches every C-style quoted string (including any backslash-escaped characters inside it) into capture group 1, and the $1 replacement writes those strings back unchanged. When \s+ matches instead, group 1 is empty, so the whitespace is replaced with nothing.
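
To see the difference side by side, here is a minimal sketch that runs both patterns on the string from the question. Wrapping the sample in braces and the variable names lookahead and kept are my own assumptions for illustration; only the two patterns come from this thread.

```csharp
using System;
using System.Text.RegularExpressions;

class Program
{
    static void Main()
    {
        // Sample input (assumed): the question's line wrapped in braces.
        // Actual text: { "boolean": "k near/3 \"funds private\"" }
        string aidstring = @"{ ""boolean"": ""k near/3 \""funds private\"""" }";

        // Pattern from the question: the lookahead counts every quote,
        // escaped or not, so the space inside \"funds private\" is removed.
        string lookahead = Regex.Replace(
            aidstring, "\\s+(?=([^\"]*\"[^\"]*\")*[^\"]*$)", "");

        // Pattern from this answer: quoted strings are captured whole and
        // restored via $1; whitespace matched by \s+ is dropped.
        string kept = Regex.Replace(
            aidstring, @"(""[^""\\]*(?:\\.[^""\\]*)*"")|\s+", "$1");

        Console.WriteLine(lookahead); // {"boolean":"k near/3 \"fundsprivate\""}
        Console.WriteLine(kept);      // {"boolean":"k near/3 \"funds private\""}
    }
}
```

Because the quoted-string alternative comes first and consumes the whole string in one match, \s+ never gets a chance to start matching inside quotes.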

Regex explanation:

Alternative 1:

  • ("[^"\\]*(?:\\.[^"\\]*)*"):
    • " - a literal "
    • [^"\\]* - zero or more characters other than \ or "
    • (?:\\.[^"\\]*)* - zero or more sequences of...
      • \\. - \ and any character but a newline
      • [^"\\]* - zero or more characters other than \ or "
    • " - a literal "

Alternative 2:

  • \s+ - 1 or more whitespace (in .NET, any Unicode whitespace)
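
If the one-line pattern is hard to maintain, the same breakdown can be written into the pattern itself with RegexOptions.IgnorePatternWhitespace. This is only a sketch: the class name JsonWhitespace and the helper StripOutsideQuotes are hypothetical names I made up, and the pattern is the same one as above, just spread over several lines.

```csharp
using System;
using System.Text.RegularExpressions;

static class JsonWhitespace
{
    // Same pattern as in the answer, laid out with inline comments.
    // With IgnorePatternWhitespace, unescaped whitespace in the pattern is
    // ignored and # starts a comment that runs to the end of the line.
    private static readonly Regex Pattern = new Regex(@"
        (                 # group 1: one complete quoted string
          ""              #   opening quote
          [^""\\]*        #   characters other than a quote or backslash
          (?:             #   zero or more times:
            \\.           #     a backslash plus the character it escapes
            [^""\\]*      #     then more ordinary characters
          )*
          ""              #   closing quote
        )
        |
        \s+               # whitespace outside any quoted string
        ", RegexOptions.IgnorePatternWhitespace);

    // Hypothetical helper: keeps quoted strings, removes other whitespace.
    public static string StripOutsideQuotes(string input)
    {
        return Pattern.Replace(input, "$1");
    }

    static void Main()
    {
        Console.WriteLine(StripOutsideQuotes("[ 1, \"a b\" ]")); // [1,"a b"]
    }
}
```

Note that this assumes well-formed input: a string with an unterminated quote will not match alternative 1, so whitespace after the stray quote is still removed.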

Wiktor Stribiżew answered Sep 20 '25 04:09