I'm writing a program that has to remove separator characters from quoted strings in text files.
For example:
"Hello, my name is world"
Has to be:
"Hello my name is world"
This sounds quite easy at first (I thought it would be), but you need to detect when the quote starts, when the quote ends, then search that specific string for separator characters. How?
I've experimented with some regexes, but I just keep getting myself confused!
Any ideas? Even just something to get the ball rolling would help; I'm completely stumped.
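// Requires: using System.Text.RegularExpressions;
string pattern = "\"([^\"]+)\"";
// Note: Regex.Match returns only the first quoted string, and Value includes the surrounding quotes.
string value = Regex.Match(textToSearch, pattern).Value;
string[] removalCharacters = { ",", ";" }; // or any other characters
foreach (string character in removalCharacters)
{
    value = value.Replace(character, "");
}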
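Why not try doing it with LINQ?
// Requires: using System.Linq;
var x = @" this is a great whatever ""Hello, my name is world"" and all that";
// Splitting on '"' puts the quoted text at the odd indexes (assuming balanced quotes).
var result = string.Join(@"""", x.Split('"')
    .Select((val, index) => index % 2 == 1 ? val.Replace(",", "") : val)
    .ToArray());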
Using a regex pattern with a look-ahead, the pattern would be: "\"(?=[^\"]+,)[^\"]+\""
The \" matches the opening double-quote. The look-ahead (?=[^\"]+,) checks that a comma occurs within the quoted text, without consuming any characters. Next, [^\"]+ matches the rest of the string as long as it's not a double-quote, then \" matches the closing double-quote.
Using Regex.Replace allows for a compact approach to altering the result and removing the unwanted commas.
string input = "\"Hello, my name, is world\"";
string pattern = "\"(?=[^\"]+,)[^\"]+\"";
string result = Regex.Replace(input, pattern, m => m.Value.Replace(",", ""));
Console.WriteLine(result);
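If the regex still feels confusing, the "detect when the quote starts and ends" idea from the question can also be written as a plain character scan. This is only a minimal sketch of a different technique, not the approach from the answers above: the class name QuotedSeparators and the method Strip are made up for illustration, and it assumes quotes are never escaped or nested. Unlike the regex, an unbalanced opening quote makes it treat the rest of the text as quoted.

```csharp
using System;
using System.Text;

public static class QuotedSeparators
{
    // Hypothetical helper: removes every character listed in `separators`
    // that appears between a pair of double quotes.
    // Assumption: quotes are not escaped or nested.
    public static string Strip(string text, string separators)
    {
        var output = new StringBuilder(text.Length);
        bool insideQuotes = false;

        foreach (char c in text)
        {
            if (c == '"')
            {
                // Each quote toggles the state: opening, then closing.
                insideQuotes = !insideQuotes;
                output.Append(c);
            }
            else if (insideQuotes && separators.IndexOf(c) >= 0)
            {
                // Separator inside a quoted string: drop it.
            }
            else
            {
                output.Append(c);
            }
        }

        return output.ToString();
    }

    public static void Main()
    {
        string input = "a, b \"Hello, my name; is world\" c; d";
        // Separators outside the quotes are left untouched.
        Console.WriteLine(Strip(input, ",;"));
    }
}
```

Passing the separators as a string keeps the call short for a handful of characters; for a large set, a HashSet<char> would be the more natural choice.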