Better way to add spaces between double semicolons

Tags: string, c#, replace

My task is simple: I have a CSV file inside a C# string, delimited with semicolons. I need to put a space into each empty cell, so A;B;;;;C; should become A;B; ; ; ;C;. Right now, I'm calling the Replace method twice:

csv = csv.Replace(";;", "; ;").Replace(";;", "; ;");

That's necessary because the first pass replaces every occurrence of ;; with a space in between, but Replace never looks back: it continues scanning after the text it has just replaced, so the second semicolon of a replaced pair is never checked again. With a single pass I would therefore end up with A;B; ;; ;C;, which is not what I want.
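The difference between one pass and two can be reproduced with a minimal sketch. The class and method names below are illustrative only and are not part of the original code:

```csharp
using System;

public static class ReplaceDemo
{
    // Hypothetical helper: a single Replace pass.
    public static string OnePass(string csv) => csv.Replace(";;", "; ;");

    // Hypothetical helper: the two-pass approach from the question.
    public static string TwoPasses(string csv) => OnePass(OnePass(csv));

    public static void Main()
    {
        const string csv = "A;B;;;;C;";
        Console.WriteLine(OnePass(csv));   // A;B; ;; ;C;
        Console.WriteLine(TwoPasses(csv)); // A;B; ; ; ;C;
    }
}
```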

Is there a more elegant, clearer, and less redundant way to solve that task?

André Reichelt asked Feb 12 '20 08:02


2 Answers

You can Split the string into parts, replace the empty entries with a space using Select (this requires using System.Linq;), and then Join the entries back together:

var str = "A;B;;;;C";
var parts = str.Split(';').Select(p => string.IsNullOrEmpty(p) ? " " : p);

var result = string.Join(";", parts);

The output will be A;B; ; ; ;C
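Note that the sample input in the question ends with a semicolon, while the string used here does not. Since Split returns an empty entry after a trailing separator, that entry is turned into a space as well. The following is a minimal sketch of that edge case; the class and method names are made up for illustration:

```csharp
using System;
using System.Linq;

public static class SplitJoinDemo
{
    // Same Split/Select/Join approach, wrapped in a hypothetical helper.
    public static string PadEmptyCells(string csv) =>
        string.Join(";", csv.Split(';').Select(p => string.IsNullOrEmpty(p) ? " " : p));

    public static void Main()
    {
        // Brackets make a trailing space visible in the console.
        Console.WriteLine($"[{PadEmptyCells("A;B;;;;C")}]");  // [A;B; ; ; ;C]
        Console.WriteLine($"[{PadEmptyCells("A;B;;;;C;")}]"); // [A;B; ; ; ;C; ]
    }
}
```

Whether the extra trailing space matters depends on whether a trailing semicolon is meant to denote an empty last cell.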

Benchmark results compared with the OP's code and the Regex solution:

[Image: benchmark results]

Which one is clearer and more elegant is up to you to decide. The benchmark code is below for reference:

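using System.Linq;
using System.Text.RegularExpressions;
using BenchmarkDotNet.Attributes;

[SimpleJob]
public class Benchmark
{
    // Sample input shared by all three benchmarks.
    string input = "A;B;;;;C";

    // Split on ';', replace empty entries with a space, join back.
    [Benchmark]
    public string SplitJoinTest()
    {
        var parts = input.Split(';').Select(p => string.IsNullOrEmpty(p) ? " " : p);
        return string.Join(";", parts);
    }

    // The OP's approach: two consecutive Replace passes.
    [Benchmark]
    public string DoubleReplaceTest()
    {
        return input.Replace(";;", "; ;").Replace(";;", "; ;");
    }

    // Regex with a lookahead: a semicolon followed by another semicolon.
    [Benchmark]
    public string RegexTest()
    {
        return Regex.Replace(input, ";(?=;)", "; ");
    }
}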

Pavel Anikhouski answered Nov 15 '22 05:11


One way is to use regular expressions.

using System.Text.RegularExpressions;

var result = Regex.Replace("A;B;;;;C;", ";(?=;)", "; ");

We replace every semicolon that is followed by another semicolon with the string "; ".

It's definitely less redundant, and it's clear if you know how to read regex :) Whether it is more elegant is up to you to decide.
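The lookahead (?=;) is what makes a single pass sufficient: it only checks that the next character is a semicolon without consuming it, so that semicolon is still available as the start of the next match. A minimal sketch, with an illustrative helper name that is not part of the answer above:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexDemo
{
    // Hypothetical helper: insert a space after every ';' that is
    // immediately followed by another ';'.
    public static string PadEmptyCells(string csv) =>
        Regex.Replace(csv, ";(?=;)", "; ");

    public static void Main()
    {
        Console.WriteLine(PadEmptyCells("A;B;;;;C;")); // A;B; ; ; ;C;
        Console.WriteLine(PadEmptyCells(";;"));        // ; ;
    }
}
```

Unlike the Split/Join approach, this leaves a leading or trailing semicolon untouched, because such a semicolon is not followed by another one.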

Sweeper answered Nov 15 '22 07:11