How can I Split(',') a string while ignore commas in between quotes?

Tags: string, c#, split

I am calling .Split(',') on a string whose values I know are delimited by commas, and I want those values separated into a string[]. This works great for strings like this:

78,969.82,GW440,.

But the values start to look different when that second value goes over 1000, like the one found in this example:

79,"1,013.42",GW450,....

These values come from a spreadsheet control; I use the control's built-in ExportToCsv(...) method, which explains why I get a formatted version of the actual numerical value.

Question

Is there a way I can get the .Split(',') method to ignore commas inside of quotes? I don't actually want the value "1,013.42" to be split up as "1 and 013.42".
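A minimal sketch of the behavior described above, using the sample row from the question. The class and method names are illustrative only and are not taken from the original code.

```csharp
using System;

public static class NaiveSplitDemo
{
    // Hypothetical helper: string.Split treats every comma as a delimiter,
    // whether or not it sits inside a quoted field.
    public static string[] SplitOnEveryComma(string line)
    {
        return line.Split(',');
    }

    public static void Main()
    {
        // Sample row from the question; the quoted field contains a comma.
        string line = "79,\"1,013.42\",GW450";
        string[] parts = SplitOnEveryComma(line);

        Console.WriteLine(parts.Length); // 4, not the 3 values wanted
        foreach (string part in parts)
            Console.WriteLine(part);
    }
}
```

The quoted value is cut in two at its internal comma, and each half keeps one of the quote characters.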

Any ideas? Thanks!

Update

I would really like to do this without incorporating a third-party tool. My use case doesn't involve many cases besides this one, and even though this is part of my work's solution, incorporating a tool like that doesn't benefit anyone at the moment. I was hoping I was missing something quick that solves this particular case. Now that it is the weekend, I'll try to update this question on Monday with the solution I eventually come up with. Thank you, everyone, for your assistance so far; I will assess each answer further on Monday.

Jake Smith asked Jan 24 '14 21:01



2 Answers

You should probably read this article: "Regular Expression for Comma Based Splitting Ignoring Commas inside Quotes". Although it is written for Java, the regular expression is the same.
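As a rough illustration of the idea in C#, here is a minimal sketch using a commonly seen lookahead pattern. It is not necessarily the exact expression from the article, and the class and method names are hypothetical. The pattern matches a comma only when an even number of double quotes follows it, which means the comma lies outside any quoted field.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteAwareSplit
{
    // Assumed pattern (not copied from the article): a comma followed by
    // zero or more complete pairs of quotes, then no further quotes.
    private static readonly Regex Splitter =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    // Splits one CSV line on commas outside quotes.
    // The surrounding quotes are left on the fields.
    public static string[] SplitLine(string line)
    {
        return Splitter.Split(line);
    }

    public static void Main()
    {
        string[] fields = SplitLine("79,\"1,013.42\",GW450");
        foreach (string field in fields)
            Console.WriteLine(field);
    }
}
```

For the sample row this yields three fields, with the second still wrapped in its quotes, so a separate step is needed to strip them. The sketch assumes quotes are balanced within a single line.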

Paweł Bejger answered Sep 28 '22 05:09


This is a fairly straightforward CSV reader implementation we use in a few projects here. It is easy to use and handles the cases you are talking about.

First, the Csv class:

public static class Csv
{
    public static string Escape(string s)
    {
        if (s.Contains(QUOTE))
            s = s.Replace(QUOTE, ESCAPED_QUOTE);

        if (s.IndexOfAny(CHARACTERS_THAT_MUST_BE_QUOTED) > -1)
            s = QUOTE + s + QUOTE;

        return s;
    }

    public static string Unescape(string s)
    {
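        // Require at least two characters so that a lone quote is not
        // treated as a quoted field (Substring would throw on it).
        if (s.Length >= 2 && s.StartsWith(QUOTE) && s.EndsWith(QUOTE))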
        {
            s = s.Substring(1, s.Length - 2);

            if (s.Contains(ESCAPED_QUOTE))
                s = s.Replace(ESCAPED_QUOTE, QUOTE);
        }

        return s;
    }


    private const string QUOTE = "\"";
    private const string ESCAPED_QUOTE = "\"\"";
    private static char[] CHARACTERS_THAT_MUST_BE_QUOTED = { ',', '"', '\n' };

}

Then a pretty nice reader implementation, if you need it. You should be able to do what you need with just the Csv class above.

using System.IO;
using System.Text.RegularExpressions;

public sealed class CsvReader : System.IDisposable
{
    public CsvReader(string fileName)
        : this(new FileStream(fileName, FileMode.Open, FileAccess.Read))
    {
    }

    public CsvReader(Stream stream)
    {
        __reader = new StreamReader(stream);
    }

    public System.Collections.IEnumerable RowEnumerator
    {
        get
        {
            if (null == __reader)
                throw new System.ApplicationException("I can't start reading without CSV input.");

            __rowno = 0;
            string sLine;
            string sNextLine;

            while (null != (sLine = __reader.ReadLine()))
            {
                while (rexRunOnLine.IsMatch(sLine) && null != (sNextLine = __reader.ReadLine()))
                    sLine += "\n" + sNextLine;

                __rowno++;
                string[] values = rexCsvSplitter.Split(sLine);

                for (int i = 0; i < values.Length; i++)
                    values[i] = Csv.Unescape(values[i]);

                yield return values;
            }

            __reader.Close();
        }

    }

    public long RowIndex { get { return __rowno; } }

    public void Dispose()
    {
        if (null != __reader) __reader.Dispose();
    }

    //============================================


    private long __rowno = 0;
    private TextReader __reader;
    private static Regex rexCsvSplitter = new Regex(@",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))");
    private static Regex rexRunOnLine = new Regex(@"^[^""]*(?:""[^""]*""[^""]*)*""[^""]*$");

}

Then you can use it like this.

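using (var reader = new CsvReader(new FileStream(file, FileMode.Open)))
    foreach (string[] row in reader.RowEnumerator) { /* each row is one line's unescaped values */ }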

Note: This would open an existing CSV file, but can be modified fairly easily to take a string[] like you need.
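One way to adapt this to text that is already in memory is to apply the reader's two steps, split then unescape, to a single line. The following is a minimal sketch under that assumption. CsvLine and Parse are hypothetical names; the pattern is the same as rexCsvSplitter above, and the unescape step mirrors Csv.Unescape with a length guard added.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CsvLine
{
    // Same splitter pattern as rexCsvSplitter in the CsvReader above.
    private static readonly Regex Splitter =
        new Regex(@",(?=(?:[^""]*""[^""]*"")*(?![^""]*""))");

    // Splits one line on commas outside quotes, then unescapes each value.
    public static string[] Parse(string line)
    {
        string[] values = Splitter.Split(line);
        for (int i = 0; i < values.Length; i++)
            values[i] = Unescape(values[i]);
        return values;
    }

    // Strips one pair of surrounding quotes and collapses doubled quotes.
    private static string Unescape(string s)
    {
        if (s.Length >= 2 && s[0] == '"' && s[s.Length - 1] == '"')
            s = s.Substring(1, s.Length - 2).Replace("\"\"", "\"");
        return s;
    }

    public static void Main()
    {
        string[] values = Parse("79,\"1,013.42\",GW450");
        Console.WriteLine(string.Join(" | ", values)); // 79 | 1,013.42 | GW450
    }
}
```

This sketch handles one line at a time, so a quoted value that spans several lines would still need the run-on handling that the reader does with rexRunOnLine.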

Evan L answered Sep 28 '22 04:09