
String Builder vs Lists

I am reading in multiple files with millions of lines, and I am creating a list of all line numbers that have a specific issue, for example a field that is left blank or contains an invalid value.

So my question is: what would be the most efficient data type to keep track of a list of line numbers that could grow to upwards of a million entries? Would a StringBuilder, a List, or something else be more efficient?

My end goal is to output a message like "Specific field is blank on 1-32, 40, 45, 47, 49-51", etc. In the case of a StringBuilder, I would check the previous value: if the new line number is only 1 more, I would change it from 1 to 1-2, and if it is more than 1 higher, I would separate it with a comma. With the List, I would just add each number to the list and then combine them once the file has been completely read. However, in this case I could have multiple lists containing millions of numbers.

Here is the current code I am using to combine a list of numbers using String Builder:

string currentLine = sbCurrentLineNumbers.ToString();
string currentLineSub;

StringBuilder subCurrentLine = new StringBuilder();
StringBuilder subCurrentLineSub = new StringBuilder();

int indexLastSpace = currentLine.LastIndexOf(' ');
int indexLastDash = currentLine.LastIndexOf('-');

int currentStringInt = 0;

if (sbCurrentLineNumbers.Length == 0)
{
    sbCurrentLineNumbers.Append(lineCount);
}
else if (indexLastSpace == -1 && indexLastDash == -1)
{
    currentStringInt = Convert.ToInt32(currentLine);

    if (currentStringInt == lineCount - 1)
        sbCurrentLineNumbers.Append("-" + lineCount);
    else
    {
        sbCurrentLineNumbers.Append(", " + lineCount);
        commaCounter++;
    }
}
else if (indexLastSpace > indexLastDash)
{
    currentLineSub = currentLine.Substring(indexLastSpace);
    currentStringInt = Convert.ToInt32(currentLineSub);

    if (currentStringInt == lineCount - 1)
        sbCurrentLineNumbers.Append("-" + lineCount);
    else
    {
        sbCurrentLineNumbers.Append(", " + lineCount);
        commaCounter++;
    }
}
else if (indexLastSpace < indexLastDash)
{
    currentLineSub = currentLine.Substring(indexLastDash + 1);
    currentStringInt = Convert.ToInt32(currentLineSub);

    string charOld = currentLineSub;
    string charNew = lineCount.ToString();

    if (currentStringInt == lineCount - 1)
        sbCurrentLineNumbers.Replace(charOld, charNew);
    else
    {
        sbCurrentLineNumbers.Append(", " + lineCount);
        commaCounter++;
    }
}   
buzzzzjay asked Oct 14 '25 23:10


1 Answer

My end goal is to output a message like "Specific field is blank on 1-32, 40, 45, 47, 49-51"

If that's the end goal, there is no point in going through an intermediate representation such as a List<int>; just go with a StringBuilder. You will save on memory and CPU that way.
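
One way this could look, as a rough sketch rather than a drop-in replacement for the code in the question: keep the first and last line number of the current run in two ints, and only append to the StringBuilder when a run ends. That avoids calling ToString() on the whole builder and re-parsing its tail for every line. The class name LineRangeBuilder and its members are made up for illustration, and the sketch assumes line numbers arrive in strictly increasing order, as they would when a file is read top to bottom.

```csharp
using System;
using System.Text;

// Hypothetical helper (not from the question): collapses ascending line
// numbers into text such as "1-3, 7, 9-10" without re-parsing the builder.
public sealed class LineRangeBuilder
{
    private readonly StringBuilder _closed = new StringBuilder(); // finished runs
    private int _start;    // first line number of the run still being extended
    private int _end;      // last line number of that run
    private bool _hasRun;  // false until the first line number is added

    // Assumption: lineNumber is always greater than any value added before.
    public void Add(int lineNumber)
    {
        if (_hasRun && lineNumber == _end + 1)
        {
            _end = lineNumber;                  // extend the open run; no string work
            return;
        }

        if (_hasRun)
            AppendRun(_closed, _start, _end);   // the previous run is finished

        _start = _end = lineNumber;             // open a new run
        _hasRun = true;
    }

    // Returns the closed runs plus the still-open run, without modifying state,
    // so it is safe to keep calling Add afterwards.
    public override string ToString()
    {
        if (!_hasRun)
            return string.Empty;

        var result = new StringBuilder(_closed.ToString());
        AppendRun(result, _start, _end);
        return result.ToString();
    }

    private static void AppendRun(StringBuilder sb, int start, int end)
    {
        if (sb.Length > 0)
            sb.Append(", ");
        sb.Append(start);
        if (end > start)
            sb.Append('-').Append(end);
    }
}
```

You would create one LineRangeBuilder per kind of issue, call Add(lineCount) whenever a line has that issue, and call ToString() once the file has been read. Memory then grows with the number of separate runs rather than with the number of flagged lines.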

Oded answered Oct 17 '25 13:10


