How to efficiently determine if a string starts with a number and then get all following numbers up until the first non-numeric character?

Tags: string, c#

I have a requirement to sort some strings that contain data like this:

var strings = new List<string>{"2009 Arrears","2008 Arrears","2008 Arrears Interest","2009 Arrears Interest"};

The results need to be ordered like this:

  1. "2009 Arrears"
  2. "2009 Arrears Interest"
  3. "2008 Arrears"
  4. "2008 Arrears Interest"

It seems like I need a function that checks whether the string starts with a number. If it does, the function should extract all digits up to the first non-numeric character, so that I can sort by that number descending and then by the remaining characters ascending. I am having trouble writing a method that gets the leading digits of a string. What would be an efficient way to do that?

Rob Packwood asked Sep 30 '09 17:09


2 Answers

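// Returns the number formed by the leading digits of input,
// or -1 if input does not start with a digit.
public int GetLeadingNumber(string input)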
{
    char[] chars = input.ToCharArray();
    int lastValid = -1;

    for(int i = 0; i < chars.Length; i++)
    {
        if(Char.IsDigit(chars[i]))
        {
            lastValid = i;
        }
        else
        {
            break;
        }
    }

    if(lastValid >= 0)
    {
        return int.Parse(new string(chars, 0, lastValid + 1));
    }
    else
    {
        return -1;
    }
}

This is likely the most efficient approach in strict terms, but the regular expression solutions offered by other posters are more concise and could be clearer, depending on how much processing you'll do on the string.
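
For completeness, here is a minimal sketch of how a leading-number helper like the one above could drive the requested sort (number descending, then the remaining text ascending). The class name LeadingNumberSort and the helpers LeadingDigitCount and Sort are hypothetical names for illustration, and placing strings without a leading number last (as -1) is an assumption, not something the question specifies. The sketch tests for '0' to '9' directly because Char.IsDigit also returns true for non-ASCII decimal digits, and it uses int.TryParse so that an overly long digit run falls back to -1 instead of throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LeadingNumberSort
{
    // Number of ASCII digits at the start of the string (0 if none).
    public static int LeadingDigitCount(string input)
    {
        int count = 0;
        while (count < input.Length && input[count] >= '0' && input[count] <= '9')
        {
            count++;
        }
        return count;
    }

    // Leading number, or -1 if there are no leading digits or they overflow an int.
    public static int GetLeadingNumber(string input)
    {
        int count = LeadingDigitCount(input);
        if (count > 0 && int.TryParse(input.Substring(0, count), out int value))
        {
            return value;
        }
        return -1;
    }

    // Number descending, then the text after the number ascending.
    public static List<string> Sort(IEnumerable<string> items)
    {
        return items
            .OrderByDescending(s => GetLeadingNumber(s))
            .ThenBy(s => s.Substring(LeadingDigitCount(s)).TrimStart(), StringComparer.Ordinal)
            .ToList();
    }
}
```

With the list from the question, LeadingNumberSort.Sort(strings) should yield "2009 Arrears", "2009 Arrears Interest", "2008 Arrears", "2008 Arrears Interest".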

Adam Robinson answered Oct 31 '22 18:10


A regex would split this up nicely:

var match = Regex.Match(text, @"^(\d+) (.*)$");

Then match.Groups[1].Value is the year, and match.Groups[2].Value is the title ("Arrears", "Arrears Interest", etc.). Note that match.Groups[0] is the entire match, so the capture groups start at index 1.

You can use LINQ to apply the sort (year descending, title ascending):

string[] titles = new[] { "2008 Arrears", "2009 Arrears" };

var sortedTitles = 
    from title in titles
    let match = Regex.Match(title, @"^(\d+) (.*)$")
    orderby match.Groups[1].Value descending, match.Groups[2].Value
    select title;

listBox.ItemsSource = sortedTitles.ToArray();  // for example
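
Since Groups[0] of a Match is the whole matched text and the numbered captures start at 1, it may help to see the regex sort written out with the group indices spelled out and the year parsed as a number. This is only a sketch: RegexTitleSort, LeadingNumber, YearOf and Sort are hypothetical names, the pattern uses [0-9] and \s* instead of \d and a single space as a deliberate variation, and sending titles without a leading number to the end is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexTitleSort
{
    // Group 1: the leading digits. Group 2: whatever follows, minus leading whitespace.
    private static readonly Regex LeadingNumber = new Regex(@"^([0-9]+)\s*(.*)$");

    private static int YearOf(Match match)
    {
        if (match.Success && int.TryParse(match.Groups[1].Value, out int year))
        {
            return year;
        }
        return -1; // assumption: no usable leading number sorts last
    }

    // Year descending (numeric), then title ascending.
    public static string[] Sort(IEnumerable<string> titles)
    {
        return titles
            .Select(title => new { title, match = LeadingNumber.Match(title) })
            .OrderByDescending(x => YearOf(x.match))
            .ThenBy(x => x.match.Success ? x.match.Groups[2].Value : x.title, StringComparer.Ordinal)
            .Select(x => x.title)
            .ToArray();
    }
}
```

Checking match.Success before reading the groups keeps a title that does not start with a digit from being treated as if it had an empty year.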

A regex may not be the fastest solution; here's an alternative that's still kept nice and clean with LINQ:

var sortedTitles =
    from title in titles
    let year = new string(title.TakeWhile(ch => char.IsDigit(ch)).ToArray())
    let remainder = title.Substring(year.Length).Trim()
    orderby year descending, remainder
    select title;
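
One caveat with the query above: year is a string, so orderby compares it as text. That is fine when every leading number has the same number of digits (as with four-digit years), but text order can disagree with numeric order otherwise. A small sketch of the difference follows; YearOrdering, Digits, ByYearText and ByYearNumber are hypothetical names, the sample titles in the note below are made up, and the text variant uses an ordinal comparer to keep the result culture-independent.

```csharp
using System;
using System.Linq;

public static class YearOrdering
{
    // Leading ASCII digits of a title, as a string (empty if none).
    private static string Digits(string title) =>
        new string(title.TakeWhile(c => c >= '0' && c <= '9').ToArray());

    // Compares the leading digits as text.
    public static string[] ByYearText(string[] titles) =>
        titles.OrderByDescending(t => Digits(t), StringComparer.Ordinal).ToArray();

    // Compares the leading digits as a number; -1 when there is nothing parseable.
    public static string[] ByYearNumber(string[] titles) =>
        titles.OrderByDescending(t => int.TryParse(Digits(t), out int n) ? n : -1).ToArray();
}
```

For the titles "999 Old" and "2009 Arrears", ByYearText puts "999 Old" first because '9' sorts after '2', while ByYearNumber puts "2009 Arrears" first.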

Ben M answered Oct 31 '22 18:10