
Trim String value with particular pattern in C#.NET

I have a string that is 900-1000 characters long. It follows this pattern:

"Number:something,somestringNumber:something,somestring"

and so on. An example string:

"23:value,ordernew14:valueagain,orderagain"

The requirement is that whenever the string grows beyond 1000 characters, I have to remove the first 500 characters. Then, if the result does not start with a number, I have to keep removing characters until the first character is a digit.

sortinfo = sortinfo.Remove(0, 500);
sortinfo = new string(sortinfo.SkipWhile(c => !char.IsDigit(c)).ToArray());

I am able to do this with the code above.

In the example above, if I remove 5 characters, the output is

14:valueagain,orderagain

which is perfectly fine. But if the string has the value

23:value,or3dernew14:valueagain,orderagain

and I remove 5 characters, the output is

3dernew14:valueagain,orderagain

whereas the requirement is to get

14:valueagain,orderagain

so everything breaks because the result is not in the correct format. Please help: how can I do this?

My full code:

using System;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        string str;
        str=TrimSortInfo("23:value,ord4er24:valueag4ain,order6again15:value,order"); // breaking value
        //str = TrimSortInfo("23:value,order24:valueagain,orderagain15:value,order"); //working value
        Console.WriteLine(str);
        Console.ReadLine();

    }

    static string TrimSortInfo(string sortinfo)
    {
        if (sortinfo.Length > 15)
        {
            sortinfo = sortinfo.Remove(0, 15);
            sortinfo = new string(sortinfo.SkipWhile(c => !char.IsDigit(c))
                         .ToArray());
            return sortinfo;
        }
        return sortinfo;
    }
}
Prashant asked Jun 18 '18 09:06


1 Answer

Using a regex:

// Requires: using System.Text.RegularExpressions;
static Regex rx = new Regex("(?<=.*?)[0-9]+:.*");

static string TrimSortInfo(string sortinfo, int trimLength = 15)
{
    if (sortinfo.Length > trimLength)
    {
        return rx.Match(sortinfo, trimLength).Value;
    }
    return sortinfo;
}

Note that there is a real risk here: the cut can land in the middle of a number.

For example, "xxxxxxxxxxxxxx24:something" could be trimmed to "4:something".

The regex means: look for a sequence of one or more digits ([0-9]+), followed by a :, followed by all the remaining characters (.*). The lookbehind (?<=.*?) says that any characters, as few as possible, may precede this sequence; being a lookbehind (?<=...), it is not part of the match.

In the end the regex can be simplified to:

static Regex rx = new Regex("[0-9]+:.*");

because the pattern is unanchored, so the engine returns the first position, at or after the start index, where it matches.
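As a quick check, the sketch below runs the simplified pattern against the question's sample string and against the mid-number case mentioned earlier. The class name SimplifiedRegexDemo and the helper TrimFrom are made up for illustration; only Regex.Match(string, int) comes from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

static class SimplifiedRegexDemo
{
    // Simplified pattern: one or more digits, a colon, then the rest of the string.
    static readonly Regex Rx = new Regex("[0-9]+:.*");

    // Hypothetical helper: searches from index `start` and returns the match
    // (an empty string if nothing matches).
    public static string TrimFrom(string input, int start)
    {
        return Rx.Match(input, start).Value;
    }

    static void Main()
    {
        // The stray '3' in "or3dernew" is skipped because no ':' follows it.
        Console.WriteLine(TrimFrom("23:value,or3dernew14:valueagain,orderagain", 5));
        // prints 14:valueagain,orderagain

        // The risk: index 15 falls inside "24", so only the '4' survives.
        Console.WriteLine(TrimFrom(new string('x', 14) + "24:something", 15));
        // prints 4:something
    }
}
```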

To avoid trimming in the middle of a number:

static Regex rx = new Regex("(?:[^0-9])([0-9]+:.*)");

static string TrimSortInfo(string sortinfo, int trimLength = 15)
{
    if (sortinfo.Length > trimLength)
    {
        return rx.Match(sortinfo, trimLength - 1).Groups[1].Value;
    }
    return sortinfo;
}

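We cheat a little. To trim 15 characters, we skip only 14 (trimLength - 1), then match one non-digit character in a non-capturing group (?:[^0-9]), which we ignore, followed by the captured group ([0-9]+:.*): the digits, the :, and everything else. If the cut lands inside a number, the character before the remaining digits is itself a digit, so that partial entry is skipped and the match starts at the next number. Note the use of Groups[1].Value, which returns only the captured part.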
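The question describes two separate numbers (trim once the string exceeds 1000 characters, then drop the first 500), while the method above uses a single trimLength for both. A minimal sketch that keeps them apart might look like this; the class name SortInfoTrimmer and the parameters maxLength and removeCount are assumptions for illustration, not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class SortInfoTrimmer
{
    // Same pattern as above: one non-digit, then the captured "digits:rest".
    static readonly Regex Rx = new Regex("(?:[^0-9])([0-9]+:.*)");

    // maxLength: length above which trimming happens (assumed name).
    // removeCount: how many leading characters to drop (assumed name).
    public static string Trim(string sortinfo, int maxLength = 1000, int removeCount = 500)
    {
        if (removeCount < 1 || removeCount > maxLength)
            throw new ArgumentOutOfRangeException(nameof(removeCount));

        if (sortinfo.Length <= maxLength)
            return sortinfo;

        // Start one character early so the non-digit before the number can match.
        Match m = Rx.Match(sortinfo, removeCount - 1);

        // If no well-formed entry follows the cut, return an empty string.
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    static void Main()
    {
        Console.WriteLine(Trim("23:value,or3dernew14:valueagain,orderagain", 5, 5));
        // prints 14:valueagain,orderagain
    }
}
```

With the mid-number input from earlier (14 x characters followed by "24:something", trimming 15), this sketch returns an empty string, because the only entry is cut in half and nothing follows it.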

xanatos answered Sep 28 '22 19:09