
Optimized Generic List Split

Read the edit below for more information.

I use the code below to split a generic list of objects wherever an item is a token of a certain type.

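    public static IEnumerable<object>[] Split(this IEnumerable<object> tokens, TokenType type) {
        // Outer list of groups; it always starts with one empty group.
        List<List<object>> t = new List<List<object>>();
        int currentT = 0; // index of the group currently being filled
        t.Add(new List<object>());
        foreach (object list in tokens) {
            if ((list is Token) && (list as Token).TokenType == type) {
                // A Token of the requested type is a separator: start a new group.
                currentT++;
                t.Add(new List<object>());
            }
            else if ((list is TokenType) && ((TokenType)list) == type) {
                // A bare TokenType value of the requested type is also a separator.
                currentT++;
                t.Add(new List<object>());
            }
            else {
                t[currentT].Add(list);
            }
        }

        return t.ToArray();
    }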

I don't have a clear question so much as I am curious whether anyone knows of ways to optimize this code. I call it many times and it seems to be quite the beast as far as clock cycles go. Any ideas? I can also make it a Wiki if anyone is interested; maybe we can keep track of the latest changes.

Update: I'm trying to parse out specific tokens. It's a list containing Token objects and objects of some other class. Token has a property TokenType (an enum). I need to find all the Token instances and split on each of them.

{a b c T d e T f g h T i j k l T m} 

would split like

{a b c}{d e}{f g h}{i j k l}{m}
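
The expected behaviour can be pinned down with a small check. The sketch below is only an illustration, not the code from the question: Token and TokenType are hypothetical stand-ins (the real classes are not shown, and the Separator member is invented), SplitDemo.Groups is a made-up helper that renders the groups as text, and only the Token case is handled.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the real classes, which the question does not show.
public enum TokenType { Separator, Other }

public class Token
{
    public TokenType TokenType { get; set; }
}

public static class SplitDemo
{
    // Splits on every Token of the given type and renders each group in braces.
    public static string Groups(IEnumerable<object> tokens, TokenType type)
    {
        var groups = new List<List<object>> { new List<object>() };
        foreach (object item in tokens)
        {
            var token = item as Token;
            if (token != null && token.TokenType == type)
                groups.Add(new List<object>());        // separator: start a new group
            else
                groups[groups.Count - 1].Add(item);    // ordinary item: append to the last group
        }
        return string.Concat(
            groups.Select(g => "{" + string.Join(" ", g.Select(x => x.ToString())) + "}"));
    }
}
```

With T standing for new Token { TokenType = TokenType.Separator } and the letters as plain strings, passing the sequence a b c T d e T f g h T i j k l T m to SplitDemo.Groups returns the string {a b c}{d e}{f g h}{i j k l}{m}.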

EDIT UPDATE: It seems that all of my speed problems come from the constant creation of, and addition to, generic lists. Does anyone know how I can avoid that? This is the profile of what is happening, if it helps anyone.

[Image: profiler output]

Dested asked Dec 05 '22 03:12



1 Answer

Your code looks fine.

My only suggestion would be replacing IEnumerable<object> with the non-generic IEnumerable (in System.Collections).

EDIT:

On further inspection, you're casting more times than necessary.

Replace the if with the following code:

    var token = list as Token;
    if (token != null && token.TokenType == type) {

Also, you can get rid of your currentT variable by writing t[t.Count - 1] or t.Last(). This will make the code clearer, but might have a tiny negative effect on performance.
Alternatively, you could store a reference to the inner list in a variable and use it directly. (This will slightly improve performance.)
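
As a rough sketch of the two changes so far (one as cast, plus a variable holding the inner list), the method could look something like this. The names t and list are kept from the question; current is a name introduced here, and the Token and TokenType definitions are hypothetical stand-ins because the real ones are not shown.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins for the question's types (the Separator member is invented).
public enum TokenType { Separator, Other }
public class Token { public TokenType TokenType { get; set; } }

public static class TokenSplitter
{
    public static IEnumerable<object>[] Split(this IEnumerable<object> tokens, TokenType type)
    {
        var t = new List<List<object>>();
        var current = new List<object>();   // reference to the group being filled; replaces t[currentT]
        t.Add(current);

        foreach (object list in tokens)
        {
            var token = list as Token;      // one cast instead of "is" followed by "as"
            bool isSeparator =
                (token != null && token.TokenType == type) ||
                (list is TokenType && (TokenType)list == type);

            if (isSeparator)
            {
                current = new List<object>();   // start a new group and remember it
                t.Add(current);
            }
            else
            {
                current.Add(list);              // no indexer lookup on the outer list
            }
        }

        return t.ToArray();
    }
}
```

Whether this is measurably faster depends on the input, so it is worth profiling before and after.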

Finally, if you can change the return type to List<List<Object>>, you can return t directly; this will avoid an array copy and will be noticeably faster for large lists.

By the way, your variable names are confusing; you should swap the names of t and list.
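
Putting the suggestions in this answer together, one possible version is sketched below. It is an illustration under assumptions rather than a drop-in replacement: SplitToLists, groups, current and item are names chosen here, Token and TokenType are hypothetical stand-ins for the real classes, and callers would have to accept a List<List<object>> instead of an array.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-ins for the question's types (the Separator member is invented).
public enum TokenType { Separator, Other }
public class Token { public TokenType TokenType { get; set; } }

public static class TokenListSplitter
{
    // Takes the non-generic IEnumerable and returns the list of groups directly.
    public static List<List<object>> SplitToLists(this IEnumerable tokens, TokenType type)
    {
        var groups = new List<List<object>>();
        var current = new List<object>();
        groups.Add(current);

        foreach (object item in tokens)
        {
            var token = item as Token;   // single cast
            if ((token != null && token.TokenType == type) ||
                (item is TokenType && (TokenType)item == type))
            {
                current = new List<object>();   // separator: start a new group
                groups.Add(current);
            }
            else
            {
                current.Add(item);
            }
        }

        return groups;   // returned as-is, so no ToArray() copy of the outer list
    }
}
```

Because the parameter is the non-generic IEnumerable, the method can be called on an ArrayList or an object[] as well as on a List<object>.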

SLaks answered Dec 25 '22 03:12
