
When to use each of T[], List<T>, IEnumerable<T>?

I usually find myself doing something like:

string[] things = arrayReturningMethod();
int index = things.ToList<string>().FindIndex((s) => s.Equals("FOO"));
//do something with index
return things.Distinct(); //which returns an IEnumerable<string>

and I find all this mixing of types and interfaces a bit confusing, and it tickles my potential-performance-problem antennae (which I ignore until proven right, of course).

Is this idiomatic and proper C# or is there a better alternative to avoid casting back and forth to access the proper methods to work with the data?

EDIT: The question is actually twofold:

  • When is it proper to use either the IEnumerable interface or an array or a list (or any other IEnumerable implementing type) directly (when accepting parameters)?

  • Should you freely move between IEnumerables (implementation unknown), Lists, and arrays, or is that non-idiomatic (there are better ways to do it), non-performant (not typically relevant, but it might be in some cases), or just plain ugly (unmaintainable, unreadable)?

Vinko Vrsalovic asked Aug 04 '10 23:08


4 Answers

In regards to performance...

  • Converting from List to T[] involves copying all the data from the original list to a newly allocated array.
  • Converting from T[] to List also involves copying all the data from the original array to a newly allocated List.
  • Converting from either List or T[] to IEnumerable is an upcast (an implicit reference conversion), which costs at most a few CPU cycles and copies nothing.
  • Converting from IEnumerable back to List is a downcast, which is a runtime type check and also only a few CPU cycles.
  • Converting from IEnumerable back to T[] is likewise a downcast.
  • You can't cast an IEnumerable to T[] or List unless it was a T[] or List respectively to begin with. You can use the ToArray or ToList functions, but those will also result in a copy being made.
  • Accessing all the values in order from start to end in a T[] will, in a straightforward loop, be optimized to use straightforward pointer arithmetic -- which makes it the fastest of them all.
  • Accessing all the values in order from start to end in a List involves a check on each iteration to make sure that you aren't accessing a value outside the array's bounds, and then the actual accessing of the array value.
  • Accessing all the values in an IEnumerable involves creating an enumerator object, calling its MoveNext() method, which advances the position, and then reading the Current property, which gives you the actual value and puts it in the variable that you specified in your foreach statement. Generally, this isn't as bad as it sounds.
  • Accessing an arbitrary value in an IEnumerable involves starting at the beginning and calling MoveNext() as many times as you need to get to that value. Generally, this is as bad as it sounds.
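
As a rough sketch of the enumerator bullets above, a foreach over an IEnumerable<T> expands to approximately the hand-written loop below. The class and method names (EnumerationSketch, SumWithForeach, SumByHand) are made up for illustration; GetEnumerator, MoveNext and Current are the real interface members.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerationSketch
{
    // What you normally write.
    public static int SumWithForeach(IEnumerable<int> values)
    {
        int sum = 0;
        foreach (int v in values)
        {
            sum += v;
        }
        return sum;
    }

    // Roughly what the compiler generates for the loop above.
    public static int SumByHand(IEnumerable<int> values)
    {
        int sum = 0;
        using (IEnumerator<int> e = values.GetEnumerator())
        {
            while (e.MoveNext())     // advance; returns false at the end
            {
                sum += e.Current;    // read the element at the current position
            }
        }
        return sum;
    }

    public static void Main()
    {
        IEnumerable<int> values = new List<int> { 1, 2, 3, 4 };
        Console.WriteLine(SumWithForeach(values)); // 10
        Console.WriteLine(SumByHand(values));      // 10
    }
}
```

Reaching the n-th element through the interface alone means running that MoveNext() loop n times, which is why random access on a bare IEnumerable is costly.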

In regards to idioms...

In general, IEnumerable is useful for public properties, function parameters, and often for return values, but only if you know that the values are going to be used sequentially.

For instance, suppose you had a function PrintValues. If it was written as PrintValues(List<T> values), it would only be able to deal with List values, so a caller holding a T[] would first have to convert it. The same goes for PrintValues(T[] values). But if it was PrintValues(IEnumerable<T> values), it would be able to deal with Lists, T[]s, stacks, hashtables, dictionaries, strings, sets, etc. -- any collection that implements IEnumerable, which is practically every collection.
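
A minimal sketch of that PrintValues idea follows. PrintValues is the name used above; the FormatValues helper and the wrapping class are assumptions added only so the behaviour is easy to check.

```csharp
using System;
using System.Collections.Generic;

public static class PrintValuesSketch
{
    // Hypothetical helper: joins any sequence into one comma-separated line.
    public static string FormatValues<T>(IEnumerable<T> values)
    {
        return string.Join(", ", values);
    }

    // Accepting IEnumerable<T> lets callers pass any collection unconverted.
    public static void PrintValues<T>(IEnumerable<T> values)
    {
        Console.WriteLine(FormatValues(values));
    }

    public static void Main()
    {
        PrintValues(new[] { 1, 2, 3 });              // T[]
        PrintValues(new List<string> { "a", "b" });  // List<T>
        PrintValues(new Stack<int>(new[] { 7 }));    // stack
        PrintValues("hi");                           // string is IEnumerable<char>
    }
}
```

None of the four calls needs a ToList() or ToArray() at the call site, which is the point of taking the interface as the parameter type.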

In regards to internal use...

  • Use a List only if you're not sure how many items will need to be in it.
  • Use a T[] if you know how many items will need to be in it, but need to access the values in an arbitrary order.
  • Stick with the IEnumerable if that's what you've been given and you just need to use it sequentially. Many functions will return IEnumerables. If you do need to access values from an IEnumerable in an arbitrary order, use ToArray().
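
The three cases above could look something like this. All names here (InternalUseSketch, CollectEvens, Squares, MiddleElement) are hypothetical and chosen only to illustrate each bullet.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InternalUseSketch
{
    // Unknown number of results up front: grow a List<T>.
    public static List<int> CollectEvens(IEnumerable<int> source)
    {
        var evens = new List<int>();
        foreach (int n in source)
        {
            if (n % 2 == 0) evens.Add(n);
        }
        return evens;
    }

    // Known count, arbitrary access order: fill a T[].
    public static int[] Squares(int count)
    {
        var squares = new int[count];
        for (int i = 0; i < count; i++) squares[i] = i * i;
        return squares;
    }

    // Random access needed on an IEnumerable<T>: copy once, then index.
    // Assumes a non-empty source; an empty one would throw on the indexer.
    public static int MiddleElement(IEnumerable<int> source)
    {
        int[] snapshot = source.ToArray();   // one pass, one copy
        return snapshot[snapshot.Length / 2];
    }

    public static void Main()
    {
        Console.WriteLine(CollectEvens(new[] { 1, 2, 3, 4 }).Count); // 2
        Console.WriteLine(Squares(4)[3]);                            // 9
        Console.WriteLine(MiddleElement(Enumerable.Range(1, 5)));    // 3
    }
}
```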

Also, note that casting is different from using ToArray() or ToList() -- the latter involves copying the values, which is indeed a performance and memory hit if you have a lot of elements. The former simply is to say that "A dog is an animal, so like any animal, it can eat" (upcast) or "This animal happens to be a dog, so it can bark" (downcast). Likewise, all Lists and T[]s are IEnumerables, but only some IEnumerables are Lists or T[]s.
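
The difference between casting and copying can be observed directly. The sketch below uses made-up method names (CastVersusCopySketch and its members); only object.ReferenceEquals, the `is` operator and LINQ's ToArray are real library or language features.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CastVersusCopySketch
{
    // A cast to the interface yields the very same object, nothing is copied.
    public static bool CastKeepsSameObject(List<string> list)
    {
        IEnumerable<string> sequence = list;   // implicit conversion to the interface
        return object.ReferenceEquals(list, sequence);
    }

    // Casting back only works if the object really has that type.
    public static bool IsActuallyAnArray(IEnumerable<string> sequence)
    {
        return sequence is string[];
    }

    // ToArray allocates a new array, so later changes to the list do not show up.
    // Assumes a non-empty list whose first element is not already "changed".
    public static bool ToArrayMakesIndependentCopy(List<string> list)
    {
        IEnumerable<string> sequence = list;
        string[] copy = sequence.ToArray();
        list[0] = "changed";
        return copy[0] != list[0];
    }

    public static void Main()
    {
        var list = new List<string> { "a", "b" };
        Console.WriteLine(CastKeepsSameObject(list));         // True
        Console.WriteLine(IsActuallyAnArray(list));           // False
        Console.WriteLine(ToArrayMakesIndependentCopy(list)); // True
    }
}
```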

Rei Miyasaka answered Oct 21 '22 22:10


A good rule of thumb is to always use IEnumerable (when declaring your variables/method parameters/method return types/properties/etc.) unless you have a good reason not to. It is by far the most type-compatible with other (especially extension) methods.

Kirk Woll answered Oct 21 '22 22:10


Well, you've got two apples and an orange that you are comparing.

The two apples are the array and the List.

  • An array in C# is a C-style array that has garbage collection built in. The upside of using them is that they have very little overhead, assuming you don't need to move things around. The downside is that their size is fixed, so adding or removing elements means allocating a new array and copying the contents over.

  • A List is a C#-style dynamic array (similar to the vector<> class in C++). There is more overhead, but Lists are more efficient when you add things a lot, because they keep spare capacity and grow it in large steps instead of reallocating on every add. The backing storage is still one contiguous array, so inserting or removing in the middle still shifts the following elements.

The best comparison I could give is saying that arrays are to Lists as strings are to StringBuilders.
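
To make the analogy concrete, the sketch below counts reallocations when appending to a List versus growing an array by one slot per add. The names (GrowthSketch, CountListReallocations, CountArrayReallocations) are hypothetical, and the exact List growth steps are a runtime implementation detail, so the code only observes Capacity rather than assuming particular values.

```csharp
using System;
using System.Collections.Generic;

public static class GrowthSketch
{
    // Counts how often List<T> had to replace its backing array.
    public static int CountListReallocations(int items)
    {
        var list = new List<int>();
        int reallocations = 0;
        int lastCapacity = list.Capacity;
        for (int i = 0; i < items; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                reallocations++;
                lastCapacity = list.Capacity;
            }
        }
        return reallocations;
    }

    // Growing a T[] one slot at a time allocates and copies on every add.
    public static int CountArrayReallocations(int items)
    {
        int[] array = new int[0];
        int reallocations = 0;
        for (int i = 0; i < items; i++)
        {
            Array.Resize(ref array, array.Length + 1); // new array plus copy
            array[i] = i;
            reallocations++;
        }
        return reallocations;
    }

    public static void Main()
    {
        Console.WriteLine(CountListReallocations(1000));  // far fewer than 1000
        Console.WriteLine(CountArrayReallocations(1000)); // 1000
    }
}
```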

The orange is 'IEnumerable'. This is not a concrete data type, but rather an interface. When a class implements the IEnumerable interface, objects of that class can be used in a foreach() loop.
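
As an illustration of that last point, here is a small made-up type (Countdown, not part of any library) that becomes usable in foreach simply by implementing IEnumerable<int>.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type: yields start, start - 1, ..., 1.
public sealed class Countdown : IEnumerable<int>
{
    private readonly int _start;

    public Countdown(int start)
    {
        _start = start;
    }

    public IEnumerator<int> GetEnumerator()
    {
        for (int i = _start; i >= 1; i--)
        {
            yield return i;   // the compiler builds the enumerator class for us
        }
    }

    // The non-generic interface must be implemented too; forward to the generic one.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class CountdownDemo
{
    public static void Main()
    {
        foreach (int n in new Countdown(3))
        {
            Console.WriteLine(n);   // 3, then 2, then 1
        }
    }
}
```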

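When you return a list where an IEnumerable is expected, you are not converting the list to an IEnumerable. A list already is an IEnumerable object. (In your example, though, Distinct() does build a new, lazily evaluated sequence rather than handing back the array itself.)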

EDIT: When to convert between the two:

It depends on the application. There is very little that can be done with an array that cannot be done with a List, so I would generally recommend the List. Probably the best thing to do is to make a design decision that you are going to use one or the other, that way you don't have to switch between the two. If you rely on an external library, abstract it away to maintain consistent usage.

Hope this clears a little bit of the fog.

riwalk answered Oct 21 '22 20:10


Looks to me like the problem is that you haven't bothered learning how to search an array. Hint: Array.IndexOf or Array.BinarySearch depending on whether the array is sorted.

You're right that converting to a list is a bad idea: it wastes space and time and makes the code less readable. Also, blindly upcasting to IEnumerable slows matters down and also completely prevents use of certain algorithms (such as binary search).
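
For reference, searching the array directly might look like the sketch below. Array.IndexOf and Array.BinarySearch are the framework methods named above; the wrapper names (ArraySearchSketch, FindUnsorted, FindSorted) and the choice of an ordinal comparer are assumptions for the example.

```csharp
using System;

public static class ArraySearchSketch
{
    // Linear search, works on any array; returns -1 when the value is absent.
    public static int FindUnsorted(string[] things, string target)
    {
        return Array.IndexOf(things, target);
    }

    // Binary search; the array must already be sorted by the same comparer.
    // Returns a negative number when the value is absent.
    public static int FindSorted(string[] sortedThings, string target)
    {
        return Array.BinarySearch(sortedThings, target, StringComparer.Ordinal);
    }

    public static void Main()
    {
        string[] things = { "BAR", "FOO", "QUX" };   // already in ordinal order
        Console.WriteLine(FindUnsorted(things, "FOO"));   // 1
        Console.WriteLine(FindSorted(things, "FOO"));     // 1
        Console.WriteLine(FindSorted(things, "ZZZ") < 0); // True
    }
}
```

Neither call allocates a List, so the ToList() round trip from the question is not needed just to get an index.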

Ben Voigt answered Oct 21 '22 22:10