 

Should parameters/returns of collections be IEnumerable<T> or T[]?

Tags: .net, linq

As I've been adopting the LINQ mindset, I have been more and more inclined to pass collections around as IEnumerable<T>, which seems to form the basis of most LINQ operations.

However, I wonder, with the late evaluation of the IEnumerable<T> generic type, if that is a good idea. Does it make more sense to use T[]? IList<T>? Or something else?

Edit: The comments below are quite interesting. One thing that has not been addressed, though, is thread safety. If, for example, you take an IEnumerable<T> argument to a method and it gets enumerated in a different thread, then when that thread attempts to access it the results might be different from those that were meant to be passed in. Worse still, attempting to enumerate an IEnumerable<T> twice, I believe, throws an exception. Shouldn't we be striving to make our methods thread safe?

George Mauer asked Dec 28 '08 18:12



1 Answer

I went through a phase of passing around T[], and to cut a long story short, it's a pain in the backside. IEnumerable<T> is much better.

> However, I wonder, with the late evaluation of the IEnumerable<T> generic type, if that is a good idea. Does it make more sense to use T[]? IList<T>? Or something else?

Late evaluation is precisely why IEnumerable is so good. Here's an example workflow:

IEnumerable<string> files = FindFileNames();
IEnumerable<string> matched = files.Where( f => f.EndsWith(".txt") );
IEnumerable<string> contents = matched.Select( f => File.ReadAllText(f) );
bool foundContents = contents.Any( s => s.Contains("orion") );

For the impatient, this gets a list of filenames, keeps only the .txt files, then sets foundContents to true if any of those text files contains the word orion.

If you write the code using IEnumerable as above, each file is loaded only when Any asks for it, one at a time. Your memory usage will be quite low, and if the first file matches, none of the subsequent files is ever read. It's great.

If you wrote this exact same code using arrays, you'd end up loading all the file contents up front, and only then (if you have any RAM left) would any of them be scanned. Hopefully this gets the point across about why lazy lists are so good.
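The difference can be made visible with a small sketch. Everything named here is hypothetical: LazyVersusEager, its in-memory Names and FakeDisk, and the Load method are stand-ins for real files and File.ReadAllText, with a counter added so the number of loads can be observed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyVersusEager
{
    // Hypothetical stand-in for a directory listing.
    private static readonly string[] Names = { "a.txt", "b.txt", "c.log", "d.txt" };

    // Hypothetical stand-in for the files' contents.
    private static readonly Dictionary<string, string> FakeDisk = new Dictionary<string, string>
    {
        { "a.txt", "the hunter orion" },
        { "b.txt", "nothing here" },
        { "c.log", "orion in a log" },
        { "d.txt", "more text" },
    };

    // Number of "files" loaded by the most recent search.
    public static int Loads;

    // Plays the role of File.ReadAllText, but counts each call.
    public static string Load(string name)
    {
        Loads++;
        return FakeDisk[name];
    }

    // Deferred pipeline: Any() pulls one element at a time and stops at the first match.
    public static bool SearchLazily()
    {
        Loads = 0;
        IEnumerable<string> files = Names;
        IEnumerable<string> matched = files.Where(f => f.EndsWith(".txt"));
        IEnumerable<string> contents = matched.Select(f => Load(f));
        return contents.Any(s => s.Contains("orion"));
    }

    // Array pipeline: each ToArray() runs its step to completion before the next one starts.
    public static bool SearchEagerly()
    {
        Loads = 0;
        string[] files = Names.ToArray();
        string[] matched = files.Where(f => f.EndsWith(".txt")).ToArray();
        string[] contents = matched.Select(f => Load(f)).ToArray();
        return contents.Any(s => s.Contains("orion"));
    }
}
```

Both methods return true, but after SearchLazily() the Loads counter is 1, because the first .txt file already matches, while after SearchEagerly() it is 3, because every .txt file was loaded before the search began.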

> One thing that has not been addressed, though, is thread safety. If, for example, you take an IEnumerable<T> argument to a method and it gets enumerated in a different thread, then when that thread attempts to access it the results might be different from those that were meant to be passed in. Worse still, attempting to enumerate an IEnumerable<T> twice, I believe, throws an exception. Shouldn't we be striving to make our methods thread safe?

Thread safety is a giant red herring here.

If you used an array rather than an enumerable, it looks like it should be safer, but it's not. Most of the time when people return arrays of objects, they create a new array and then put the existing objects in it. The new array holds references to those same objects, so whoever receives it can still modify them, and you end up with precisely the kind of threading problems you're trying to avoid.

A partial solution is to return not an array of the original objects but an array of new or cloned objects, so other threads can't access the original ones. This is useful, but there's no reason an IEnumerable solution can't do the same. One is no more thread-safe than the other.
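As a rough illustration of that point, here is a sketch built on two hypothetical types, Account and Bank, which are invented for this example only. It shows a copied array that still shares its objects, and then the cloning approach done once with an array and once with an IEnumerable<T>.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical mutable type, used only for illustration.
public class Account
{
    public string Owner;
    public decimal Balance;

    public Account Clone()
    {
        return new Account { Owner = Owner, Balance = Balance };
    }
}

// Hypothetical owner of the original objects.
public class Bank
{
    private readonly List<Account> accounts = new List<Account>
    {
        new Account { Owner = "alice", Balance = 100m },
    };

    // New array, same objects: callers can still mutate the bank's accounts.
    public Account[] GetAccountsShared()
    {
        return accounts.ToArray();
    }

    // Array of copies: callers only ever touch their own objects.
    public Account[] GetAccountsCopiedArray()
    {
        return accounts.Select(a => a.Clone()).ToArray();
    }

    // The same idea exposed as IEnumerable<T>. ToList() takes the copies
    // immediately instead of whenever the caller happens to enumerate.
    public IEnumerable<Account> GetAccountsCopiedSequence()
    {
        return accounts.Select(a => a.Clone()).ToList();
    }

    public decimal BalanceOf(string owner)
    {
        return accounts.First(a => a.Owner == owner).Balance;
    }
}
```

Setting Balance on an element of GetAccountsShared() changes what BalanceOf reports, while doing the same to the results of either copying method leaves the bank's own data alone. This is only the partial solution described above: the sketch does no locking, so copying while another thread modifies the list would still need synchronization.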

Orion Edwards answered Oct 02 '22 06:10
