Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Best Practice List/Array/ReadOnlyCollection creation (and usage)

My code is littered with collections - not an unusual thing, I suppose. However, usage of the various collection types isn't obvious nor trivial. Generally, I'd like to use the type that's exposes the "best" API, and has the least syntactic noise. (See Best practice when returning an array of values, Using list arrays - Best practices for comparable questions). There are guidelines suggesting what types to use in an API, but these are impractical in normal (non-API) code.

For instance:

new ReadOnlyCollection<Tuple<string,int>>(
    new List<Tuple<string,int>> {
        Tuple.Create("abc",3),
        Tuple.Create("def",37)
    }
)

List's are a very common datastructure, but creating them in this fashion involves quite a bit of syntactic noise - and it can easily get even worse (e.g. dictionaries). As it turns out, many lists are never changed, or at least never extended. Of course ReadOnlyCollection introduces yet more syntactic noise, and it doesn't even convey quite what I mean; after all ReadOnlyCollection may wrap a mutating collection. Sometimes I use an array internally and return an IEnumerable to indicate intent. But most of these approaches have a very low signal-to-noise ratio; and that's absolutely critical to understanding code.

For the 99% of all code that is not a public API, it's not necessary to follow Framework Guidelines: however, I still want a comprehensible code and a type that communicates intent.

So, what's the best-practice way to deal with the bog-standard task of making small collections to pass around values? Should array be preferred over List where possible? Something else entirely? What's the best way - clean, readable, reasonably efficient - of passing around such small collections? In particular, code should be obvious to future maintainers that have not read this question and don't want to read swathes of API docs yet still understand what the intent is. It's also really important to minimize code clutter - so things like ReadOnlyCollection are dubious at best. Nothing wrong with wordy types for major API's with small surfaces, but not as a general practice inside a large codebase.

What's the best way to pass around lists of values without lots of code clutter (such as explicit type parameters) but that still communicates intent clearly?

Edit: clarified that this is about making short, clear code, not about public API's.

like image 338
Eamon Nerbonne Avatar asked Jan 06 '11 12:01

Eamon Nerbonne


3 Answers

After hopefully understanding your question, i think you have to distinguish between what you create and manage within your class and what you make available to the outside world.

Within your class you can use whatever best fits your current task (pro/cons of List vs. Array vs. Dictionary vs. LinkedList vs. etc.). But this has maybe nothing to do about what you provide in your public properties or functions.

Within your public contract (properties and functions) you should give back the least type (or even better interface) that is needed. So just an IList, ICollection, IDictionary, IEnumerable of some public type. Thous leads that your consumer classes are just awaiting interfaces instead of concrete classes and so you can change the concrete implementation at a later stage without breaking your public contract (due to performance reasons use an List<> instead of a LinkedList<> or vice versa).

like image 127
Oliver Avatar answered Sep 28 '22 16:09

Oliver


Update:

So, this isn't strictly speaking new; but this question convinced me to go ahead and announce an open source project I've had in the works for a while (still a work in progress, but there's some useful stuff in there), which includes an IArray<T> interface (and implementations, naturally) that I think captures exactly what you want here: an indexed, read-only, even covariant (bonus!) interface.

Some benefits:

  • It's not a concrete type like ReadOnlyCollection<T>, so it doesn't tie you down to a specific implementation.
  • It's not just a wrapper (like ReadOnlyCollection<T>), so it "really is" read-only.
  • It clears the way for some really nice extension methods. So far the Tao.NET library only has two (I know, weak), but more are on the way. And you can easily make your own, too—just derive from ArrayBase<T> (also in the library) and override the this[int] and Count properties and you're done.

If this sounds promising to you, feel free to check it out and let me know what you think.


It's not 100% clear to me where you're worried about this "syntactic noise": in your code or in calling code?

If you're tolerant of some "noise" in your own encapsulated code then I would suggest wrapping a T[] array and exposing an IList<T> which happens to be a ReadOnlyCollection<T>:

class ThingsCollection
{
    ReadOnlyCollection<Thing> _things;

    public ThingsCollection()
    {
        Thing[] things = CreateThings();
        _things = Array.AsReadOnly(things);
    }

    public IList<Thing> Things
    {
        get { return _things; }
    }

    protected virtual Thing[] CreateThings()
    {
        // Whatever you want, obviously.
        return new Thing[0];
    }
}

Yes there is some noise on your end, but it's not bad. And the interface you expose is quite clean.

Another option is to make your own interface, something like IArray<T>, which wraps a T[] and provides a get-only indexer. Then expose that. This is basically as clean as exposing a T[] but without falsely conveying the idea that items can be set by index.

like image 32
Dan Tao Avatar answered Sep 28 '22 16:09

Dan Tao


I do not pass around Listss if I can possibly help it. Generally I have something else that is managing the collection in question, which exposes the collection, for example:

public class SomeCollection
{
    private List<SomeObject> m_Objects = new List<SomeObject>();

    // ctor
    public SomeCollection()
    {
        // Initialise list here, or wot-not/
    } // eo ctor


    public List<SomeObject> Objects { get { return m_Objects; } }
} // eo class SomeCollection

And so this would be the object passed around:

public void SomeFunction(SomeCollection _collection)
{
    // work with _collection.Objects
} // eo SomeFunction

I like this approach, because:

1) I can populate my values in the ctor. They're there the momeny anyone news SomeCollection.

2) I can restrict access, if I want, to the underlying list. In my example I exposed it all, but you don't have to do this. You can make it read-only if you want, or validate additions to the list, prior to adding them.

3) It's clean. Far easier to read SomeCollection than List<SomeObject> everywhere.

4) If you suddenly realise that your collection of choice is inefficient, you can change the underlying collection type without having to go and change all the places where it got passed as a parameter (can you imagine the trouble you might have with, say, List<String>?)

like image 33
Moo-Juice Avatar answered Sep 28 '22 16:09

Moo-Juice