Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

IEnumerable<IDisposable>: who disposes of what and when -- Did I get it right?

Here is a hypotetical scenario.

I have very large number of user names (say 10,000,000,000,000,000,000,000. Yes, we are in the intergalactic age :)). Each user has its own database. I need to iterate through the list of users and execute some SQL against each of the databases and print the results.

Because I learned the goodness of functional programming and because I deal with such a vast number of users, I decide to implement this using F# and pure sequences, (aka IEnumerable). And here I go.

// gets the list of user names
let users() : seq<string> = ...

// maps user name to the SqlConnection
let mapUsersToConnections (users: seq<string>) : seq<SqlConnection> = ...

// executes some sql against the given connection and returns some result
let mapConnectionToResult (conn) : seq<string> = ...

// print the result
let print (result) : unit = ...

// and here is the main program
users()
|> mapUsersToConnections
|> Seq.map mapConnectionToResult
|> Seq.iter print

Beautiful? Elegant? Absolutely.

But! Who and at what point disposes of the SqlConnections?

And I don't thing the answer mapConnectionToResult should do it is right, because it knows nothing about the lifetime of the connection it is given. And things may work or not work depending on how mapUsersToConnections is implemented and various other factors.

As mapUsersToConnections is the only other place which has access to connection it must be its responsibility to dispose of SQL connection.

In F#, this can be done like this:

// implementation where we return the same connection for each user
let mapUsersToConnections (users) : seq<SqlConnection> = seq {
    use conn = new SqlConnection()
    for u in users do
        yield conn
}


// implementation where we return new connection for each user
let mapUsersToConnections (users) : seq<SqlConnection> = seq {
    for u in users do
        use conn = new SqlConnection()
        yield conn
}

The C# equivalient would be:

// C# -- same connection for all users
IEnumerable<SqlConnection> mapUsersToConnections(IEnumerable<string> users)
{
    using (var conn = new SqlConnection())
    foreach (var u in users)
    {
        yield return conn;
    }
}

// C# -- new connection for each users
IEnumerable<SqlConnection> mapUsersToConnections(IEnumerable<string> user)
{
    foreach (var u in users)
    using (var conn = new SqlConnection())
    {
        yield return conn;
    }
}

The tests I performed suggest that the objects do get disposed correctly at correct points, even if executing stuff in parallel: once at the end of the whole iteration for the shared connection; and after each iteration cycle for the non-shared connection.

So, THE QUESTION: Did I get this right?

EDIT:

  1. Some answers kindly pointed out some errors in the code, and I made some corrections. The complete working example which compiles is below.

  2. The use of SqlConnection is for example purposes only, it's any IDisposable really.


Example which compiles

open System

// Stand-in for SqlConnection
type SimpeDisposable() =
    member this.getResults() = "Hello"
    interface IDisposable with
        member this.Dispose() = printfn "Disposing"

// Alias SqlConnection to our dummy
type SqlConnection = SimpeDisposable

// gets the list of user names
let users() : seq<string> = seq {
    for i = 0 to 100 do yield i.ToString()
}

// maps user names to the SqlConnections
// this one uses one shared connection for each user
let mapUsersToConnections (users: seq<string>) : seq<SqlConnection> = seq {
    use c = new SimpeDisposable()
    for u in users do
        yield c
}

// maps user names to the SqlConnections
// this one uses new connection per each user
let mapUsersToConnections2 (users: seq<string>) : seq<SqlConnection> = seq {
    for u in users do
        use c = new SimpeDisposable()
        yield c
}

// executes some "sql" against the given connection and returns some result
let mapConnectionToResult (conn:SqlConnection) : string = conn.getResults()

// print the result
let print (result) : unit = printfn "%A" result

// and here is the main program - using shared connection
printfn "Using shared connection"
users()
|> mapUsersToConnections
|> Seq.map mapConnectionToResult
|> Seq.iter print


// and here is the main program - using individual connections
printfn "Using individual connection"
users()
|> mapUsersToConnections2
|> Seq.map mapConnectionToResult
|> Seq.iter print

The results are:

Shared connection: "hello" "hello" ... "Disposing"

Individual connections: "hello" "Disposing" "hello" "Disposing"

like image 383
Philip P. Avatar asked Jul 21 '11 15:07

Philip P.


People also ask

What happens if you dont dispose IDisposable?

IDisposable is usually used when a class has some expensive or unmanaged resources allocated which need to be released after their usage. Not disposing an object can lead to memory leaks.

When to Dispose C#?

You should take advantage of the Dispose/Finalize pattern only when it is needed. To be more precise, you should use it only when your type invokes unmanaged code that allocates unmanaged resources (including unmanaged memory) and then it returns a handle that you must use eventually to release the resource.

What will happen if we call Dispose() method directly?

The Dispose() methodThe Dispose method performs all object cleanup, so the garbage collector no longer needs to call the objects' Object. Finalize override. Therefore, the call to the SuppressFinalize method prevents the garbage collector from running the finalizer. If the type has no finalizer, the call to GC.

Is IDisposable called automatically?

By default, the garbage collector automatically calls an object's finalizer before reclaiming its memory. However, if the Dispose method has been called, it is typically unnecessary for the garbage collector to call the disposed object's finalizer.


1 Answers

I'd avoid this approach since the structure would fail if an unwitting user of your library did something like

users()
|> Seq.map userToCxn
|> Seq.toList() //oops disposes connections
|> List.map .... // uses disposed cxns 
. . .. 

I'm not an expert on the issue, but I'd presume it's a best practice not to have sequences / IEnumerables muck with things after they yield them, for the reason that an intermediary ToList() call will produce different results than just acting on the sequence directly -- DoSomething(GetMyStuff()) will be different from DoSomething(GetMyStuff().ToList()).

Actually, why not just use sequence expressions for the whole thing, as that would get around this issue entirely:

seq{ for user in users do
     use cxn = userToCxn user
     yield cxnToResult cxn }

(Where userToCxn and cxnToResult are both simple one-to-one non-disposing functions). This seems more readable than anything and should yield the desired results, is parallelizable, and works for any disposable. This can be translated into C# LINQ using the following technique: http://solutionizing.net/2009/07/23/using-idisposables-with-linq/

from user in users
from cxn in UserToCxn(user).Use()
select CxnToResult(cxn)

Another take on this though would be to just define your "getSomethingForAUserAndDisposeTheResource" function first, and then use that as your fundamental building block:

let getUserResult selector user = 
    use cxn = userToCxn user
    selector cxn

Once you have this, then you can easily build up from there:

 //first create a selector
let addrSelector cxn = cxn.Address()
//then use it like this:
let user1Address1 = getUserResult addrSelector user1
//or more idiomatically:
let user1Address2 = user1 |> getUserResult addrSelector
//or just query dynamically!
let user1Address3 = user1 |> getUserResult (fun cxn -> cxn.Address())

//it can be used with Seq.map easily too.
let addresses1 = users |> Seq.map (getUserResult (fun cxn -> cxn.Address()))
let addresses2 = users |> Seq.map (getUserResult addrSelector)

//if you are tired of Seq.map everywhere, it's easy to create your own map function
let userCxnMap selector = Seq.map <| getUserResult selector
//use it like this:
let addresses3 = users |> userCxnMap (fun cxn -> cxn.Address())
let addresses4 = users |> userCxnMap addrSelector 

That way you aren't committed to retrieving the entire sequence if all you want is one user. I guess the lesson learned here is make your core functions simple and that makes it easier to build abstractions on top of it. And note none of these options will fail if you do a ToList somewhere in the middle.

like image 192
14 revs Avatar answered Jan 04 '23 09:01

14 revs