When should we choose NHibernate over other ORMs?

I confess, I don't fully grok *Hibernate.

Since micro-ORMs like Dapper can be used to address most data access needs, what are the scenarios that require a big gun like NHibernate? What are some examples of situations where NHibernate would shine? To be clear, I don't consider "the ability to switch out your database without changing code" much of an advantage. In eight years of programming I've never actually had to do that, and it seems like a time-wasting idea to begin with.

I'm open to any thoughtful response but here are some examples of questions I have:

  1. When is the query API worth the extra work you have to put into mapping, compared to something like Dapper?
  2. How can you leverage lazy loading in a way that limits development effort and just works?
  3. When is it worth your time to figure out how to batch statements?
  4. In what scenario is the cache system better than, for example, page output caching? Is that only true in non-distributed environments?
  5. How is a mere mortal like me expected to understand how NHibernate would work in a distributed environment? Consider mixing caching, batching, and stateless sessions, and think about how load-balanced web servers would deal with all that.
asked Aug 01 '12 by Milimetric

2 Answers

Wow, big question. Not sure a mere mortal can answer it. But I think you are being too quick to dismiss "the ability to switch out your database". There are a lot of software packages, both commercial and open source, that offer the ability to work with different RDBMSes as a backing store. Managing the SQL for deploying to two or more database platforms can become an absolute nightmare, so having something build your SQL in a predictable way (at least compared to having it hand-written) is a huge advantage. Just playing devil's advocate: for some database platforms, I could also see growth in transaction throughput making it prohibitively expensive to stay with your original database choice. Most ORMs will help you with this in one way or another - and having a rich query API goes a long way when your database needs are sufficiently complex.

The short answer, I think, is that when an application's database needs reach a certain level of complexity, the cost of meeting your requirements without an ORM becomes higher than the cost of NHibernate's learning curve. I can't offer complete answers but will try to give my thoughts on your list items.

  1. When you are doing more than just CRUD. The need for complex queries on multiple database platforms is probably a good example. On this type of app you can almost end up maintaining two separate codebases (and they really become detached if you go the stored proc route), and there is value in keeping all your code in .NET (it's nice to be able to unit test these queries with the rest of your code, for example).
  2. Aside from problems seen in medium trust environments, I'm not sure what about lazy-loading doesn't "just work" now. The only problem with lazy-loading in my eyes is that you need to be aware of it to avoid some of the problems that can come with it when fetching large amounts of data, mostly the N+1 select problem.
  3. You don't need to figure out how to batch statements - you just need to set a configuration value and forget about it. This is a pretty huge optimization that NHibernate does for you with minimal effort - your code can be a lot cleaner when it is only directly concerned with operations and transaction control.
  4. Caching the data returned can be beneficial when you are rendering your pages differently for different users, or doing any kind of non-trivial processing in your domain layer. Even in basic scenarios, with page output caching you could end up having the edit page, the details page, etc. in your cache, whereas if you cache your data closer to the source you only need to cache the entity once. Caching closer to the source also gives you more protection from serving stale data. A data-oriented cache can also be shared across multiple applications, either via services or by pointing NHibernate to an out-of-process store like memcached or Redis. This can be tremendously valuable in some environments.
  5. I'm not sure you need to understand how it works (a lot of the time I use open-source libraries precisely to protect myself from needing to understand the implementation details of this kind of thing). But the short answer is that none of those behave any differently in a distributed scenario except for caching (and only the second-level cache there). As long as you use a distributed cache provider (or point all your servers to the same out-of-process cache provider) you should be good on that front as well.
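To make point 3 concrete: the "set a configuration value and forget about it" approach looks roughly like the sketch below. The `adonet.batch_size` property is NHibernate's standard setting for ADO.NET batching; the rest of the setup (mappings, connection string) is elided.

```csharp
// Hedged sketch: enabling statement batching in NHibernate via
// code-based configuration. With a batch size set, NHibernate groups
// pending INSERT/UPDATE/DELETE statements into batches instead of
// making one database round trip per statement.
var cfg = new NHibernate.Cfg.Configuration();
cfg.SetProperty("adonet.batch_size", "50"); // up to 50 statements per round trip
// ...add mappings, connection settings, etc. as usual...
var sessionFactory = cfg.BuildSessionFactory();
```

Your data access code doesn't change at all; the batching happens transparently when the session flushes.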

I'm only speaking of NHibernate, but I imagine for Hibernate the story is much the same. For larger-scale, more complex applications there can be a lot of benefit, but there is a lot of additional complexity you need to take on in order to reap it - and it's still probably less complexity than rolling your own solution to all the problems *Hibernate solves for you.

You also had a lot of questions around caching, it seems. I suggest reading over this link to get an idea of how the first- and second-level caches work. I won't try to explain here, because it sounds like you are after a deeper understanding than I can fit into this already lengthy reply :)
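For what it's worth, enabling the second-level cache is itself just configuration. A minimal sketch, using NHibernate's standard property names and the simple in-process `HashtableCacheProvider` that ships with the library (a distributed setup would substitute a memcached/Redis provider package):

```csharp
// Hedged sketch: switching on NHibernate's second-level cache.
// HashtableCacheProvider is the basic in-process provider bundled with
// NHibernate, useful for a single server or for experimentation.
var cfg = new NHibernate.Cfg.Configuration();
cfg.SetProperty("cache.use_second_level_cache", "true");
cfg.SetProperty("cache.provider_class",
    typeof(NHibernate.Cache.HashtableCacheProvider).AssemblyQualifiedName);
// Note: individual entities and collections still have to opt in to
// caching via their mappings; this only turns the machinery on.
```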

answered Sep 27 '22 by AlexCuse


NHibernate is big and powerful but you don't have to know everything about it to have success using it. To answer your questions:

  1. None of the .NET micro-ORMs have LINQ support that I know of; instead they rely on mixing SQL strings into your code. Building your queries with LINQ gives you type safety, compile-time checking, and great refactorability. Try refactoring a codebase with thousands of queries in it if all of the queries are SQL strings... yikes! And by refactoring, I mean something as simple as adding new columns or new tables, which happens all the time in an enterprise environment. Refactoring strings is possible - heck, that's what people who rely on stored procedures still have to do - but I sure wouldn't choose to do that if I had type safety at my disposal.

  2. With lazy loading, the main thing you have to keep in mind is avoiding a SELECT N+1 scenario. Any time you've got code doing a foreach loop over a domain object/entity, make sure that the query that populated the object(s) used the .Fetch() method, which simply creates a JOIN in SQL and populates any child objects. Otherwise, when you foreach over the object and dot into any child objects, the ORM will have to execute another SELECT statement to "fetch" the data. Basically, in NHibernate lingo, eager fetching is your friend.

  3. Batching is as easy as pie with NHibernate. In your NHibernate configuration, turn batching on and that's it. After that, if you do need to, you can adjust the batch size at runtime for particular queries.

  4. I've never made use of the second-level cache myself. I work in a big enterprise environment and our apps run very fast without any need for caching. The first-level cache in NHibernate, though, comes with no configuration needed and can simply be thought of as change tracking. Basically, internally NHibernate keeps a dictionary of which objects it has already retrieved from the database and which objects are pending to be saved/updated in the database. The first-level cache is something I don't ever really think about, but I guess it's good to know at a rudimentary level how it works.

  5. Again, I currently work in an enterprise environment and we have all sorts of applications using NHibernate; some pretty basic and others using all the powerful features that NHibernate has to offer. What I've typically seen in my experience, though, is that not every member of the team needs to be an NHibernate expert. Typically one to three developers will be very knowledgeable and everyone else will not worry about it and just create their entities and mappings and carry on with their programming. Once the infrastructure is in place and your organization has sorted out the patterns it wishes to use, everything is typically a breeze.
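Points 1 and 2 above can be sketched together. This is a hedged illustration using NHibernate's LINQ provider (`session.Query<T>()` and the fetch extensions from `NHibernate.Linq`); the `Order`/`OrderLine` entities and the open `session` are hypothetical stand-ins, not anything from the question:

```csharp
using System;
using System.Linq;
using NHibernate.Linq; // provides session.Query<T>() and Fetch/FetchMany

// Hedged sketch: a type-safe LINQ query with eager fetching.
var orders = session.Query<Order>()
    .Where(o => o.Total > 100m)   // checked against the entity at compile time
    .FetchMany(o => o.Lines)      // JOINs the child collection up front
    .ToList();

foreach (var order in orders)
    foreach (var line in order.Lines)      // already loaded by the JOIN:
        Console.WriteLine(line.Product);   // no extra SELECT per order (no N+1)
```

Rename the `Total` property and every query like this breaks the build instead of failing at runtime, which is the refactorability argument in point 1; drop the `FetchMany` call and the inner loop would trigger one SELECT per order, which is the N+1 trap in point 2.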

Additional thoughts:

One place NHibernate really shines is its ability to map any kind of crazy database design you throw at it. Now, I'm not saying it's going to be easy to map some crazy database design put together in the early '90s, where you have to join a stored procedure and another table together, but it is possible. I've done some crazy mappings in my day - some I even thought were impossible because the database was just not designed to do what we wanted it to do - but every time, with persistence, I've still managed to pull it off thanks to NHibernate's incredible flexibility in mapping both good and bad database design.

With micro-ORMs, you typically end up with a ton of SQL strings embedded alongside your code. How is that considered clean and efficient? It just seems like people are now putting into their code what they used to put in stored procedures. I actually do use micro-ORMs in some of my projects where it makes sense, but typically only when I'm querying over one table for some simple data - no complicated WHERE clauses.
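For contrast, the micro-ORM style described above looks roughly like this with Dapper's `Query<T>` extension (the `Order` class and the open `connection` are hypothetical stand-ins):

```csharp
using System.Linq;
using Dapper; // micro-ORM: maps raw SQL result rows onto objects

// Hedged sketch: the SQL lives in a string, so a renamed column or table
// is invisible to the compiler and only fails at runtime.
var orders = connection.Query<Order>(
    "SELECT Id, Total FROM Orders WHERE Total > @min",
    new { min = 100m }).ToList();
```

For a single-table lookup like this it's hard to beat for simplicity, which is exactly the "where it makes sense" case above; the pain only starts when there are thousands of such strings to keep in sync with a changing schema.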

To be fair, I am one of those people who has invested quite a bit of time learning the ins and outs of NHibernate but not because I needed to for work, but because I just wanted to. I work with plenty of people at work who use NHibernate every day and don't fully get it. But again, they don't need to. You just need to know a few basic things and you are good to go.

Hope this helps.

answered Sep 27 '22 by Randy Burden