Which of the following collection types do you use in your JPA domain model and why:
java.util.Collection
java.util.List
java.util.Set
I was wondering whether there are some ground rules for this.
UPDATE: I know the difference between a Set and a List. A List allows duplicates and has an order, whereas a Set cannot contain duplicate elements and does not define an order. I'm asking this question in the context of JPA. If you strictly follow the definition, you should always end up using the Set type, since your collection is stored in a relational database, where you can't have duplicates and where you have to define an order yourself, i.e. the order in your Java List is not necessarily preserved in the DB.
For example, most of the time I'm using the List type, not because it has an order or allows duplicates (which I can't have anyway), but because some of the components in my component library require a list.
JPA allows three kinds of objects to be stored in mapped collections: basic types, entities, and embeddables.
The Java™ Persistence API (JPA) provides a mechanism for managing persistence and object-relational mapping, and has been available since the EJB 3.0 specification. The JPA specification defines the object-relational mapping itself, rather than relying on vendor-specific mapping implementations.
Many developers also consider JPA better suited to more sophisticated applications, whereas JDBC is often considered the preferable alternative when an application uses a simple database and there is no plan to migrate it to a different database vendor.
Hibernate is an implementation of JPA and therefore follows the common standard that JPA defines. JPA is a standard API for performing database operations; it is used to map Java data types to SQL data types and database tables.
As your own question suggests, the key is the domain, not JPA. JPA is just a framework, which you can (and should) use in the way that best fits your problem. Choosing a suboptimal solution because of a framework (or its limits) is usually a warning bell.
When I need a set and never care about order, I use a Set. When order is important for some reason (ordered list, ordering by date, etc.), I use a List.
You seem to be well aware of the difference between Collection, Set, and List. Which one to use depends only on your needs. You can use them to communicate to users of your API (or to your future self) the properties of your collection (which may be subtle or implicit).
This follows exactly the same rules as using different collection types anywhere else in your code. You could use Object or Collection for all your references, yet in most cases you use more concrete types.
For example, when I see a List, I know it comes sorted in some way, and that duplicates are either acceptable or irrelevant for this case. When I see a Set, I usually expect it to have no duplicates and no specific order (unless it's a SortedSet). When I see a Collection, I don't expect anything more from it than to contain some entities.
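The expectations described above come straight from the collection contracts, independent of JPA. A minimal sketch in plain Java (the class name `CollectionContractDemo` and the helper `addTags` are made up for illustration) shows how the declared type changes what a caller can rely on:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class, not part of any JPA API.
class CollectionContractDemo {

    // Adds three tags, one of them twice, and reports the resulting size.
    // Through the Collection type the caller can assume nothing about
    // duplicates or ordering.
    static int addTags(Collection<String> target) {
        target.add("java");
        target.add("jpa");
        target.add("java"); // duplicate on purpose
        return target.size();
    }

    public static void main(String[] args) {
        List<String> asList = new ArrayList<>();
        Set<String> asSet = new LinkedHashSet<>();

        System.out.println(addTags(asList)); // the List keeps the duplicate
        System.out.println(addTags(asSet));  // the Set drops it
        System.out.println(asList.get(2));   // positional access exists only on List
    }
}
```

Declaring a field as List, Set, or Collection in an entity documents the same promises to whoever reads the domain model.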
Regarding list ordering: yes, it can be preserved (for example with @OrderColumn). And even if it's not and you just use @OrderBy, it can still be useful. Think of an event log sorted by timestamp by default. Artificially reordering the list makes little sense, but it is still useful that it comes sorted by default.
The question of using a Set or a List is much more difficult, I think, at least when you use Hibernate as the JPA implementation. If you use a List in Hibernate, it automatically switches to the "bag" paradigm, where duplicates CAN exist.
And that decision has a significant influence on the queries Hibernate executes. Here is a little example:
There are two entities, Employee and Company, in a typical many-to-many relation. To map those entities to each other, a join table (let's call it "employeeCompany") exists.
You choose the data type List on both entities (Company/Employee).
So if you now decide to remove employee Joe from CompanyXY, Hibernate executes the following queries:
delete from employeeCompany where employeeId = Joe;
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXA);
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXB);
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXC);
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXD);
insert into employeeCompany(employeeId,companyId) values (Joe,CompanyXE);
And now the question: why the hell doesn't Hibernate execute just this one query?
delete from employeeCompany where employeeId = Joe AND companyId = CompanyXY;
The answer is simple (and thanks a lot to Nirav Assar for his blog post): it can't. In a world of bags, deleting all rows and re-inserting all the remaining ones is the only proper way! Read the post for more clarification: http://assarconsulting.blogspot.fr/2009/08/why-hibernate-does-delete-all-then-re.html
Now the big conclusion:
If you choose a Set instead of a List in your Employee/Company entities, you don't have that problem and only one query is executed!
Why is that? Because Hibernate is no longer in a world of bags (as you know, a Set allows no duplicates), so executing only one query is now possible.
So the decision between List and Set is not that simple, at least when it comes to queries and performance!
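The reasoning above can be sketched as a toy model in plain Java. This is not Hibernate's code, and the class and method names (`JoinTableUpdateSketch`, `statementsAfterRemoval`) are invented for illustration. It only mirrors the argument: in a bag, duplicate join-table rows cannot be told apart, so the owner's rows are wiped and rebuilt; in a set, the pair (employeeId, companyId) identifies exactly one row, so a single delete is enough.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: this is NOT Hibernate code, just the reasoning from the text.
class JoinTableUpdateSketch {

    // Returns the statements a simplified ORM could issue after `removed`
    // was taken out of an employee's collection; `remaining` is what is left.
    static List<String> statementsAfterRemoval(String employee, String removed,
                                               Collection<String> remaining) {
        List<String> sql = new ArrayList<>();
        if (remaining instanceof Set) {
            // Set semantics: (employeeId, companyId) is unique, so one row can be targeted.
            sql.add("delete from employeeCompany where employeeId = " + employee
                    + " and companyId = " + removed);
        } else {
            // Bag semantics: duplicate rows are indistinguishable, so wipe and rebuild.
            sql.add("delete from employeeCompany where employeeId = " + employee);
            for (String company : remaining) {
                sql.add("insert into employeeCompany(employeeId,companyId) values ("
                        + employee + "," + company + ")");
            }
        }
        return sql;
    }

    public static void main(String[] args) {
        List<String> bag = Arrays.asList(
                "CompanyXA", "CompanyXB", "CompanyXC", "CompanyXD", "CompanyXE");
        Set<String> set = new LinkedHashSet<>(bag);

        // List (bag): one delete plus one insert per remaining company.
        statementsAfterRemoval("Joe", "CompanyXY", bag).forEach(System.out::println);
        System.out.println("---");
        // Set: a single targeted delete.
        statementsAfterRemoval("Joe", "CompanyXY", set).forEach(System.out::println);
    }
}
```

With five remaining companies the bag branch produces six statements and the set branch produces one, which is the difference the answer describes. What a real Hibernate version emits for a given mapping is best checked by enabling SQL logging.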