I've been trying to work out the difference between a data mapper and a repository, but so far I haven't managed to. As far as I can tell, the experts say "Repository is another layer of abstraction over the mapping layer where query construction code is concentrated". That seems understandable, but it is still very abstract. I read this question on Stack Overflow earlier, and it only made me more confused: How is the Data Mapper pattern different from the Repository Pattern?
I guess what I need are simple explanations and concrete, practical examples of how the two patterns differ: what a repository does that a data mapper doesn't, and vice versa. Does anyone know a good example that illustrates the concepts of data mapper and repository? Ideally it would be the same example, implemented once with a data mapper and once with a repository. Thanks, I'd really appreciate it. I'm still very confused at the moment...
This is useful when one needs to model and enforce strict business processes on the data in the domain layer that do not map neatly to the persistent data store. The layer is composed of one or more mappers (or Data Access Objects), performing the data transfer. Mapper implementations vary in scope.
The Data Mapper is a layer of software that separates the in-memory objects from the database. Its responsibility is to transfer data between the two and also to isolate them from each other.
Among ORMs, there are two very common philosophies or patterns: Active Record and Data Mapper. Each has advantages and disadvantages, and we'll explore them both in this article.
The Repository pattern. Repositories are classes or components that encapsulate the logic required to access data sources. They centralize common data access functionality, providing better maintainability and decoupling the infrastructure or technology used to access databases from the domain model layer.
Suppose your application manages Person objects, with each instance having name, age and jobTitle properties.
You would like to persist such objects, retrieve them from the persistence medium and maybe update (say, on their birthday, increment the age) or delete. These tasks are usually referred to as CRUD, from Create, Read, Update and Delete.
It is preferable to decouple your "business" logic from the logic that deals with the persistence of Person objects. This allows you to change the persistence logic (e.g. going from a DB to a distributed file system) without affecting your business logic.
You do this by encapsulating all persistence logic behind a Repository. A hypothetical PersonRepository (or Repository<Person>) would allow you to write code like this:
Person johnDoe = personRepository.get(p=> p.name == "John Doe");
johnDoe.jobTitle = "IT Specialist";
personRepository.update(johnDoe);
This is just business logic and doesn't care about how and where the object is stored.
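To make the snippet above concrete, here is a minimal Java sketch of what such a repository could look like. Everything in it is an assumption made for illustration: the PersonRepository interface, the list-backed InMemoryPersonRepository, the promote helper, and the sample values (age 41, "Technician"). It also returns an Optional from get, which the original snippet does not.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Domain object from the example; package-private fields keep the sketch short.
class Person {
    String name;
    int age;
    String jobTitle;

    Person(String name, int age, String jobTitle) {
        this.name = name;
        this.age = age;
        this.jobTitle = jobTitle;
    }
}

// Hypothetical contract: collection-like access, no mention of storage.
interface PersonRepository {
    void add(Person person);
    Optional<Person> get(Predicate<Person> criteria);
    void update(Person person);
}

// One possible implementation, backed by a plain list (an assumption of this sketch).
class InMemoryPersonRepository implements PersonRepository {
    private final List<Person> people = new ArrayList<>();

    @Override
    public void add(Person person) {
        people.add(person);
    }

    @Override
    public Optional<Person> get(Predicate<Person> criteria) {
        return people.stream().filter(criteria).findFirst();
    }

    @Override
    public void update(Person person) {
        // Objects are held by reference, so there is nothing to write back here.
        // A database-backed repository would hand the object to a DataMapper instead.
    }
}

class RepositoryDemo {
    // Business logic: depends only on the PersonRepository interface.
    static void promote(PersonRepository repository, String name, String newTitle) {
        Person person = repository.get(p -> p.name.equals(name))
                .orElseThrow(() -> new IllegalArgumentException("No such person: " + name));
        person.jobTitle = newTitle;
        repository.update(person);
    }

    public static void main(String[] args) {
        PersonRepository repository = new InMemoryPersonRepository();
        repository.add(new Person("John Doe", 41, "Technician"));
        promote(repository, "John Doe", "IT Specialist");
        System.out.println(repository.get(p -> p.name.equals("John Doe")).get().jobTitle);
    }
}
```

Swapping InMemoryPersonRepository for a database-backed class would not require any change to promote, which is the decoupling the answer describes.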
On the other side of the Repository, you use both a DataMapper and something that translates queries from the functional description (p=> p.name == "John Doe") to something that the persistence layer understands.
Your persistence layer can be a DB, in which case the DataMapper converts a Person object to and from a row in a PersonsTable. The query translator then converts the functional query into SELECT * FROM PersonsTable WHERE name = 'John Doe'.
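A rough Java sketch of the single-table case might look like the following. The PersonMapper and NameEquals classes, the use of a Map to stand in for a table row, and the column names are all assumptions for illustration. Translating an arbitrary lambda into SQL is much more involved, so a small hand-written criterion object stands in for the query translator here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Domain object from the example; package-private fields keep the sketch short.
class Person {
    String name;
    int age;
    String jobTitle;

    Person(String name, int age, String jobTitle) {
        this.name = name;
        this.age = age;
        this.jobTitle = jobTitle;
    }
}

// DataMapper for the single-table layout: one Person <-> one PersonsTable row.
// A Map stands in for a database row; column names are assumed.
class PersonMapper {
    static final String TABLE = "PersonsTable";

    Map<String, Object> toRow(Person person) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", person.name);
        row.put("age", person.age);
        row.put("jobTitle", person.jobTitle);
        return row;
    }

    Person fromRow(Map<String, Object> row) {
        return new Person(
                (String) row.get("name"),
                (Integer) row.get("age"),
                (String) row.get("jobTitle"));
    }
}

// Hypothetical stand-in for the query translator: one fixed kind of criterion
// ("name equals X") rendered as SQL for the mapper's table.
class NameEquals {
    final String name;

    NameEquals(String name) {
        this.name = name;
    }

    // Parameterized SQL: the value is bound separately rather than concatenated.
    String toSql() {
        return "SELECT * FROM " + PersonMapper.TABLE + " WHERE name = ?";
    }

    Object[] parameters() {
        return new Object[] { name };
    }
}
```

Note that the mapper knows nothing about queries and the criterion knows nothing about Person objects; the repository is the piece that would combine the two.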
Another persistence layer can be a file system, or another DB format that chooses to store Person objects in two tables, PersonAge and PersonJobTitle.
In the latter case, the DataMapper is tasked with converting the johnDoe object into 2 rows: one for the PersonAge table and one for the PersonJobTitle table. The query logic then needs to convert the functional query into a join on the two tables. Finally, the DataMapper needs to know how to construct a Person object from the query's result.
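For the two-table layout, a mapper could be sketched in Java roughly as below. The SplitPersonMapper class, the choice of name as the shared key column, the column names, and the join statement are all assumptions; the answer does not say how the two tables are linked.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Domain object from the example; package-private fields keep the sketch short.
class Person {
    String name;
    int age;
    String jobTitle;

    Person(String name, int age, String jobTitle) {
        this.name = name;
        this.age = age;
        this.jobTitle = jobTitle;
    }
}

// DataMapper for the split layout: one Person <-> one PersonAge row plus one
// PersonJobTitle row. Maps stand in for database rows.
class SplitPersonMapper {
    // The join a "find by name" query would translate to in this layout
    // (assumes both tables are keyed by name).
    static final String SELECT_BY_NAME =
            "SELECT a.name, a.age, j.jobTitle FROM PersonAge a "
            + "JOIN PersonJobTitle j ON j.name = a.name WHERE a.name = ?";

    Map<String, Object> toAgeRow(Person person) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", person.name);
        row.put("age", person.age);
        return row;
    }

    Map<String, Object> toJobTitleRow(Person person) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", person.name);
        row.put("jobTitle", person.jobTitle);
        return row;
    }

    // Rebuilds a Person from one row of the join's result.
    Person fromJoinedRow(Map<String, Object> joined) {
        return new Person(
                (String) joined.get("name"),
                (Integer) joined.get("age"),
                (String) joined.get("jobTitle"));
    }
}
```

The Person class is identical to what a single-table design would use; only the mapper and the generated query change, so business code written against the repository is unaffected.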
In large, complex systems, you want to use small components that do small, clearly defined things, that can be developed and tested independently:
- The business logic deals with a Repository when it wants to read or persist objects, and doesn't care how that is implemented.
- The Repository deals with a DataMapper when it wants to read/write an object in a particular persistence medium.
- For querying, the Repository relies on a schema provided by the DataMapper (e.g. the jobTitle value is found in the JobTitle column in the PersonTable table) but not on any implementation of a mapper.
- For DB persistence, the DataMapper relies on a DB layer that shields it from the Oracle/Sybase/MSSQL/OtherProvider details.
The patterns don't "differ", they just expose different basic features.
I realize that this answer is kind of late, but it may help someone in the future that stumbles upon this same question and finds that the available answer(s) do not quite answer the question (which I felt when I first came across this question).
After having read PoEAA (Martin Fowler), I too was having trouble identifying the difference between a data mapper and a repository.
This is what I've found that the 2 concepts ultimately boil down to:
Repositories are a generic concept and don't necessarily have to store anything in a database. Their main function is to provide collection-like (query-enabled) access to domain objects (whether those objects come from a database is beside the point). Repositories may (and often will) contain DataMappers themselves.
DataMappers serve as the middle layer between domain objects and a database, allowing the two to evolve independently without either depending on the other. DataMappers might have "find" or query functionality, but that is not really their main function. The more elaborate query logic you find yourself putting in your DataMappers, the more you should think about moving that query logic into a repository, leaving your DataMappers to serve their main function: mapping domain objects to the database and vice versa.
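One way to picture that split is the Java sketch below, in which the repository owns the query logic and delegates all conversion to a mapper it contains. The class names (PersonMapper, MappedPersonRepository), the findByJobTitle method, and the list of maps standing in for a database table are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Domain object; package-private fields keep the sketch short.
class Person {
    String name;
    int age;
    String jobTitle;

    Person(String name, int age, String jobTitle) {
        this.name = name;
        this.age = age;
        this.jobTitle = jobTitle;
    }
}

// The mapper only maps: no "find" methods, no query logic.
class PersonMapper {
    Map<String, Object> toRow(Person person) {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", person.name);
        row.put("age", person.age);
        row.put("jobTitle", person.jobTitle);
        return row;
    }

    Person fromRow(Map<String, Object> row) {
        return new Person(
                (String) row.get("name"),
                (Integer) row.get("age"),
                (String) row.get("jobTitle"));
    }
}

// The repository contains a mapper and concentrates the query logic.
class MappedPersonRepository {
    private final PersonMapper mapper;
    private final List<Map<String, Object>> table; // stands in for a database table

    MappedPersonRepository(PersonMapper mapper, List<Map<String, Object>> table) {
        this.mapper = mapper;
        this.table = table;
    }

    void add(Person person) {
        table.add(mapper.toRow(person));
    }

    // Query logic lives here; the mapper is used only to build the results.
    List<Person> findByJobTitle(String jobTitle) {
        List<Person> result = new ArrayList<>();
        for (Map<String, Object> row : table) {
            if (jobTitle.equals(row.get("jobTitle"))) {
                result.add(mapper.fromRow(row));
            }
        }
        return result;
    }
}
```

Adding further queries means adding methods to the repository, while the mapper stays unchanged unless the storage layout itself changes.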