Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

repositories and querying with raw sql?

I am struggling to understand how to best query a repository.

The three factors that are throwing me through a loop right now are:

  1. Return type of data
  2. Columns to run query on
  3. Number of records to return

Point 1

In regards to question one:

I have Repositories with lot of methods that return a combination of both Entities and scalar values. This seems to lead to "method explosion". Should I always return an Entity object? How should I query for objects where I only need one column?

Point 2 When running a query should I include every column in the table even if I only need one, or two columns? If I create specific queries for this it leads to more methods in the Repository

Point 3 How should I provide conditions for the query? I read about Specifications, but my understanding is that you loop through the returned records and filter out the ones that pass into a new collection. This doesn't seem like a good idea performance wise. Right Now I just make a new method in the Repo like getNameById() which encapsulates the condition.

Please not that I am not using an ORM, I just have raw sql in my Repositories.

Update

Point 1: Based on the answers and a bit more research would this be a good implementation?

Right now I have a large repository that return a mix of scalar and entity type objects (all same entity). I'm thinking I could reduce this greatly if I just use a GetUser(userId) method and forget writing methods that just return single column values.

For example if I need to return a user name I could call the GetUser(userId) method that hydrates the User object and then in the service layer just filter it down to the username.

Another way would be to use some sort of QueryBuilder class I could pass into the Repository which could be parsed to generate the proper sql.

Point 2

Looking back this is pretty similar to point one and my current solution would be to just grab all table fields. It's a tradeoff between performance and maintainability.

Point 3

I would need to provide some sort of where clause. I'm not sure if this make sense doing via Specification or just a sql string. My current solution is to make new methods for these types, but I would like something more generic for the Repository

Overall, still researching into this... I'd love to hear more input into this or links to books or references that kind of tie this all together.

like image 513
chobo Avatar asked Feb 27 '12 20:02

chobo


People also ask

What is a raw query in SQL?

Raw SQL, sometimes also called native SQL, is the most basic, most low-level form of database interaction. You tell the database what to do in the language of the database. Most developers should know basics of SQL. This means how to CREATE tables and views, how to SELECT and JOIN data, how to UPDATE and DELETE data.

Should I use raw SQL or ORM?

ORMs provide a code-first API to express queries, whereas raw SQL queries are often expressed as plain strings. These query strings are often more concise than the equivalent ORM operation and allow queries to be represented in a language-agnostic way.

Is raw SQL faster than ORM?

Methods. To measure the performance of both techniques, we developed a blog application and we ran database operations select, insert and update in both techniques. Conclusions. Results indicated that overall Raw SQL performed better than Eloquent ORM in our database operations.

Should I use raw SQL in Django?

Django uses the primary key to identify model instances, so it must always be included in a raw query.


2 Answers

I have Repositories with lot of methods that return a combination of both Entities and scalar values. This seems to lead to "method explosion". Should I always return an Entity object? How should I query for objects where I only need one column?

You can fight repository method explosion similar to how you would fight other SRP violations. You can create another repository for the same entity. See this answer to a similar question.

When running a query should I include every column in the table even if I only need one, or two columns? If I create specific queries for this it leads to more methods in the Repository

This is not a DDD question. Domain driven design does not deal with 'rows and columns'. There is always some redundancy in how much data you load to 'hydrate' the domain object, but you have to measure whether this really affects your performance. If this is really a performance bottleneck than it maybe a symptom of incorrect domain model.

How should I provide conditions for the query? I read about Specifications, but my understanding is that you loop through the returned records and filter out the ones that pass into a new collection. This doesn't seem like a good idea performance wise. Right Now I just make a new method in the Repo like getNameById() which encapsulates the condition.

This again is a data access issue. Nothing in DDD says that your repository can not convert Specification to a SQL query. It is up to you whether you do this or iterate over records in memory (as long as repository consumer only sees Specification and Repository and stays unaware of the actual implementation).

Regarding 'Raw SQL vs. ORM in DDD' you may find this answer interesting.

like image 120
Dmitry Avatar answered Oct 04 '22 03:10

Dmitry


I agree with everything Dmitry says, but perhaps think you should have a read of CQRS.

I used to ask similar questions when getting started with DDD (regarding 'method explosion', not your SQL issues), and this lead me to CQRS. Personally, I don't really see how DDD is practical without it, and it answers a lot of these sorts of questions when it comes to querying data. Using it's principles what I'd suggest is:

  1. Only use domain repositories when committing a transaction. That is, you don't use repositories to display data in the UI. You only fetch aggregates from your repository when you want to perform an operation against them.
  2. Your repositories only return aggregates, not individual entities separately. This makes sense now as we are only using repositories in a transactional sense, and entities can only be mutated via atomic operations and persisted by the aggregate as a whole.
  3. You create separate repositories (or 'query services') which provide tailor made queries and data types for whatever data you need. These can return dumb DTOs with no logic.

This keeps your proper domain & repositories clean, whilst providing the means to create a thin data access layer that provides high performing queries.

Regarding the specification pattern: rather than converting it to a SQL query in code, you could provide public properties on the specification that represent the criteria. These values could then be added in the where clause of your SQL or sent as parameters to a SPROC.

like image 35
David Masters Avatar answered Oct 04 '22 03:10

David Masters