Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Should GraphQL DataLoader wrap request to database or wrap requests to service methods?

I have very common GraphQL schema like this (pseudocode):

Post {
  commentsPage(skip: Int, limit: Int) {
    total: Int
    items: [Comment]
  }
}

So to avoid n+1 problem when requesting multiple Post objects I decided to use Facebook's Dataloader.

Since I'm working on Nest.JS 3-tier layered application (Resolver-Service-Repository), I have question:

should I wrap my repository methods with DataLoader or should I wrap my service methods with Dataloder?

Below is example of my service method that returns Comments page (i.e. this method called from commentsPage property resolver). Inside service method I'm using 2 repository methods (#count and #find):

@Injectable()
export class CommentsService {
    constructor(
        private readonly repository: CommentsRepository,
    ) {}

    async getCommentsPage(postId, dataStart, dateEnd, skip, limit): PaginatedComments {
        const counts = await this.repository.getCount(postId, dateStart, dateEnd);
        const itemsDocs = await this.repository.find(postId, dateStart, dateEnd, skip, limit);
        const items = this.mapDbResultToGraphQlType(itemsDocs);
        return new PaginatedComments(total, items)
    }
}

So should I create individual instances of Dataloader for each of repository method (#count, #find etc) or should I just wrap my entire service method with Dataloader (so my commentsPage property resolver will just work with Dataloader not with service)?

like image 383
WelcomeTo Avatar asked Jul 25 '19 13:07

WelcomeTo


People also ask

What is N 1 problem in GraphQL?

The n+1 problem means that the GraphQL server executes many unnecessary round trips to fetch nested data. If you check the service log then you will see that rating Data Fetcher is called sequentially for each and every call to book.

How does GraphQL store data?

GraphQL is a query language for your API, and a server-side runtime for executing queries by using a type system you define for your data. GraphQL isn't tied to any specific database or storage engine and is instead backed by your existing code and data. You can connect any database using GraphQL.

Is GraphQL a database?

Is GraphQL a Database Technology? No. GraphQL is often confused with being a database technology. This is a misconception, GraphQL is a query language for APIs - not databases.


1 Answers

Disclaimer: I am not an expert in Nest.js but I have written a good bunch of dataloaders as well as worked with automatically generated dataloaders. I hope I can give a bit of insight nonetheless.

What is the actual problem?

While your question seems to be a relatively simple either or question it is probably much more difficult than that. I think the actual problem is the following: Whether to use the dataloader pattern or not for a specific field needs to be decided on a per field basis. The repository+service pattern on the other hand tries to abstract away this decision by exposing abstract and powerful ways of data access. One way out would be to simply "dataloaderify" every method of your service. Unfortunately in practice this is not really feasable. Let's explore why!

Dataloader is made for key-value-lookups

Dataloader provides a promise cache to reduce dublicated calls to the database. For this cache to work all requests need to be simple key value lookups (e.g. userByIdLoader, postsByUserIdLoader). This quickly becomes not sufficient enough, like in one of your example your request to the repository has a lot of parameters:

this.repository.find(postId, dateStart, dateEnd, skip, limit);

Sure technically you could make { postId, dateStart, dateEnd, skip, limit } your key and then somehow hash the content to generate a unique key.

Writing Dataloader queries is an order of magnitude harder than normal queries

When you implement a dataloader query it now suddenly has to work for a list of the inputs the initial query needed. Here a simple SQL example:

SELECT * FROM user WHERE id = ?
-- Dataloaded
SELECT * FROM user WHERE id IN ?

Okay now the repository example from above:

SELECT * FROM comment WHERE post_id = ? AND date < ? AND date > ? OFFSET ? LIMIT ?
-- Dataloaded
???

I have sometimes written queries that work for two parameters and they already become very difficult problems. This is why most dataloaders are simply load by id lookups. This tread on twitter discusses how a GraphQL API should only expose what can be efficiently queried. If you create service methods with strong filter methods you have the same problem even if your GraphQL API does not expose these filters.

Okay so what is the solution?

The first thing to my understanding that Facebook does is match fields and service methods very closely. You could do the same. This way you can make a decision in the service method if you want to use a dataloader or not. For example I don't use dataloaders in root queries (e.g. { getPosts(filter: { createdBefore: "...", user: 234 }) { .. }) but in subfields of types that appear in lists { getAllPosts { comments { ... } }. The root query is not going to be executed in a loop and is therefore not exposed to the n+1 problem.

Your repository now exposes what can be "efficiently queried" (as in Lee's tweet) like foreign/primary key lookups or filtered find all queries. The service can then wrap for example the key lookups in a dataloader. Often I end up filtering small lists in my business logic. I think this is perfectly fine for small apps but might be problematic when you scale. The GraphQL Relay helpers for JavaScript do something similar when you use the connectionFromArray function. The pagination is not done on the database level and this is probably okay for 90% of connections.

Some sources to consider

  • GraphQL before GraphQL - Dan Schafer
  • Dataloader source code walkthrough - Lee Byron
  • There is another talk from this years GraphQL conf that discusses the data access at FB but I don't think it is uploaded yet. I might come back when it has been published.
like image 137
Herku Avatar answered Oct 20 '22 04:10

Herku