To perform a join-like operation, we can use either GraphQL or Mongoose.
Before asking my question, here is an example using tasks and activities (none of this code is tested; it is for illustration only):
Task {
  _id,
  title,
  description,
  activities: [{ // of Activity type
    _id,
    title
  }]
}
In Mongoose, we can retrieve the activities related to a task with the populate method, with something like this:
const task = await TaskModel.findById(taskId).populate('activities');
Using GraphQL and DataLoader, we can have the same result with something like:
const DataLoader = require('dataloader');

// A DataLoader batch function receives an array of keys and must return
// one result per key, in the same order.
const getActivitiesByTask = async (taskIds) => {
  const activities = await ActivityModel.find({ task: { $in: taskIds } });
  return taskIds.map((id) =>
    activities.filter((activity) => String(activity.task) === String(id))
  );
};

const dataloaders = () => ({
  activitiesByTask: new DataLoader(getActivitiesByTask),
});
// ...
// SET The dataloader in the context
// ...
//------------------------------------------
// In another file
const resolvers = {
  Query: {
    Task: (_, { id }) => TaskModel.findById(id),
  },
  Task: {
    activities: (task, _, context) => context.dataloaders.activitiesByTask.load(task._id),
  },
};
I looked for an article comparing the two approaches in terms of performance, resource usage, and so on, but could not find any comparison of the two methods.
Any insight would be helpful, thanks!
MongoDB has the join-like $lookup aggregation operator. Mongoose has a more powerful alternative called populate(), which lets you reference documents in other collections. Population is the process of automatically replacing the specified paths in the document with document(s) from other collection(s).
Mongoose's populate() method does not use MongoDB's $lookup behind the scenes. It simply makes another query to the database. Mongoose does not have functionalities that MongoDB does not have.
What are GraphQL DataLoaders? A dataloader is a generic utility used as part of your application's data fetching layer to provide a simplified and consistent API over various remote data sources, such as databases or web services, via batching and caching.
It's important to note that dataloaders are not just an interface for your data models. While dataloaders are touted as a "simplified and consistent API over various remote data sources" -- their main benefit when coupled with GraphQL comes from being able to implement caching and batching within the context of a single request. This sort of functionality is important in APIs that deal with potentially redundant data (think about querying users and each user's friends -- there's a huge chance of refetching the same user multiple times).
On the other hand, mongoose's populate method is really just a way of aggregating multiple MongoDB requests. In that sense, comparing the two is like comparing apples and oranges.
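To make the per-request batching and caching concrete, here is a minimal sketch for Node.js. It does not use the dataloader package: TinyLoader, the users map, and demo are hypothetical names written only for this illustration, and the real library handles many more cases.

```javascript
// Hypothetical in-memory stand-in for a users collection.
const users = new Map([
  [1, { id: 1, name: 'Ann' }],
  [2, { id: 2, name: 'Bob' }],
]);

// TinyLoader is a simplified illustration, not the dataloader package itself.
class TinyLoader {
  constructor(batchFn) {
    this.batchFn = batchFn;
    this.cache = new Map(); // key -> Promise of the loaded value
    this.queue = [];        // keys waiting for the next batch
  }

  load(key) {
    // Caching: a key requested twice shares the same promise.
    if (this.cache.has(key)) return this.cache.get(key);
    const promise = new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // Batching: schedule a single flush for all keys queued in this tick.
      if (this.queue.length === 1) process.nextTick(() => this.flush());
    });
    this.cache.set(key, promise);
    return promise;
  }

  async flush() {
    const batch = this.queue;
    this.queue = [];
    try {
      const values = await this.batchFn(batch.map((item) => item.key));
      batch.forEach((item, i) => item.resolve(values[i]));
    } catch (err) {
      batch.forEach((item) => item.reject(err));
    }
  }
}

// One loader per request, so the cache does not leak between requests.
async function demo() {
  let roundTrips = 0; // counts simulated database round trips
  const getUsersByIds = async (ids) => {
    roundTrips += 1;
    // One result per id, in the same order as the ids.
    return ids.map((id) => users.get(id) || null);
  };
  const loader = new TinyLoader(getUsersByIds);
  const [a, b, again] = await Promise.all([
    loader.load(1),
    loader.load(2),
    loader.load(1), // served from the cache, not added to the batch
  ]);
  return { a, b, again, roundTrips };
}
```

Calling demo() resolves with roundTrips equal to 1: the three load calls collapse into a single batched fetch, and the repeated key never reaches the batch function.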
A fairer comparison might be using populate as illustrated in your question as opposed to adding a resolver for activities along the lines of:
activities: (task, _, context) => Activity.find().where('_id').in(task.activities)
Either way, the question comes down to whether you load all the data in the parent resolver, or let the resolvers further down do some of the work. Because resolvers are only called for fields that are included in the request, the two approaches can differ significantly in performance.
If the activities field is requested, both approaches will make the same number of roundtrips between the server and the database -- the difference in performance will probably be marginal. However, your request might not include the activities field at all. In that case, the activities resolver will never be called and we can save one or more database requests by creating a separate activities resolver and doing the work there.
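The difference in round trips can be illustrated with a small simulation. This is only a sketch: the in-memory arrays stand in for MongoDB collections, runLazy is a very rough imitation of how GraphQL skips resolvers for fields that were not requested, and all names here (makeDb, eagerTask, lazyResolvers, runLazy, countQueries) are hypothetical.

```javascript
// Hypothetical in-memory data standing in for two MongoDB collections.
const tasks = [{ _id: 't1', title: 'Write docs', activities: ['a1', 'a2'] }];
const activities = [
  { _id: 'a1', title: 'Draft' },
  { _id: 'a2', title: 'Review' },
];

// Each helper call represents one round trip to the database.
function makeDb() {
  const db = {
    queries: 0,
    findTask(id) {
      db.queries += 1;
      return tasks.find((t) => t._id === id);
    },
    findActivities(ids) {
      db.queries += 1;
      return activities.filter((a) => ids.includes(a._id));
    },
  };
  return db;
}

// Approach 1: the parent resolver loads everything up front (populate-style).
function eagerTask(db, id) {
  const task = db.findTask(id);
  return { ...task, activities: db.findActivities(task.activities) };
}

// Approach 2: the parent loads only the task; a field resolver loads activities.
const lazyResolvers = {
  task: (db, id) => db.findTask(id),
  activities: (db, task) => db.findActivities(task.activities),
};

// A field resolver runs only when its field appears in the requested selection.
function runLazy(db, id, requestedFields) {
  const task = lazyResolvers.task(db, id);
  const result = { title: task.title };
  if (requestedFields.includes('activities')) {
    result.activities = lazyResolvers.activities(db, task);
  }
  return result;
}

// Runs one simulated request against a fresh counter and reports the query count.
function countQueries(run) {
  const db = makeDb();
  run(db);
  return db.queries;
}

const counts = {
  eagerWithoutActivities: countQueries((db) => eagerTask(db, 't1')),
  lazyWithoutActivities: countQueries((db) => runLazy(db, 't1', ['title'])),
  lazyWithActivities: countQueries((db) => runLazy(db, 't1', ['title', 'activities'])),
};
console.log(counts);
// { eagerWithoutActivities: 2, lazyWithoutActivities: 1, lazyWithActivities: 2 }
```

The eager version always pays for the second query, while the separate resolver pays for it only when activities is part of the request.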
On a related note...
From what I understand, aggregating queries in MongoDB using something like $lookup is generally less performant than just using populate (some conversation on that point can be found here). In the context of relational databases, however, there are additional considerations when weighing the above approaches. That's because your initial fetch in the parent resolver could be done using joins, which will generally be much faster than making separate db requests. That means at the expense of making the no-activities-field queries slower, you can make the other queries significantly faster.