I'm attempting to use graphql to tie together a number of rest endpoints, and I'm stuck on how to filter, sort and page the resulting data. Specifically, I need to filter and/or sort by nested values.
I cannot do the filtering on the rest endpoints in all cases because they are separate microservices with separate databases. (i.e. I could filter on title
in the rest endpoint for articles, but not on author.name). Likewise with sorting. And without filtering and sorting, pagination cannot be done on the rest endpoints either.
To illustrate the problem, and as an attempt at a solution, I've come up with the following using formatResponse
in apollo-server, but am wondering if there is a better way.
I've boiled down the solution to the most minimal set of files that i could think of:
data.js represents what would be returned by 2 fictional rest endpoints:
export const Authors = [{ id: 1, name: 'Sam' }, { id: 2, name: 'Pat' }];
export const Articles = [
{ id: 1, title: 'Aardvarks', author: 1 },
{ id: 2, title: 'Emus', author: 2 },
{ id: 3, title: 'Tapir', author: 1 },
]
the schema is defined as:
import _ from 'lodash';
import {
GraphQLSchema,
GraphQLObjectType,
GraphQLList,
GraphQLString,
GraphQLInt,
} from 'graphql';
import {
Articles,
Authors,
} from './data';
const AuthorType = new GraphQLObjectType({
name: 'Author',
fields: {
id: {
type: GraphQLInt,
},
name: {
type: GraphQLString,
}
}
});
const ArticleType = new GraphQLObjectType({
name: 'Article',
fields: {
id: {
type: GraphQLInt,
},
title: {
type: GraphQLString,
},
author: {
type: AuthorType,
resolve(article) {
return _.find(Authors, { id: article.author })
},
}
}
});
const RootType = new GraphQLObjectType({
name: 'Root',
fields: {
articles: {
type: new GraphQLList(ArticleType),
resolve() {
return Articles;
},
}
}
});
export default new GraphQLSchema({
query: RootType,
});
And the main index.js is:
import express from 'express';
import { apolloExpress, graphiqlExpress } from 'apollo-server';
var bodyParser = require('body-parser');
import _ from 'lodash';
import rql from 'rql/query';
import rqlJS from 'rql/js-array';
import schema from './schema';
const PORT = 8888;
var app = express();
function formatResponse(response, { variables }) {
let data = response.data.articles;
// Filter
if ({}.hasOwnProperty.call(variables, 'q')) {
// As an example, use a resource query lib like https://github.com/persvr/rql to do easy filtering
// in production this would have to be tightened up alot
data = rqlJS.query(rql.Query(variables.q), {}, data);
}
// Sort
if ({}.hasOwnProperty.call(variables, 'sort')) {
const sortKey = _.trimStart(variables.sort, '-');
data = _.sortBy(data, (element) => _.at(element, sortKey));
if (variables.sort.charAt(0) === '-') _.reverse(data);
}
// Pagination
if ({}.hasOwnProperty.call(variables, 'offset') && variables.offset > 0) {
data = _.slice(data, variables.offset);
}
if ({}.hasOwnProperty.call(variables, 'limit') && variables.limit > 0) {
data = _.slice(data, 0, variables.limit);
}
return _.assign({}, response, { data: { articles: data }});
}
app.use('/graphql', bodyParser.json(), apolloExpress((req) => {
return {
schema,
formatResponse,
};
}));
app.use('/graphiql', graphiqlExpress({
endpointURL: '/graphql',
}));
app.listen(
PORT,
() => console.log(`GraphQL Server running at http://localhost:${PORT}`)
);
For ease of reference, these files are available at this gist.
With this setup, I can send this query:
{
articles {
id
title
author {
id
name
}
}
}
Along with these variables (It seems like this is not the intended use for the variables, but it was the only way I could get the post processing parameters into the formatResponse function.):
{ "q": "author/name=Sam", "sort": "-id", "offset": 1, "limit": 1 }
and get this response, filtered to where Sam is the author, sorted by id descending, and getting getting the second page where the page size is 1.
{
"data": {
"articles": [
{
"id": 1,
"title": "Aardvarks",
"author": {
"id": 1,
"name": "Sam"
}
}
]
}
}
Or these variables:
{ "sort": "-author.name", "offset": 1 }
For this response, sorted by author name descending and getting all articles except the first.
{
"data": {
"articles": [
{
"id": 1,
"title": "Aardvarks",
"author": {
"id": 1,
"name": "Sam"
}
},
{
"id": 2,
"title": "Emus",
"author": {
"id": 2,
"name": "Pat"
}
}
]
}
}
So, as you can see, I am using the formatResponse function for post processing to do the filtering/paging/sorting. .
So, my questions are:
To resolve this issue, GraphQL servers can paginate their list fields. This diagram shows offset-based pagination, in which a client requests a page based on an absolute index in the list ( offset ) and the maximum number of elements to return ( limit ).
Filtering is used in GraphQL root fields of node types (e.g. for file type it would be file and allFile ). The GraphQL filter argument is passed to the filtering system and will return all the nodes that match each of the given filters.
You can sort GraphQL query results using an index and a user-defined function. The following example uses the following components: A GraphQL schema named schema.
Results from your query can be sorted by using the order_by argument. The argument can be used to sort nested objects too. The sort order (ascending vs. descending) is set by specifying the asc or desc enum value for the column name in the order_by input object, e.g. {name: desc} .
Is this a valid use case? Is there a more canonical way to do filtering on deeply nested properties, along with sorting and paging?
Major part of original questing lies on segregating collections on different databases on separate microservices. In fact, it's nessasary to perform collection joining and subsequent filtering on some key, but it's directly impossible since there is no field in original collection to filter, sort or paginate.
Strightforward solution is perform full or filtered queries to original collections, and then perform joining and filtering result dataset on application server, e.g. by lodash, such at your solution. In is possible for small collections, but in general case causes large data transfer and unefficent sorting since there is no index structure - real RB-tree or SkipList, so with quadratic complexity it's not very good.
Dependent on resource volume on application server, special cache and index tables can be build there. If collection structure is fixed, some relations between collection entries and their fields can be reflected in special search table and update respectively on demain. It's like find & search index creation, but not it database, but on application server. Of cource, it will consume resources, but will be more fast than direct lodash-like sorting.
Also task can be solved from another side, if there is access to structure of original databases. Key is denormalization. In counter for classical relation approach, collections can have dublicate information for avioding further join operation. E.g., Articles collection can have some information from Authors collection, which is nessasary to perform filtering, sorting and pagination in further operations.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With