You don’t need GraphQL to query less data
Introduction
Imagine you’ve been working on an application for a long time. You’ve created many REST endpoints that work fine, but you begin to notice that your requests are slowing down. You suspect it’s because you’re retrieving too much data from the server. You’ve heard about GraphQL and how it lets clients request less data, but migrating the whole application to GraphQL is not a good idea right now. So you decide to remove redundant fields from the responses, or to create different endpoints for different client views. It works, but it takes extra time, and it invites bugs, since it is easy to forget to add a field to one of the endpoints. But is it really the only way? Let’s find out.
The problem
Let’s imagine that we have a simple REST endpoint which returns a list of users:
|
|
And we have a client application which uses this endpoint to display a list of users. The problem is that we don’t need all the fields from the response, we need only `id`, `name`, and `email`. So we have two options:
- Remove redundant fields from the response.
- Create a new endpoint which will return only the fields we need.
The first way is the easiest one, but there is one big disadvantage:
- At some point, another part of the application may need the `age` field as well as other fields.
The second way is a little harder and more flexible, but it also has disadvantages:
- You need to create a new endpoint for every view of the client application.
- Once you add a new field to the user you need to add that to all endpoints where this field should be used.
- Usually the team who works on the client application is different from the team who works on the server, so you need to communicate with them to add new fields to the endpoints.
Is there another way to resolve this issue without these disadvantages? Certainly, let’s look into it.
The solution
The solution is to use a query parameter which allows us to specify which fields we want to get from the server. Let’s call it `fields`, so now the request will look like:
- For one view:
|
|
- For another view:
|
|
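As a rough illustration of the idea, here is a small Python sketch of how a client could build such a request and how a server could read the parameter back. The `/users` path and the helper names `build_url` and `read_fields` are assumptions made for this example, and the reader only handles a flat list of field names.

```python
from typing import List, Optional
from urllib.parse import parse_qs, urlencode, urlsplit


def build_url(base: str, fields: Optional[List[str]] = None) -> str:
    """Client side: append a comma-separated `fields` parameter, if any."""
    if not fields:
        # No selection requested, so the server returns every field.
        return base
    # Keep commas unescaped so the URL stays readable.
    query = urlencode({"fields": ",".join(fields)}, safe=",")
    return f"{base}?{query}"


def read_fields(url: str) -> Optional[List[str]]:
    """Server side: return the requested field names, or None if absent."""
    values = parse_qs(urlsplit(url).query).get("fields")
    if not values:
        return None
    return [name for name in values[0].split(",") if name]


# Hypothetical endpoint path, used only for illustration.
print(build_url("/users", ["id", "name", "email"]))  # /users?fields=id,name,email
print(read_fields("/users?fields=id,name"))          # ['id', 'name']
```

Each client view asks for exactly what it displays, and the server keeps a single endpoint.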
Let’s define some rules and a convention for that.
The rules
Here is the list of rules which, I believe, should be used for this approach:
- If the `fields` parameter is not specified, return all fields.
- If the `fields` parameter is specified, return only the fields which are specified in the parameter.
- If the `fields` parameter is specified and a field is not found, just ignore it (do not throw an error and do not return a value for it).
- If the `fields` parameter consists only of fields which are not found, return an empty object.
- If the `fields` parameter includes nested fields which are not found, but the parent field is found, return the parent field with an empty object as its value.
- If the `fields` parameter includes nested fields but the actual value is not an object or array, do not return anything for that field (see example #5).
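To make the rules concrete, here is a minimal Python sketch of how they could be applied to an already loaded response object. It is an illustration, not the implementation of any particular library. The mask format (a dict mapping each requested field to `None` or to a nested mask), the `apply_mask` name, the sample `user` object, and the choice to apply a mask to every element of an array are all assumptions.

```python
from typing import Any, Dict, Optional

# Internal marker meaning "this value must not appear in the output".
_MISSING = object()


def apply_mask(value: Any, mask: Optional[Dict[str, Any]]) -> Any:
    """Return a copy of `value` limited to the fields named in `mask`."""
    if mask is None:
        # Rule 1: nothing was requested, so everything is returned.
        return value
    if isinstance(value, list):
        # Assumption: a mask on an array is applied to each element.
        filtered = (apply_mask(item, mask) for item in value)
        return [item for item in filtered if item is not _MISSING]
    if not isinstance(value, dict):
        # Rule 6: sub-fields were requested from a primitive, so drop it.
        return _MISSING
    result = {}
    for name, sub_mask in mask.items():
        if name not in value:
            # Rule 3: unknown fields are silently ignored.
            continue
        picked = apply_mask(value[name], sub_mask)
        if picked is not _MISSING:
            # Rule 2: only requested fields end up in the result.
            result[name] = picked
    # Rules 4 and 5: when nothing matched, this is an empty object.
    return result


# Hypothetical sample data, used only for illustration.
user = {
    "id": 1,
    "name": "Ann",
    "email": "ann@example.com",
    "age": 30,
    "address": {"city": "Oslo", "zip": "0150"},
    "orders": [{"id": 10, "total": 5}, {"id": 11, "total": 7}],
}

print(apply_mask(user, {"id": None, "address": {"city": None}}))
# {'id': 1, 'address': {'city': 'Oslo'}}
print(apply_mask(user, {"address": {"country": None}}))
# {'address': {}}
print(apply_mask(user, {"name": {"first": None}}))
# {}
```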
The convention
The last thing we need to think about is the convention for the `fields` parameter. I think the best way is to make the `fields` parameter a string of comma-separated field names.
Before we start, let’s define the object we will use in the examples:
|
|
Nested fields are written the same way, but inside parentheses, for example:
|
|
The same logic applies to arrays, for example:
|
|
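One possible way to read this convention on the server is a small hand-written parser. The sketch below turns a `fields` string into a nested dict, where `None` means "take the whole value". The function name `parse_fields`, the output format, and the field names in the usage lines are assumptions for illustration, and the parser covers only commas and parentheses.

```python
from typing import Any, Dict, Optional


def parse_fields(expr: str) -> Dict[str, Any]:
    """Parse 'a,b(c,d(e))' into {'a': None, 'b': {'c': None, 'd': {'e': None}}}."""
    root: Dict[str, Any] = {}
    stack = [root]  # masks of the parenthesised groups that are still open
    token = ""      # characters of the field name currently being read

    def flush() -> Optional[str]:
        """Store the pending field name in the innermost open group."""
        nonlocal token
        name, token = token.strip(), ""
        if name:
            stack[-1][name] = None
        return name or None

    for ch in expr:
        if ch == ",":
            flush()
        elif ch == "(":
            parent = flush()
            if parent is None:
                raise ValueError("'(' must follow a field name")
            # The parent field now selects sub-fields instead of its whole value.
            child: Dict[str, Any] = {}
            stack[-1][parent] = child
            stack.append(child)
        elif ch == ")":
            flush()
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            stack.pop()
        else:
            token += ch
    flush()
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return root


# Hypothetical field names, used only for illustration.
print(parse_fields("id,name,email"))
# {'id': None, 'name': None, 'email': None}
print(parse_fields("id,address(city,geo(lat)),orders(id)"))
# {'id': None, 'address': {'city': None, 'geo': {'lat': None}}, 'orders': {'id': None}}
```

Because arrays use the same syntax as objects, the parser does not need to know which fields hold arrays. That decision can be left to the code that applies the mask to the data.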
Real examples
Let’s define the object we will use in the examples:
|
|
Example 1
Pick root fields:
|
|
Response:
|
|
Example 2
Pick nested fields:
|
|
Response:
|
|
Example 3
Pick primitive fields from the array:
|
|
Response:
|
|
Example 4
Pick nested fields from the array:
|
|
Response:
|
|
Example 5
Pick nested fields from the primitive field:
|
|
Response:
|
|
Extending the pattern
The rules described above cover most cases, but sometimes you may need to extend the pattern. Here is a list of possible extensions:
- Regular expressions. For example, you want to get all fields which start with `name`. It’s not the best practice, since matching a regular expression against every field can be slow, but it can be useful in some cases.
- Getting only the first (or last) N elements from an array. For example, you want to get only the first 5 elements. It can be useful when you have a huge array and want to reduce the response size.
- Getting a defined list of fields from all objects inside some field (something like `instruments(*(id, name))`).
- Filtering by some condition (like `order(*[ > 5])`). This is a rare case, as it’s better to add that filter option directly to the API.
Useful Libraries
Node.js
The library json-mask is a good choice for Node.js applications. It fully covers all the rules and conventions described above (except the extended cases).
The library express-partial-response uses the json-mask package under the hood and provides ready-made middleware for Express.js.
Python
The library jsonmask is a good choice for Python applications. It fully covers all the rules and conventions described above (except the extended cases).
The library django-rest-framework-queryfields has similar functionality, but it is designed for Django REST Framework.