Relay-compatible GraphQL pagination with MongoDB

Relay is an amazing framework for building modern applications cleanly and efficiently. Its built-in solution for client-side storage automatically handles many things that are otherwise difficult to implement. Relay is designed to be used with a GraphQL API; however, not all GraphQL APIs are Relay-compatible. To become compatible with Relay, a GraphQL API must implement several interfaces. In this article, I will demonstrate how to satisfy the Connection interface, which represents paginated data. I will be using MongoDB for backend storage, and Node.js with Express as a server.

In this example, I will demonstrate how to create a paginated news list API in GraphQL.

Relay Connections

Connection combines an abstraction of list-like data with pagination support. Because it is cursor-based rather than offset-based, it’s especially good for tasks like infinite scrolling and pull-to-refresh.

The Relay pagination system is based on the concept of a cursor, an opaque pointer to some element of a paginated result. Users can request all items that precede or follow this element by passing the pointer as the connection argument before or after. Users can also limit the result from the beginning or the end of the page with the arguments first and last, which accept integers.

Connections must include at least two fields: edges and pageInfo. Each edge has the fields cursor and node: the cursor for the element and the element’s actual data, respectively. pageInfo specifies whether there is more data after the last (or before the first) element of the resulting page.
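
To make the shape concrete, here is a minimal sketch of the kind of object a connection field resolves to, built from an in-memory array. The helper name toConnection and the use of array indices as cursors are assumptions made purely for illustration; they are not part of Relay or of the code later in this article.

```javascript
// Hypothetical helper: wraps a list of items in the Connection shape.
function toConnection(items, { hasNextPage = false, hasPreviousPage = false } = {}) {
  return {
    edges: items.map((node, index) => ({
      // A real cursor would be opaque; the index is used here only for illustration.
      cursor: String(index),
      node,
    })),
    pageInfo: { hasNextPage, hasPreviousPage },
  };
}

const connection = toConnection([{ id: '1', text: 'Hello' }], { hasNextPage: true });
console.log(JSON.stringify(connection, null, 2));
```

Each edge pairs one element with its own cursor, which is what lets a client resume pagination from any element rather than from a numeric offset.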

The Relay docs describe the Connection interface in detail.

Basic project setup

I will not cover basic setup, which includes creating an npm project, installing Babel for ES2015 support, and setting up linting. The source code for this example is on GitHub.

Install the dependencies after basic setup. We will use Express middleware express-graphql to save us the time of implementing the endpoint ourselves.

npm install --save express graphql express-graphql mongodb

We will create our basic schema in src/Schema.js. We will start with the definition of an Article type for our newsfeed.

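import {
  GraphQLBoolean,
  GraphQLID,
  GraphQLList,
  GraphQLNonNull,
  GraphQLObjectType,
  GraphQLSchema,
  GraphQLString,
} from 'graphql';

const Article = new GraphQLObjectType({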
  name: 'Article',
  fields: () => ({
    id: {
      type: new GraphQLNonNull(GraphQLID),
      resolve(parent) {
        return parent._id.toString();
      },
    },
    text: {
      type: new GraphQLNonNull(GraphQLString),
    },
  }),
});

To convert the MongoDB ObjectID into a normal id, we defined a resolve function for our id. Relay requires node IDs to be globally unique. For simplicity, we will use plain MongoDB ObjectIDs in this tutorial. However, encoding the type name in your IDs may be useful for implementing the Node interface for Relay. One way to do it would be to concatenate the type name with your internal ID and then to encode it with Base64.
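
The Base64 approach mentioned above could look roughly like this. It is a sketch, not code from the example project: the function names toGlobalId and fromGlobalId and the "TypeName:id" layout are assumptions chosen for illustration.

```javascript
// Hypothetical helpers: encode "TypeName:internalId" as Base64 and back.
function toGlobalId(typeName, id) {
  return Buffer.from(`${typeName}:${id}`, 'utf8').toString('base64');
}

function fromGlobalId(globalId) {
  const decoded = Buffer.from(globalId, 'base64').toString('utf8');
  // Split on the first colon only, so internal IDs may contain colons.
  const separatorIndex = decoded.indexOf(':');
  return {
    typeName: decoded.slice(0, separatorIndex),
    id: decoded.slice(separatorIndex + 1),
  };
}

const globalId = toGlobalId('Article', '56a0d0c8e4b0f5a1c2d3e4f5');
console.log(globalId, fromGlobalId(globalId));
```

With IDs like these, a Node interface resolver can read the type name from the ID and decide which collection to query.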

Besides the Article type, we need Edge and Connection types to satisfy the Connection interface. We won’t add actual code to their resolve functions yet. We will also add a generic PageInfo type that any connection can share.

export const PageInfo = new GraphQLObjectType({
  name: 'PageInfo',
  fields: {
    hasNextPage: {
      type: new GraphQLNonNull(GraphQLBoolean),
    },
    hasPreviousPage: {
      type: new GraphQLNonNull(GraphQLBoolean),
    },
  },
});

const ArticleConnection = new GraphQLObjectType({
  name: 'ArticleConnection',
  fields: () => ({
    edges: {
      type: new GraphQLList(ArticleEdge),
      resolve() {
        return [];
      },
    },
    pageInfo: {
      type: new GraphQLNonNull(PageInfo),
    },
  }),
});

const ArticleEdge = new GraphQLObjectType({
  name: 'ArticleEdge',
  fields: () => ({
    cursor: {
      type: GraphQLString,
    },
    node: {
      type: Article,
    },
  }),
});

Finally, we need to add a connection to our schema. Relay doesn’t accept root fields that return Connections, so we need to wrap our field in a Viewer object, which will hold our Connection. Top-level Viewer nodes are common with Relay.

const Viewer = new GraphQLObjectType({
  name: 'Viewer',
  fields: () => ({
    id: {
      type: new GraphQLNonNull(GraphQLID),
    },
    allArticles: {
      type: ArticleConnection,
      resolve() {
        return {};
      },
    },
  }),
});

const Schema = new GraphQLSchema({
  query: new GraphQLObjectType({
    name: 'RootQueryType',
    fields: {
      viewer: {
        type: Viewer,
        resolve() {
          return {
            id: 'VIEWER_ID',
          };
        },
      },
    },
  }),
});

export default Schema;

Now we can import this schema to our main file (src/index.js) and start our express server.

import express from 'express';
import graphqlHTTP from 'express-graphql';
import Schema from './Schema';

const app = express();

app.use('/graphql', graphqlHTTP({
  schema: Schema,
  graphiql: true,
}));

app.listen(3000);

If you navigate to http://localhost:3000/graphql, you should be able to query the GraphiQL interface.

{
  viewer {
    allArticles {
      edges {
        node {
          id
          text
        }
      }
    }
  }
}

As per our resolve function, edges will be returned as an empty list.

Adding MongoDB

Now we can add MongoDB to the fray. In GraphQL, we pass context information (such as database connections) in a context variable. Let’s modify src/index.js to pass a MongoDB connection to the GraphQL context.

import express from 'express';
import graphqlHTTP from 'express-graphql';
import { MongoClient } from 'mongodb';
import Schema from './Schema';

const app = express();
const mongodb = MongoClient.connect(
  'mongodb://localhost:27017/relaypagination'
);

app.use('/graphql', graphqlHTTP(async () => ({
  schema: Schema,
  graphiql: true,
  context: {
    mongodb: await mongodb,
  },
})));

app.listen(3000);

Now mongodb is available as a property of the third argument of each resolve function (the context). We will perform the actual query in the allArticles resolve function and use a generic edge-creating resolve function on the Connection. This ensures that the connection can return edges correctly as long as whatever returns the connection satisfies the expected contract (in this case, an object with a query field).

const ArticleConnection = new GraphQLObjectType({
  name: 'ArticleConnection',
  fields: () => ({
    edges: {
      type: new GraphQLList(ArticleEdge),
      resolve(parent) {
        return parent.query.toArray();
      },
    },
  }),
});

const ArticleEdge = new GraphQLObjectType({
  name: 'ArticleEdge',
  fields: () => ({
    cursor: {
      type: GraphQLString,
    },
    node: {
      type: Article,
      resolve(parent) {
        return parent;
      }
    },
  }),
});

const Viewer = new GraphQLObjectType({
  name: 'Viewer',
  fields: () => ({
    id: {
      type: new GraphQLNonNull(GraphQLID),
    },
    allArticles: {
      type: ArticleConnection,
      resolve(parent, args, { mongodb }) {
        return {
          // find() returns a cursor, which is what provides toArray().
          query: mongodb.collection('Articles').find(),
        };
      },
    },
  }),
});

Let’s insert some data into MongoDB and try the query in GraphiQL again.

$ mongo
> use relaypagination
> for (var i = 0; i < 1000; i++) { db.Articles.insert({text: i.toString()}) }

In the GraphiQL results, you should now be able to see 1000 items from MongoDB.

Adding first and last

Let’s abstract out our code a bit. In a real application, there can be many fields that require a certain type of Connection. In order to avoid repetition, it’s possible to abstract out the code in their resolve functions. Usually, this code lives in the Model application layer. Let’s create our Article model layer in src/Article.js.

export function getArticles(mongodb) {
  return mongodb.collection('Articles').find();
}

Now we can use this function in the resolve method of our field.

import { getArticles } from './Article';

...

const Viewer = new GraphQLObjectType({
  name: 'Viewer',
  fields: () => ({
    id: {
      type: new GraphQLNonNull(GraphQLID),
    },
    allArticles: {
      type: ArticleConnection,
      resolve(parent, args, { mongodb }) {
        return {
          query: getArticles(mongodb),
        };
      },
    },
  }),
});

Now we can add first and last parameters to getArticles and pass them from the allArticles arguments to the model layer.

first is a fairly straightforward limit, so we can simply use .limit in MongoDB. However, in order to determine whether there is a next page, we need to know the number of items in the entire query. last is trickier because we need to skip a number of items equal to the total count minus last. If both first and last are provided, we will apply first and then apply last to the result, provided there are enough items in it.

export async function getArticles(mongodb, { first, last }) {
  const query = mongodb.collection('Articles').find();
  const pageInfo = await applyPagination(
    query, first, last
  );
  return {
    query,
    pageInfo,
  };
}

async function applyPagination(query, first, last) {
  let count;

  if (first || last) {
    count = await query.clone().count();
    let limit;
    let skip;

    if (first && count > first) {
      limit = first;
    }

    if (last) {
      if (limit && limit > last) {
        skip = limit - last;
        limit = limit - skip;
      } else if (!limit && count > last) {
        skip = count - last;
      }
    }

    if (skip) {
      query.skip(skip);
    }

    if (limit) {
      query.limit(limit);
    }
  }

  return {
    hasNextPage: Boolean(first && count > first),
    hasPreviousPage: Boolean(last && count > last),
  };
}
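
To see the arithmetic on its own, here is a simplified restatement of the skip and limit calculation as a pure function with no database involved. The name computeWindow and its return shape are assumptions for illustration; it is meant to mirror the logic of applyPagination above, not replace it.

```javascript
// Hypothetical pure version of the skip/limit arithmetic.
function computeWindow(count, first, last) {
  let skip = 0;
  let limit = count;

  if (first && count > first) {
    limit = first; // keep only the first `first` items
  }
  if (last && limit > last) {
    skip = limit - last; // drop everything before the last `last` items
    limit = last;
  }
  return { skip, limit };
}

console.log(computeWindow(1000, 10, undefined)); // first 10 of 1000
console.log(computeWindow(1000, undefined, 10)); // last 10 of 1000
console.log(computeWindow(1000, 10, 3)); // last 3 of the first 10
console.log(computeWindow(5, 10, undefined)); // fewer items than requested
```

For 1000 items, last: 10 skips 990 items, while first: 10 combined with last: 3 skips 7 and returns 3.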

The last step is to add the arguments to our connection. Let’s create a generic function that creates pagination parameters for connections.

export function createConnectionArguments() {
  return {
    first: {
      type: GraphQLInt,
    },
    last: {
      type: GraphQLInt,
    },
  };
}

const Viewer = new GraphQLObjectType({
  name: 'Viewer',
  fields: () => ({
    id: {
      type: new GraphQLNonNull(GraphQLID),
    },
    allArticles: {
      type: ArticleConnection,
      args: createConnectionArguments(),
      resolve(parent, args, { mongodb }) {
        return getArticles(mongodb, args);
      },
    },
  }),
});

Let’s test these features in GraphiQL.

{
  viewer {
    allArticles(first:10) {
      edges {
        node {
          id
          text
        }
      }
      pageInfo {
        hasNextPage
      }
    }
  }
}

Now the result has been limited.

Cursors

Cursors are opaque objects used for pagination. You can encode different information inside a cursor as long as it is opaque to the API user. We are only going to store the item’s Base64-encoded id. The Cursor is going to be a String-like custom Scalar type that we will decode internally as an object with one field: value. Let’s create it in src/Cursor.js. The code uses the base64-url package, so install it first with npm install --save base64-url.

import Base64URL from 'base64-url';
import { GraphQLScalarType } from 'graphql';
import { Kind } from 'graphql/language';

export function toCursor({ value }) {
  return Base64URL.encode(value.toString());
}

export function fromCursor(string) {
  const value = Base64URL.decode(string);
  if (value) {
    return { value };
  } else {
    return null;
  }
}

const CursorType = new GraphQLScalarType({
  name: 'Cursor',
  serialize(value) {
    if (value.value) {
      return toCursor(value);
    } else {
      return null;
    }
  },
  parseLiteral(ast) {
    if (ast.kind === Kind.STRING) {
      return fromCursor(ast.value);
    } else {
      return null;
    }
  },
  parseValue(value) {
    return fromCursor(value);
  },
});

export default CursorType;
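
If you would rather not depend on an extra package, the same URL-safe encoding can be approximated with Node's built-in Buffer. This is only a sketch under that assumption, not the code used in the rest of the article, and the names encodeCursor and decodeCursor are hypothetical.

```javascript
// Hypothetical dependency-free cursor helpers using URL-safe Base64.
function encodeCursor(value) {
  return Buffer.from(String(value), 'utf8')
    .toString('base64')
    .replace(/\+/g, '-') // URL-safe alphabet
    .replace(/\//g, '_')
    .replace(/=+$/, ''); // drop padding
}

function decodeCursor(cursor) {
  const base64 = cursor.replace(/-/g, '+').replace(/_/g, '/');
  const value = Buffer.from(base64, 'base64').toString('utf8');
  // Mirror fromCursor: an empty result is treated as an invalid cursor.
  return value ? { value } : null;
}

const cursor = encodeCursor('56a0d0c8e4b0f5a1c2d3e4f5');
console.log(cursor, decodeCursor(cursor));
```

Either way, clients should treat the resulting string as opaque and only pass it back as before or after.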

Now we can use this type in the Edge type as well as in field arguments.

export function createConnectionArguments() {
  return {
    first: {
      type: GraphQLInt,
    },
    last: {
      type: GraphQLInt,
    },
    before: {
      type: Cursor,
    },
    after: {
      type: Cursor,
    },
  };
}

const ArticleEdge = new GraphQLObjectType({
  name: 'ArticleEdge',
  fields: () => ({
    cursor: {
      type: Cursor,
      resolve(parent) {
        return {
          value: parent._id.toString(),
        };
      },
    },
    node: {
      type: Article,
      resolve(parent) {
        return parent;
      },
    },
  }),
});

We can test whether the cursor is being returned in GraphiQL.

{
  viewer {
    allArticles(first:10) {
      edges {
        cursor
        node {
          id
          text
        }
      }
      pageInfo {
        hasNextPage
      }
    }
  }
}

Perfect! Now we need to update the model layer to use the cursors. MongoDB’s skip and limit accept only numbers, not a reference to a document, so we will filter on _id instead.

import { ObjectId } from 'mongodb';

export async function getArticles(mongodb, { first, last, before, after }) {
  const query = limitQueryWithId(
    mongodb.collection('Articles').find(),
    before,
    after
  );
  const pageInfo = await applyPagination(
    query, first, last
  );
  return {
    query,
    pageInfo,
  };
}


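function limitQueryWithId(query, before, after) {
  const idFilter = {};

  if (before) {
    idFilter.$lt = ObjectId(before.value);
  }

  if (after) {
    idFilter.$gt = ObjectId(after.value);
  }

  // Only constrain _id when a cursor was supplied; { _id: {} } would match nothing.
  if (before || after) {
    query.filter({ _id: idFilter });
  }

  return query;
}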

If you take the value of an item’s cursor and pass it as the after argument, you can see the next page of data.

{
  viewer {
    allArticles(first:10, after: "YOUR-CURSOR") {
      edges {
        cursor
        node {
          id
          text
        }
      }
      pageInfo {
        hasNextPage
      }
    }
  }
}

Ordering

Connection data is always ordered. Sometimes, the API even allows the ordering to be defined through additional arguments. We will cover only fixed ordering predefined inside the schema.

We will sort items by descending order of their text field. Our getArticles function will accept two new parameters: orderField and order. order can be 1 for ascending sort and -1 for descending.

When limiting ordered results with before and after, an object in the database should be in the result set if one of the following is true:

  • The object is strictly between the items pointed to by the before and after cursors in terms of the sort field value.
  • The object has the same sort field value as the item pointed to by the before cursor, but is stably sorted to be before it.
  • The object has the same sort field value as the item pointed to by the after cursor, but is stably sorted to be after it.

In order to have stable sorting in the last two cases, we will always also sort the result by _id, in addition to any other ordering required. We will filter by _id in the last two cases.
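
The rules above amount to comparing documents by the pair (sort field, _id). The following in-memory sketch illustrates that idea; the names compareDocs and selectWindow and the sample data are assumptions for illustration, and the real implementation expresses the same conditions as a MongoDB filter.

```javascript
// Compares two documents by (field, _id) in the given direction (1 or -1).
function compareDocs(a, b, field, order) {
  if (a[field] !== b[field]) {
    return (a[field] < b[field] ? -1 : 1) * order;
  }
  if (a._id !== b._id) {
    return (a._id < b._id ? -1 : 1) * order;
  }
  return 0;
}

// Returns the documents strictly between the `after` and `before` documents.
function selectWindow(docs, field, order, { before, after } = {}) {
  return docs
    .filter((doc) => !after || compareDocs(doc, after, field, order) > 0)
    .filter((doc) => !before || compareDocs(doc, before, field, order) < 0)
    .sort((a, b) => compareDocs(a, b, field, order));
}

const docs = [
  { _id: 'a1', text: 'x' },
  { _id: 'a2', text: 'y' },
  { _id: 'a3', text: 'y' },
  { _id: 'a4', text: 'z' },
];

// Descending by text; a3 and a2 tie on text, so _id breaks the tie.
const page = selectWindow(docs, 'text', -1, { after: docs[2] });
console.log(page.map((doc) => doc._id)); // → [ 'a2', 'a1' ]
```

Because _id is unique, the pair (sort field, _id) gives every document a distinct position, so no item is skipped or repeated across pages even when many items share the same sort field value.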

In code, the path when sorting by just _id is simpler than sorting by other fields, so we will keep two separate code paths.

import { ObjectId } from 'mongodb';

export async function getArticles(mongodb, {
  first,
  last,
  before,
  after,
}, orderField, order) {
  let query = mongodb.collection('Articles');
  if (orderField === 'id') {
    query = limitQueryWithId(query, before, after, order);
  } else {
    query = await limitQuery(query, orderField, order, before, after);
  }
  const pageInfo = await applyPagination(
    query, first, last
  );
  return {
    query,
    pageInfo,
  };
}

function limitQueryWithId(query, before, after, order) {
  const filter = {};
  const idFilter = {};

  if (before) {
    const op = order === 1 ? '$lt' : '$gt';
    idFilter[op] = ObjectId(before.value);
  }

  if (after) {
    const op = order === 1 ? '$gt' : '$lt';
    idFilter[op] = ObjectId(after.value);
  }

  // Only constrain _id when a cursor was supplied; { _id: {} } would match nothing.
  if (before || after) {
    filter._id = idFilter;
  }

  return query.find(filter).sort([['_id', order]]);
}

async function limitQuery(query, field, order, before, after) {
  let filter = {};
  const limits = {};
  const ors = [];
  if (before) {
    const op = order === 1 ? '$lt' : '$gt';
    const beforeObject = await query.findOne({
      _id: ObjectId(before.value),
    }, {
      fields: {
        [field]: 1,
      },
    });
    limits[op] = beforeObject[field];
    ors.push(
      {
        [field]: beforeObject[field],
        _id: { [op]: ObjectId(before.value) },
      },
    );
  }

  if (after) {
    const op = order === 1 ? '$gt' : '$lt';
    const afterObject = await query.findOne({
      _id: ObjectId(after.value),
    }, {
      fields: {
        [field]: 1,
      },
    });
    limits[op] = afterObject[field];
    ors.push(
      {
        [field]: afterObject[field],
        _id: { [op]: ObjectId(after.value) },
      },
    );
  }

  if (before || after) {
    filter = {
      $or: [
        {
          [field]: limits,
        },
        ...ors,
      ],
    };
  }

  return query.find(filter).sort([[field, order], ['_id', order]]);
}

Now we can use these parameters in the schema.

const Viewer = new GraphQLObjectType({
  name: 'Viewer',
  fields: () => ({
    id: {
      type: new GraphQLNonNull(GraphQLID),
    },
    allArticles: {
      type: ArticleConnection,
      args: createConnectionArguments(),
      resolve(parent, args, { mongodb }) {
        return getArticles(mongodb, args, 'text', -1);
      },
    },
  }),
});

Now it should be possible to see the new ordering in GraphiQL.

A word about performance

These connections should perform especially well if you create the relevant indexes. Create an index for each possible sort order, covering both the ordering field and _id. If your connections support filtering, create an index for each possible combination of filter and sort order. When filtering, it’s important to also include the filtered field in the MongoDB .sort, as this will advise MongoDB to use the relevant index.
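
As a rough sketch of the first recommendation, an index for the ordering used in this article (text, then _id) could be created as follows. The helper names sortIndexSpec and ensureSortIndex are hypothetical, and collection is assumed to be a MongoDB driver collection such as mongodb.collection('Articles').

```javascript
// Hypothetical helper: builds a compound index spec for one sort order.
function sortIndexSpec(field, order) {
  // The sort field comes first, then _id as the tie-breaker, in the same direction.
  return field === '_id' ? { _id: order } : { [field]: order, _id: order };
}

// `collection` is assumed to expose the driver's createIndex method.
function ensureSortIndex(collection, field, order) {
  return collection.createIndex(sortIndexSpec(field, order));
}

console.log(sortIndexSpec('text', -1));
```

Calling ensureSortIndex(mongodb.collection('Articles'), 'text', -1) once at startup would be one way to apply it.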

However, even if individual queries are very efficient, the approach described in this article can be quite wasteful in some data modeling situations and shouldn’t be applied blindly. When the cardinality of Connections is low (for example, if every user has about 10-20 articles), then different approaches should be used. Consider this query to a similar API.

{
  viewer {
    allUsers(first: 10) {
      edges {
        node {
          articles(first:10) {
            edges {
              node {
                id
                text
              }
            }
          }
        }
      }
    }
  }
}

If users don’t have many articles, ten index-backed queries are less efficient than one query that simply fetches all articles for the given 10 users. The results can then be ordered and sliced in the model layer. Although working with data in the model layer rather than inside the database is counterintuitive, it may be a more efficient way to handle data.
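
A sketch of the in-memory step might look like this. It assumes each article carries a hypothetical authorId field and that all articles for the ten users were fetched beforehand in a single query (for example with an $in filter on authorId); neither assumption comes from the example project.

```javascript
// Groups articles by author, sorts each group by text (descending),
// and keeps only the first `first` articles per user.
function groupAndSlice(articles, userIds, first) {
  const byUser = new Map(userIds.map((id) => [id, []]));

  for (const article of articles) {
    const bucket = byUser.get(article.authorId);
    if (bucket) {
      bucket.push(article);
    }
  }

  for (const [id, bucket] of byUser) {
    bucket.sort((a, b) => (a.text < b.text ? 1 : a.text > b.text ? -1 : 0));
    byUser.set(id, bucket.slice(0, first));
  }

  return byUser;
}

const grouped = groupAndSlice(
  [
    { authorId: 'u1', text: 'a' },
    { authorId: 'u1', text: 'c' },
    { authorId: 'u1', text: 'b' },
    { authorId: 'u2', text: 'z' },
  ],
  ['u1', 'u2'],
  2
);
console.log(grouped.get('u1').map((article) => article.text));
```

This trades one larger query and some CPU time in the application for fewer round trips to the database, which tends to pay off only when each user has few articles.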

Conclusion

We have created Relay-compliant pagination with GraphQL and MongoDB. The model layer is not tightly coupled to the Article domain, so we can reuse it for different models by making the collection an argument.

GraphQL is an amazing technology for making fast and reusable APIs, and Relay compatibility is one of the more difficult aspects of making them. I hope this article helps GraphQL API developers to make Relay-compatible APIs.

All the code used in this project is available on GitHub.

Reindex Blog