Thursday, July 31, 2014

REST with HAL JSON - pagination when you are off range

Introduction to HAL

I have been working on an API that uses HAL JSON to represent its responses. To describe it briefly: the representation of a resource has a links section ("_links", although it is not strictly required), and any resources it contains, such as a list, go in an "_embedded" element. Here's a list of libraries that can be used for expressing HAL. I have been using HAL Builder for Java in my project.

Here is an example of a single resource with embedded resources (taken from the HAL Primer, with some modifications):

Asking for GET http://example.org/api/user/matthew
The response code will be 200 OK, and the body will be (sorry for the line breaks, Blogger's fault):
{
    "_links": {
        "self": {
        },
        "contacts": {
        }
    },
    "id": "matthew",
    "name": "Matthew Weier O'Phinney",
    "_embedded": {
        "children": [
            {
                "_links": {
                    "self": {
                        "href": "http://example.org/api/user/mac_nibblet"
                    }
                },
                "id": "mac_nibblet",
                "name": "Antoine Hedgecock"
            },
            {
                "_links": {
                    "self": {
                        "href": "http://example.org/api/user/spiffyjr"
                    }
                },
                "id": "spiffyjr",
                "name": "Kyle Spraggs"
            }
        ],
        "website": {
            "_links": {
                "self": {
                    "href": "http://example.org/api/locations/mwop"
                }
            },
            "id": "mwop",
            "url": "http://www.mwop.net"
        }
    }
}
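
As a rough illustration of how a client might read a response like this, here is a minimal Python sketch that collects the self link of every embedded resource. The function name `embedded_self_links` is made up for this post, and the document below is a trimmed copy of the example above; it is not part of any HAL library.

```python
def embedded_self_links(hal_doc):
    """Map each embedded relation name to the self hrefs of its resources."""
    result = {}
    for rel, value in hal_doc.get("_embedded", {}).items():
        # An embedded relation may hold a single resource or a list of them.
        resources = value if isinstance(value, list) else [value]
        result[rel] = [
            res.get("_links", {}).get("self", {}).get("href")
            for res in resources
        ]
    return result


# Trimmed copy of the example response above (hypothetical client-side data).
doc = {
    "id": "matthew",
    "_embedded": {
        "children": [
            {"_links": {"self": {"href": "http://example.org/api/user/mac_nibblet"}},
             "id": "mac_nibblet"},
            {"_links": {"self": {"href": "http://example.org/api/user/spiffyjr"}},
             "id": "spiffyjr"},
        ],
        "website": {
            "_links": {"self": {"href": "http://example.org/api/locations/mwop"}},
            "id": "mwop",
        },
    },
}

links = embedded_self_links(doc)
print(links)
```

Note that "children" is a list while "website" is a single object, so the sketch normalizes both to a list before reading the links.
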
I have modified the example to show that you can have multiple links in the links section, and I kept the "_embedded" section to show how you can include sub-resources. The response can be much simpler and still be HAL, but you should have a links section. Some people argue that it is optional, but in my opinion it is the major strength of HAL. But what if you want to paginate the results so that you don't, for instance, return a list of 4000 users, overloading the server and increasing response time?

Pagination in HAL

Take a look at the example provided by the HAL Primer. It includes a count and a page. You specify the page, and it looks like the number of results per page defaults to 3.

The example doesn't say anything about specifying the count (and you may notice that its last page does not match the total), but it would be reasonable to let the client specify the count in a query parameter, like so: "http://example.org/api/user?page=3&count=20", and get more elements in the "_embedded" section. On the server side you can then enforce a maximum count if you want (and you probably should).
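
To make the idea of a server-side maximum concrete, here is a small Python sketch of how the "page" and "count" parameters could be read and clamped. The default of 3 mirrors what the primer example appears to use; the cap of 100 and the names `parse_paging` and `page_slice` are my own assumptions, not anything HAL prescribes.

```python
DEFAULT_COUNT = 3    # default page size, as the primer example appears to use
MAX_COUNT = 100      # assumed server-side cap, pick whatever suits your API


def parse_paging(params):
    """Read page and count from query parameters, clamping both to sane values."""
    page = max(1, int(params.get("page", 1)))
    count = int(params.get("count", DEFAULT_COUNT))
    count = max(1, min(count, MAX_COUNT))
    return page, count


def page_slice(items, page, count):
    """Return the items on a 1-based page; empty if the page is past the end."""
    start = (page - 1) * count
    return items[start:start + count]


page, count = parse_paging({"page": "3", "count": "20"})
print(page, count)
print(parse_paging({"count": "4000"}))
```

A request for 4000 users in one go is silently reduced to the cap here. Rejecting it with an error would be another valid choice.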

A little note here: supporting Range headers in HTTP should also work fine. It is up to you.

Specifying a page that is off range

But what if you specify a page that is off range, for instance if the total is 200 and you ask for "http://example.org/api/user?page=30&count=20"? That would be elements 581 to 600, which do not exist. Several questions arise about what to return, and I have been discussing them with my colleagues. The questions that arose were:
  • Should you return a response with a 200 OK response code and a body consisting only of the links section?
  • If so, what should be in the links section?
    • Do you want to include a self link to something that does not exist?
    • Should you have first and last links?
    • Should the previous link be there, pointing to the same page as the last link?
  • Should you return an empty body or just {} (an empty JSON object), still with 200 OK?
  • Should you use the 204 No Content response code?
  • Should the count be what you provided, or the number of elements returned, which would be zero?
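
Whichever answer you prefer, the server first has to notice that the page is off range. A minimal Python sketch of that check, assuming 1-based page numbers (the function names are made up for illustration):

```python
import math


def last_page(total, count):
    """Number of the last page; an empty collection still has a page 1."""
    return max(1, math.ceil(total / count))


def is_out_of_range(page, total, count):
    """True if the 1-based page number lies outside the collection."""
    return page < 1 or page > last_page(total, count)


print(last_page(200, 20))
print(is_out_of_range(30, 200, 20))
```

With a total of 200 and a count of 20 the last page is 10, so page 30 is flagged as off range.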

My suggestion

I think the best solution is:
  • Provide the links section
  • Provide a self link (maybe not useful, but it looks better)
  • Provide the first and last link (so that you know what the range is and can easily get back within range)
  • There are no elements to show, so there is no reason to have an "_embedded" section at all.
  • Have the fields "count" and "total" there.
  • Set count to zero. It is easier for the client to have a field stating how many elements were returned. (Do the same whenever fewer elements than the requested count are returned, that is, on a last page that is not full, or on page 1 of a collection that is smaller than the count.)
So a response to "http://example.org/api/user?page=30&count=20" in my world would be:

{
    "_links": {
        "self": {
            "href": "http://example.org/api/user?page=30&count=20"
        },
        "first": {
            "href": "http://example.org/api/user?count=20"
        },
        "last": {
            "href": "http://example.org/api/user?count=20&page=10"
        }
    },
    "count": 0,
    "total": 200
}
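
For what it's worth, here is a Python sketch of how a server could assemble such a response. It is only an illustration of the suggestion above, not how HAL Builder or any other library does it: `build_page`, `page_url` and the "users" relation name under "_embedded" are my own assumptions, and the items are assumed to be ready-made resource dictionaries.

```python
import math
from urllib.parse import urlencode

BASE_URL = "http://example.org/api/user"


def page_url(count, page=None):
    """Build a collection URL; the order of query parameters is not significant."""
    query = {"count": count}
    if page is not None:
        query["page"] = page
    return BASE_URL + "?" + urlencode(query)


def build_page(users, page, count):
    """Build a HAL-style page body; off-range pages get links but no _embedded."""
    total = len(users)
    last = max(1, math.ceil(total / count))
    items = users[(page - 1) * count: page * count]  # empty when off range
    body = {
        "_links": {
            "self": {"href": page_url(count, page)},
            "first": {"href": page_url(count)},
            "last": {"href": page_url(count, last)},
        },
        "count": len(items),  # number of elements actually returned
        "total": total,
    }
    if items:
        # "users" is an assumed relation name for the embedded list.
        body["_embedded"] = {"users": items}
    return body


users = [{"id": "user%d" % i} for i in range(200)]
response = build_page(users, page=30, count=20)
print(response)
```

The same function serves in-range pages, where "count" is simply the length of the embedded list, so the client never has to special-case a short last page.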
