Thursday, July 31, 2014

REST with HAL JSON - pagination when you are off range

Introduction to HAL

I have been working on an API that uses HAL JSON to represent its JSON responses. To describe it briefly, the representation of a resource has a links section (although it is not required), and if it contains a list of resources, they are included in an "_embedded" element. Here's a list of libraries that can be used for expressing HAL. I have been using HAL Builder for Java in my project.

Here is an example for a single element with embedded resources (taken from the HAL Primer with some modifications):

Asking for GET http://example.org/api/user/matthew
Response code will be 200 OK, and the body will be (and sorry for the line breaks, Blogger's fault):
{
    "_links": {
        "self": {
        },
        "contacts": {
        }
    },
    "id": "matthew",
    "name": "Matthew Weier O'Phinney",
    "_embedded": {
        "children": [
            {
                "_links": {
                    "self": {
                        "href": "http://example.org/api/user/mac_nibblet"
                    }
                },
                "id": "mac_nibblet",
                "name": "Antoine Hedgecock"
            },
            {
                "_links": {
                    "self": {
                        "href": "http://example.org/api/user/spiffyjr"
                    }
                },
                "id": "spiffyjr",
                "name": "Kyle Spraggs"
            }
        ],
        "website": {
            "_links": {
                "self": {
                    "href": "http://example.org/api/locations/mwop"
                }
            },
            "id": "mwop",
            "url": "http://www.mwop.net"
        }
    }
}
I have modified the example to show that you can have multiple links in the links section. I also kept the "_embedded" section to show how you can include sub-resources. The response can be made much simpler and still be HAL, but you should have a links section. Some people argue that it is optional, but in my opinion it is the major strength of HAL. But what if you want to paginate the results so that you don't, for instance, return a list of 4000 users, overloading the server and increasing the response time?

Pagination in HAL

Take a look at the example provided by the HAL Primer. It includes a count and a page. You specify the page, and it looks like you get a default number of results, set to 3.

The example doesn't say anything about specifying the count (and you may notice that the last page is incorrect given the total), but it would be reasonable to be able to specify the count in a query parameter, like so: "http://example.org/api/user?page=3&count=20" and get more elements in the "_embedded" section. On the server side you can then enforce a maximum count if you want (and you probably should).
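To make the page and count idea concrete, here is a minimal Python sketch of how a server might turn the two query parameters into an offset and a limit. The names page_bounds, MAX_COUNT and DEFAULT_COUNT, and the cap of 100, are my own assumptions for illustration and do not come from the HAL Primer.

```python
MAX_COUNT = 100      # hypothetical server-side cap on elements per page
DEFAULT_COUNT = 20   # hypothetical default when no count is given


def page_bounds(page, count=DEFAULT_COUNT, max_count=MAX_COUNT):
    """Return (offset, limit) for a 1-based page number."""
    page = max(1, int(page))
    # Clamp the requested count to the range 1..max_count.
    limit = min(max(1, int(count)), max_count)
    offset = (page - 1) * limit
    return offset, limit


# Example: a collection of 200 users, asking for page=3&count=20.
users = ["user%d" % i for i in range(1, 201)]
offset, limit = page_bounds(page=3, count=20)
page_items = users[offset:offset + limit]
```

With these assumptions, a request for count=4000 is silently reduced to 100 elements. Whether to clamp like this or reject the request is a design choice.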

A little note here: supporting Range headers in HTTP should also work fine. It is up to you.

Specifying a page that is off range

But what if you specify a page that is off range, for instance if total is 200 and you say "http://example.org/api/user?page=30&count=20"? This would give you elements 581 to 600, which do not exist. Several questions arise about what to return, and I have been discussing them with my colleagues. The questions that arose were:
  • Should you return a response with a 200 OK response code and a body consisting of only the links section?
  • If so, what should be in the links section?
    • Do you want to include a self link to something that does not exist?
    • Should you have first and last links?
    • Should the previous link be there and point to the same page as the last link?
  • Should you return an empty body or just {} (an empty JSON object), still with 200 OK?
  • Should you use the 204 No Content response code?
  • Should the count be what you provided or the number of elements returned, which would be zero?

My suggestion

I think the best solution is:
  • Provide the links section
  • Provide a self link (maybe not useful, but it looks better)
  • Provide the first and last link (so that you know what the range is and can easily get back within range)
  • There are no elements to show, so there is no reason to have an "_embedded" section at all.
  • Have the fields "count" and "total" there.
  • Have count set to zero. It is easier for the client to have a field stating how many elements were returned. (Also do this when there are fewer elements than the requested count to show, that is, on the last page or when the whole collection is smaller than the count on page 1.)
So a response on "http://example.org/api/user?page=30&count=20" in my world would be:

{
    "_links": {
        "self": {
            "href": "http://example.org/api/user?page=30&count=20"
        },
        "first": {
            "href": "http://example.org/api/user?count=20"
        },
        "last": {
            "href": "http://example.org/api/user?count=20&page=10"
        }
    },
    "count": 0,
    "total": 200
}
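
As a rough sketch of how the suggestion above could be implemented, here is a small Python function that builds the response as a dictionary. The function name build_page, the BASE_URL constant and the "users" key inside "_embedded" are assumptions made for this example. My actual project uses HAL Builder for Java, so this is an illustration and not the real implementation.

```python
import json
import math

BASE_URL = "http://example.org/api/user"  # taken from the examples above


def build_page(items, page, count):
    """Build a HAL-style page document; off-range pages get count 0."""
    total = len(items)
    last_page = max(1, math.ceil(total / count))
    offset = (page - 1) * count
    # Slicing past the end of a list simply yields an empty list.
    page_items = items[offset:offset + count]
    doc = {
        "_links": {
            "self": {"href": "%s?page=%d&count=%d" % (BASE_URL, page, count)},
            "first": {"href": "%s?count=%d" % (BASE_URL, count)},
            "last": {"href": "%s?count=%d&page=%d" % (BASE_URL, count, last_page)},
        },
        # count is the number of elements actually returned, not the requested one.
        "count": len(page_items),
        "total": total,
    }
    if page_items:
        # "users" is a hypothetical relation name for the embedded list.
        doc["_embedded"] = {"users": page_items}
    return doc


all_users = [{"id": "user%d" % i} for i in range(1, 201)]
off_range = build_page(all_users, page=30, count=20)
off_range_json = json.dumps(off_range, indent=4)
```

Here off_range has no "_embedded" section, a count of 0 and a total of 200, and its last link points to page 10.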

Tuesday, July 29, 2014

Automating tasks with Python

Today I had to create a large number of test data entities in JSON. Instead of cutting and pasting, I decided to give Python a go and wrote a command line script to generate the test data I needed. I had to google everything, as I have only seen Python a few times before, but I ended up with a neat little script that works perfectly, although I could probably have used Python's built-in JSON support.

Here is the script:

__author__ = 'Per-Oivin Andersen'

import getopt, sys, uuid, string, random, os


def usage():
    print("usage: provide the number of users with -c/--count and the file users.json will be generated.")


def main():
    filename = "users.json"
    try:
        opts, args = getopt.getopt(sys.argv[1:], "hc:v", ["help", "count="])
    except getopt.GetoptError as err:
        # print help information and exit:
        print(err) # will print something like "option -a not recognized"
        usage()
        sys.exit(2)
    orgCount = None
    verbose = False
    for o, a in opts:
        if o == "-v":
            verbose = True
        elif o in ("-h", "--help"):
            usage()
            sys.exit()
        elif o in ("-c", "--count"):
            orgCount = a
        else:
            assert False, "unhandled option"
    # Reject a missing, non-numeric, or non-positive count.
    if orgCount is None or not orgCount.isdigit() or int(orgCount) <= 0:
        usage()
        sys.exit(2)
    json ="["
    for num in range(1,int(orgCount)+1):
        id=uuid.uuid4()
        name=''.join(random.choice(string.ascii_letters) for _ in range(6))
        username=''.join(random.choice(string.ascii_letters) for _ in range(6))
        group=str(bool(random.getrandbits(1))).lower()
        json += "{" \
                "\n\t\"id\": \"%s\"," \
                "\n\t\"username\": \"%s\"," \
                "\n\t\"password\": \"password\"," \
                "\n\t\"isGroup\": %s," \
                "\n\t\"name\": \"%s\"" \
                "\n}" % (id,username,group,name)
        if num < int(orgCount):
            json += ",\n"
    json += "]"
    try:
        os.remove(filename)
    except OSError:
        pass
    # Opening with 'w' truncates any existing file; the with-block closes it.
    with open(filename, 'w') as out:
        out.write(json)
    print(json)
    print("\n\n JSON written to file " + filename)

if __name__ == "__main__":
    main()
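
As mentioned, the string building could probably be replaced by Python's built-in json module. Here is a sketch of what that might look like in Python 3. The helper names random_word, make_users and write_users are made up for this example, and it leaves out the command line handling.

```python
import json
import random
import string
import uuid


def random_word(length=6):
    """Return a random string of ASCII letters."""
    return "".join(random.choice(string.ascii_letters) for _ in range(length))


def make_users(count):
    """Build a list of test users as plain dictionaries."""
    return [
        {
            "id": str(uuid.uuid4()),
            "username": random_word(),
            "password": "password",
            "isGroup": bool(random.getrandbits(1)),
            "name": random_word(),
        }
        for _ in range(count)
    ]


def write_users(filename, count):
    """Serialize the users with json.dumps and write them to a file."""
    text = json.dumps(make_users(count), indent=4)
    with open(filename, "w") as out:
        out.write(text)
    return text
```

Calling write_users("users.json", 10) would write ten users to the file. The json module takes care of quoting, commas and the lowercase true/false values, so there is no manual escaping to get wrong.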

Friday, July 25, 2014

Those days when nothing works

Yesterday I had the greatest feeling when I solved some issues. Then it turned out that the fix only worked on one platform. My feeling of joy was abruptly taken away - and all of today has been one huge hell. After noticing this I started using the hg strip command on my local repository to revert some changes I had made that had also broken some other things, and I ended up having to clone again, hence losing all of my local changes. So today has been a zero day, and days like this - where nothing works - feel awful. I just want to solve those issues with style and have them moved over to "done" on the board (before "done" we have "code review" and "test", so the work is not over until it has passed these).

Thursday, July 24, 2014

The kick I get from solving issues

I get a huge pleasure from creating new features and correcting bugs. I think I have always liked to work in a task-related manner. By task-related manner I mean that you have a list of tasks to finish, and you cross them out or move them over to code review or done or whatever - because it isn't only in programming I like this. It goes way back to primary school, where we sometimes had a full day of solving tasks from a todo-list instead of just solving problems endlessly. We also got to take breaks when we wanted to.

So it fits me perfectly to be a developer on a Scrum team with a kanban board.

Wednesday, July 23, 2014

The value of having someone who depends on your code

"Are you really saying there is value in that??" you might ask. Yes, there is.

When I was studying I delivered assignments to the professor or my supervisor, and no one depended on the code I wrote. When you work on a bigger project, or like me, on a framework, you have people who need your classes, modules or projects to work and to have the features they need to accomplish their goals.

When you write code and shape your projects without anyone depending on them, as with school projects no one is going to use, the work will never be tested in a real-world scenario and you never get to know what the people using your code are going to need.

When a new module such as a REST API is planned before the implementation is done, the group or person designing it will have an idea of how to specify the API. But it will not be finished until there has been a real-world user of the API, someone creating a client application.

Friday, July 18, 2014

Testing before a release

The framework I am currently developing here at Computas supports different web application servers and different databases, and we need to test them all. The framework provides demo applications that we manually test in each possible scenario before we can say it's ready. We also see that when building and running the software, different things can happen from machine to machine. This is tedious work, clicking the same buttons over and over again.

I do not know much about testing yet, but we have been writing integration tests all the way through the project. We should probably have unit tests also, but we don't, at least not in the REST API where I am working. And I heard we have Selenium tests, which are automated GUI tests, but I have not seen them.

In the part of the API I have been implementing (with help from a colleague in Romania), someone else found a bug during testing. So I must go back and correct it, but that will be in the next release. We don't have the time to fix every bug we find - we just have to release. We set a date for the release first, and now it is already overdue.

We have a new test leader, but he was just hired, so there have not been many changes to the testing procedures yet.

Wednesday, July 16, 2014

Current status in Computas AS

This is my first post, and I plan to blog about issues concerning my work on software development here at the Norwegian, Oslo-based consulting company Computas. I don't believe this first post will be of much interest, so I will keep it short.

I have been working here for almost 5 months; I started in March after I finished my master's thesis in Bergen, where I also grew up. Currently I am working on a framework called Frame Solutions, where I participate in designing and implementing the REST API on the Java platform (there is also a .NET version).

Today I have been doing a lot of manual GUI testing in our JavaScript web application. I have also started looking at how we should implement pagination in our API, and I found some best practices online. Vinay Sahni on pagination in REST and restapitutorial.com - "Restful Best Practices" have been my inspiration, and I look forward to implementing it after we have had a final discussion on how the API should look.