At first glance, the design of a RESTful API seems very simple. “One noun, four verbs”: programming can be that simple. Briefly identify a resource, make sure that it can be manipulated (CRUD operations) via an endpoint and the HTTP methods (POST, GET, PUT, DELETE), and the REST API is ready. As an example, let’s imagine an interface where we can place, query, modify, and delete orders (Listing 1).
// Create a new order via HTTP POST and a JSON payload
// representing the order to create.
POST /api/orders
[ ... payload representing an order ...]

// Update an existing order with id 123 via HTTP PUT
// and a JSON payload representing the changed order.
PUT /api/orders/123
[ ... payload representing the changed order ...]

// Delete an existing order with id 123 via HTTP DELETE.
DELETE /api/orders/123

// Retrieve an existing order with id 123 via HTTP GET.
GET /api/orders/123

// Retrieve all existing orders via HTTP GET.
GET /api/orders
Is that really all there is to it? Does the 1:1 mapping of CRUD operations and HTTP methods always fit? What do parameterized resource queries (a.k.a. filters) for specifically limiting the hit set look like? How do I customize the return format, e.g. for use on mobile devices? How is security handled? What happens in the event of an error? How do I ensure the possible evolution of the API? Questions upon questions, but don’t worry. We will deal with the answers step by step.
All beginnings are hard
The API is the developer’s UI. As with good UX/UI design, care should be taken when designing an API to ensure that the expectations of the user (in this case, the developer) are met. But what exactly do these expectations look like? The good news is that there are a number of specifications in the REST environment that can and should be used as a guide. This applies in particular to the correct use of HTTP methods and HTTP status codes. If you can’t find a specification for a problem, it doesn’t hurt to take a look at well-known and heavily used APIs (Facebook, Amazon, Twitter, GIT, etc.) and see how they solve it. As a rule, you will find patterns and best practices that have become established in the REST community in the past few years.
POST vs. PUT vs. PATCH
As shown above, a 1:1 mapping of CRUD operations and HTTP methods is obvious and, in most cases, correct. However, according to the HTTP 1.1 specification, there are some subtleties to consider. Among other things, it states that POST creates a child resource at a URL defined by the server. In other words, the server creates a new resource during POST and assigns a unique ID for it, which is usually returned to the calling client as part of the new resource’s URL via a location header (Listing 2).
// POST request to create a new order
POST /api/orders
[ ... payload representing an order ...]

// Response with success code and Location header
HTTP/1.1 201 Created
Location: /api/orders/
But what if the client wants to specify the ID, and therefore the URL, used to identify the resource? This is where the HTTP method PUT comes into play. According to the specification, HTTP PUT replaces or creates a resource at a client-defined URL. The following call overwrites an existing order with ID 123, if available, or creates it anew with the ID specified by the client (Listing 3).
// PUT request to replace an existing order
PUT /api/orders/123
[ ... payload representing an order ...]

// Response with success code and Location header
// 200 OK, if existing order 123 replaced
// 201 Created, if new order 123 created
HTTP/1.1 200 OK or 201 Created
Location: /api/orders/
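The difference between server-assigned and client-assigned IDs can be sketched with a small in-memory store. This is only an illustration: the `OrderStore` class and its method names are hypothetical and stand in for whatever a real server would do.

```python
import itertools


class OrderStore:
    """Hypothetical in-memory stand-in for a server-side order collection."""

    def __init__(self):
        self._orders = {}
        self._ids = itertools.count(1)

    def post(self, payload):
        # POST: the server picks a free ID and reports the new URL
        # to the client via the Location header.
        order_id = str(next(self._ids))
        while order_id in self._orders:
            order_id = str(next(self._ids))
        self._orders[order_id] = dict(payload)
        return 201, {"Location": f"/api/orders/{order_id}"}

    def put(self, order_id, payload):
        # PUT: the client names the URL; the server replaces the
        # resource if it exists or creates it otherwise.
        created = order_id not in self._orders
        self._orders[order_id] = dict(payload)
        status = 201 if created else 200
        return status, {"Location": f"/api/orders/{order_id}"}

    def get(self, order_id):
        return self._orders.get(order_id)


store = OrderStore()
print(store.post({"comment": "first order"}))   # server chooses the ID
print(store.put("123", {"comment": "created"}))  # client chooses the ID
print(store.put("123", {"comment": "replaced"}))
```

The first `put` call on an unknown ID answers with 201, every later one with 200, which mirrors the two outcomes described for Listing 3.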
The HTTP method PUT can perform several tasks. It is a question of the API design whether the client-side assignment of an ID should be allowed for the API and if PUT should be implemented. In addition to PUT, there is another, lesser-known HTTP method for modifying resources: PATCH. PUT leads to a complete replacement of the resource, so the payload must always contain the entire resource. However, PATCH only results in an update according to changes specified within the payload. For example, with PUT, if you want to change the “Comment” field in an order, the entire order including the changed “Comment” field would have to be transferred in the payload. With PATCH, only the changed field or a corresponding description of the change is sufficient as a payload. What sounds attractive at first glance is not so easy to implement in practice. Depending on the payload format, evaluation of the change requests and the subsequent execution can become quite complex.
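To make the contrast between full replacement and partial update concrete, here is a minimal sketch on plain dictionaries. The function names are made up for this example, and the PATCH variant is only a shallow field merge; it does not implement a standardized patch format.

```python
def put_order(current, payload):
    """PUT semantics: the payload replaces the whole resource."""
    return dict(payload)


def patch_order(current, changes):
    """PATCH semantics (simplified): only the given fields change."""
    merged = dict(current)
    merged.update(changes)
    return merged


order = {"id": 123, "status": "open", "comment": "old comment"}

# With PUT, a payload that only carries the comment drops every other field.
print(put_order(order, {"comment": "new comment"}))

# With PATCH, the untouched fields survive.
print(patch_order(order, {"comment": "new comment"}))
```

This is why a PUT payload must always carry the complete order, while a PATCH payload can be limited to the changed field.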
Safe and idempotent
The HTTP 1.1 specification has other special features that should be taken into account when designing a RESTful API. It characterizes some of its methods as safe (GET, HEAD and OPTIONS) and/or idempotent (GET, HEAD, OPTIONS, PUT and DELETE). Safe methods must not modify a resource or its representation. In other words: read access to a resource via HTTP GET should never trigger a writing side effect. The following calls, à la RPC, to manipulate an order are taboo:
- GET /api/orders?add=item …
- GET /api/orders/1234/addItem …
But what if GET returns the resource and updates a lastAccessed field in the database at the same time? As long as this manipulation only affects the representation within the server-side domain or database, it is allowed. If the field is passed to the outside via the interface (we’ll ignore what this means for a moment), then the method is no longer safe and the user’s expectation is not fulfilled. Admittedly, the above example seems a bit contrived. The idempotent methods are a little more practical. The rule here is that multiple applications of the method may change the resource representation only once. Multiple calls of PUT may change the resource or its representation only once, and multiple calls of DELETE may delete the resource only once.
Change only once and delete only once? Isn’t that obvious? Not necessarily. If you were to implement PUT in such a way that it adds a new product to an existing order, then multiple calls would add multiple products. That is exactly what is prohibited. According to the specification, the payload of PUT completely replaces the previous representation of the resource. Once replaced, further calls do not change anything and always return the same HTTP status code as the first call (200 for OK or 204 for No Content).
So far, so good. But what does idempotent mean in the context of DELETE? Should multiple calls always return the same HTTP status code? Not necessarily. Idempotent only means that the resource state on the server may change only once in the case of multiple calls and that repeated calls should not trigger any further side effects. Idempotency makes no statement about the return value. Thus, a second deletion attempt may return a different HTTP status code (e.g., 404 for Not Found) than the first to signal to the client that the call does not make sense.
By the way, the PATCH method mentioned above is neither safe nor idempotent. This is because its payload is freely definable. Not only individual fields of the resource to be changed are allowed, but also change statements. For example, a call to PATCH with the RFC 6902 JSON structure shown below would append a new item to the order and overwrite the item with ID 123. Calling it multiple times would correspondingly append the item to the order multiple times:
[
{"op": "add", "path":"/items/-", "value": [ ... some item data ... ]},
{"op": "replace", "path":"/items/123", "value": [ ... item 123 data ... ]}
]
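The idempotency rules from this section can be checked with a few lines of code. The sketch below uses a plain dictionary as a hypothetical server state; the three functions are illustrative stand-ins for PUT, DELETE, and an appending PATCH, not a real HTTP implementation.

```python
import copy


def put(store, order_id, payload):
    # Idempotent: the payload replaces whatever was stored before.
    store[order_id] = copy.deepcopy(payload)
    return 200


def delete(store, order_id):
    # Idempotent with respect to server state, even though the
    # status code differs between the first and the second call.
    if order_id in store:
        del store[order_id]
        return 204
    return 404


def patch_append_item(store, order_id, item):
    # Not idempotent: every call appends another item.
    store[order_id]["items"].append(item)
    return 200


store = {"123": {"items": ["book"]}}

put(store, "123", {"items": ["pen"]})
put(store, "123", {"items": ["pen"]})
print(store)  # same state as after a single PUT

patch_append_item(store, "123", "ink")
patch_append_item(store, "123", "ink")
print(store)  # the item was added twice

print(delete(store, "123"), delete(store, "123"))  # 204, then 404
print(store)  # empty after the first DELETE, unchanged by the second
```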
Filter and Co.
Querying individual or all resources of a type is very simple in REST. But what if you want to set a specific filter or limit for a query? After all, as a client, you don’t necessarily want to load the entire database over the wire with one query. And even if this is the client’s intention, the server should prevent this and artificially limit the number of hits.
Although there is no concrete specification for query filtering, some best practices have been established in the REST community. In the simplest variant, one specifies the fields to be filtered and their values as query parameters. For example, the queries shown in Listing 4 would return all open orders, all orders over 100 euros, or the first ten orders.
// Ask for all open orders
GET /api/orders?state=open

// Ask for all open orders of
// 100.00 euros or more
GET /api/orders?state=open&amount>=100.00

// Ask for the first 10 open orders
// (offset may be optional if 0)
GET /api/orders?state=open&offset=0&limit=10
Of course, the whole thing could be combined in any way by concatenating the individual filters. Sorting the result set is possible by specifying a specific query parameter (e.g. sort) and the desired sorting (e.g. +/- or asc/desc): GET /api/orders?sort=-status,+price.
If you don’t want to receive all order fields as a result, you need an additional query parameter (e.g. fields for fields to be included or exclude for fields to be ignored):
- GET /api/orders?fields=id,status,price,date
- GET /api/orders?exclude=id,date
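How a server might evaluate such query parameters can be sketched as follows. The example assumes orders are plain dictionaries, supports only equality filters plus the `sort` and `fields` parameters, and ignores `offset` and `limit`; the function name `query_orders` is invented for this illustration.

```python
from urllib.parse import parse_qs

# Parameters with a special meaning; everything else is an equality filter.
RESERVED = {"sort", "fields", "offset", "limit"}


def query_orders(orders, query_string):
    params = {name: values[0] for name, values in parse_qs(query_string).items()}

    # 1. Equality filters such as status=open
    result = [
        order for order in orders
        if all(str(order.get(name)) == value
               for name, value in params.items() if name not in RESERVED)
    ]

    # 2. Sorting: "-" means descending, "+" or no prefix means ascending.
    #    Note that parse_qs decodes a literal "+" as a space, hence strip().
    #    Keys are applied in reverse order so that the first key wins
    #    (Python's sort is stable).
    sort_keys = [k.strip() for k in params.get("sort", "").split(",") if k.strip()]
    for spec in reversed(sort_keys):
        name = spec.lstrip("+-")
        result.sort(key=lambda order: order[name], reverse=spec.startswith("-"))

    # 3. Field selection
    if "fields" in params:
        wanted = params["fields"].split(",")
        result = [{f: order[f] for f in wanted if f in order} for order in result]
    return result


orders = [
    {"id": 1, "status": "open", "price": 50},
    {"id": 2, "status": "closed", "price": 70},
    {"id": 3, "status": "open", "price": 120},
]
print(query_orders(orders, "status=open&sort=-price&fields=id,price"))
```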
What still looks simple can quickly become complex and nearly impossible to handle. Let’s imagine that we want to allow both an “and” and an “or” in the combined query. We also want to be able to combine groups of filters and link them arbitrarily via “and” or “or” while limiting the return set of fields for mobile devices. How would the URL for “all orders a) of the last two days AND with a goods value over 100 euros OR the status open OR in process OR b) with a goods value under 100 euros AND today’s order date AND the status open, sorted by DATE, STATUS and GOODS VALUE” look formatted for the mobile version? With such complex queries, one is quickly tempted to create a query language – including parser – and reinvent the wheel. The first remedy, at least for standard queries, is aliases for both predefined filter and field combinations. These can be understood as virtual resources. Alternatively, the desired field restriction can also be signaled via HTTP header (Prefer) (Listing 5).
// Use virtual resource open_orders as
// an alias for /api/orders?status=open
GET /api/open_orders

// Use style=mobile parameter as an alias
// for a predefined set of field filters
GET /api/orders?style=mobile

// Use Prefer header return=mobile-format
// as an alias for a predefined set of field filters
// in combination with open_orders alias
GET /api/open_orders
Prefer: return=mobile-format
A much more flexible alternative is the RQL (Resource Query Language), which is based on FIQL (Feed Item Query Language). With the help of this object-style query language and associated parsers, any queries can be mapped to existing resources and executed directly on JS arrays, SQL, MongoDB, or Elasticsearch. Thanks to Java parsers and JPA Criteria Builder, a simple conversion to a JPA query is also possible. For complex queries including filters on individual resources, RQL is an optimal choice.
But even this variant eventually reaches its limit, namely whenever a query spans several resources that have to be joined rather than a single one. As a rule, this forces the client to send queries in several round trips. If we want to query all orders of the last week from all stores located in a certain zip code, we must first query the matching stores and then query the orders for each store. The results are then merged on the client: the classic N+1 dilemma. At this point, we should look for a viable alternative beyond REST. GraphQL, originated by Facebook and later donated to the GraphQL Foundation, is worth a look! With the help of GraphQL, arbitrary queries can be made on an object graph and the desired result fields of the involved resources can be combined and queried in a targeted manner. Regardless of the number of resources, the result can be queried with a single round trip.
Pagination
If a client requests a potentially large amount of resources, for example, all orders, then the request should be restricted from the outset. Usually, pagination is used for this purpose. In general, this can happen in two different ways (Listing 6). In the first variant, the client specifies a concrete page number within the request, e.g., as a query parameter. This implies that the server determines how many hits should exist per page and calculates the offset accordingly. In the second variant, the client passes both offset and the maximum number of resources to be transferred. This requires a little more thought but brings significantly more flexibility in return.
// Pagination variant 1:
// Use concrete page numbers; the server
// calculates offset and limit
GET /api/open_orders?page=3

// Pagination variant 2:
// Calculate offset and limit on client
// side and use values as query parameters
GET /api/orders?offset=30&limit=10
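For the first variant, the server has to translate the page number into an offset. A minimal sketch, assuming one-based page numbers and a page size fixed by the server (the function name is illustrative):

```python
def page_to_window(page, page_size):
    """Translate a one-based page number into (offset, limit)."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return (page - 1) * page_size, page_size


print(page_to_window(3, 10))  # offset and limit for page 3
```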
Pagination is always interesting when navigating through larger amounts of data. For example, let’s imagine a table within a Single Page Application (SPA), which we can scroll back and forth. It should also be possible to jump to the first and last page of the table. In order to calculate the links of the corresponding navigation buttons correctly, the client needs some information. Wouldn’t it be nice if this could be done by the server? No problem. For this purpose, the server only has to send corresponding link references in response. These can be generated either as part of the payload or alternatively, as link headers (Listing 7).
// Get "page 3" and info about PREV/NEXT
GET /api/orders?offset=20&limit=10 HTTP/1.1

// Response with success code and Link header
// for navigation purposes
HTTP/1.1 206 Partial Content
Link: <.../api/orders?offset=0&limit=10>; rel="first",
      <.../api/orders?offset=10&limit=10>; rel="prev",
      <.../api/orders?offset=30&limit=10>; rel="next",
      <.../api/orders?offset=40&limit=3>; rel="last"
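Generating such navigation links on the server is mostly arithmetic. The following sketch builds a Link header value from offset, limit, and the total number of hits; the function name and the decision to keep the same limit on every link are assumptions made for this example.

```python
def pagination_links(base_url, offset, limit, total):
    """Build a Link header value with first/prev/next/last references."""

    def link(link_offset, rel):
        return f'<{base_url}?offset={link_offset}&limit={limit}>; rel="{rel}"'

    links = [link(0, "first")]
    if offset > 0:
        links.append(link(max(offset - limit, 0), "prev"))
    if offset + limit < total:
        links.append(link(offset + limit, "next"))
    # Offset of the last page, which may be only partially filled.
    last_offset = max(((total - 1) // limit) * limit, 0)
    links.append(link(last_offset, "last"))
    return ", ".join(links)


# 43 orders in total, client currently looks at entries 20 to 29.
print(pagination_links("/api/orders", offset=20, limit=10, total=43))
```

On the first page the `prev` link is omitted, and on the last page the `next` link is omitted, so the client can enable or disable its navigation buttons simply by checking which relations are present.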
Header
As previous examples showed, HTTP headers play a very special role in the REST environment. But what belongs in the body and what in the header? When should a path or query parameter be used within the URL and when should a header be used?
In general, a header should be used for global metadata and the body for business- or request-specific information. The same applies to the parameters. While path or query parameters should be used for resource-specific parameters, the HTTP header is used to exchange general metadata. For example, the Path parameter can be used to specify a subresource or resource ID, and the Query parameter can be used to specify a specific filter, including the filter value. The header, on the other hand, contains information about the exchange format (Accept, Content-Type), security (Authorization header) or – as already seen – about possible further actions (Link header). Incidentally, the use of HTTP headers offers an advantage that the header values can be accessed specifically without having to parse the entire payload.
Of course, in addition to using standardized headers, it is also possible to specify custom headers in order to exchange self-defined metadata of one’s own interface. In the REST community, an “X-” is usually used as a prefix for such custom headers to indicate that it is not a standardized header. However, the use of the “X-” prefix has been considered deprecated since 2012. The reason for this is that a later conversion of the individual header to a standard, and the removal of the “X-” prefix, would break backward compatibility. A good example of this is the GZIP header, which currently must be supported by clients and servers in both x-gzip and gzip forms. Now, any RESTful API designer can ask how likely it is that a proprietary header of their own API will ever make it into an open standard, and decide whether to use the “X-” prefix or not.
Status codes
The targeted application of the available HTTP status codes is at least as important and helpful as the correct use of HTTP methods and headers. It is no coincidence that in addition to the three most frequently encountered codes 200 (OK), 400 (Bad Request), and 500 (Internal Server Error), there is a whole list of other useful codes.
First of all, care should be taken that the server returns a code from the correct number range. While the group of 100 status codes (information) indicates that the processing of the request is still ongoing, the 200s (successful operations) signal correct processing of the request. The group of 300 status codes (redirection), on the other hand, indicates to the client that further client steps are necessary to successfully process the request. The group of 400 status codes (client error) indicates problems with the request, while the 500s (server error) signal that the server is unable to process the request in a meaningful way. By selecting the correct number group, the server shows the client whether repeating the request (possibly with changed request data) makes sense.
When using status codes, the focus should be on the expectations of the API user. If the user queries the list of all open orders and receives an empty list and a status code 200 as a response, they cannot be sure whether the payload is deliberately empty or whether there is an error on the server-side. The situation is different if the server returns an empty list and the code 204 (No Content). In this case, the API user is deliberately deprived of any room for interpretation – i.e. potential sources of error. The same applies if the API user only requests a limited subset via pagination. In this case, the server should deliver the code 206 (Partial Content) instead of 200, signaling that there are still more hits waiting on the server-side and that their links can be found in the header or the payload. If the API user creates a new resource, the server should acknowledge this with 201 (Created) instead of just 200. Again, the specific code is much more meaningful than the general code. If the server cannot process the request directly, i.e. synchronously, but can only trigger its processing asynchronously, this should be signaled by code 202 (Accepted). This way, the client knows that the server has not yet necessarily made a change to the resource, but that this will take place in any case.
In addition to the codes from the 200 range shown so far, various codes from the other ranges should also be used specifically. For example, a 401 (Unauthorized) or 403 (Forbidden) has a completely different meaning than the generic error code 400. The same applies to 404 (Not Found), 405 (Method Not Allowed), and 429 (Too Many Requests).
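The recommendations for the 200 range can be condensed into two small helper functions. This is only a sketch of the conventions described in this section; the function names and parameters are invented for the example.

```python
from http import HTTPStatus


def status_for_list(returned, total):
    """Pick a status code for a collection response as suggested above."""
    if returned == 0:
        return HTTPStatus.NO_CONTENT        # deliberately empty result
    if returned < total:
        return HTTPStatus.PARTIAL_CONTENT   # more hits wait on the server
    return HTTPStatus.OK


def status_for_write(created, processed_async=False):
    """Pick a status code for a create or update request."""
    if processed_async:
        return HTTPStatus.ACCEPTED          # change happens later
    return HTTPStatus.CREATED if created else HTTPStatus.OK


for status in (status_for_list(0, 0), status_for_list(10, 43),
               status_for_write(created=True),
               status_for_write(created=False, processed_async=True)):
    print(int(status), status.phrase)
```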
A really good interactive overview of the individual HTTP status codes including their meaning and potential application scenarios can be found on the pages of Talend (originated by Restlet).
Caching and security
In addition to the topics shown so far, there are many more aspects to consider when designing a “good” API. For example, a mature caching concept helps avoid outgoing client requests or, alternatively, serves them from their cache through upstream content delivery networks, proxies, or other servers. The means to this end is an on-board feature of HTTP. While in HTTP 1.0 the caching behavior can only be controlled roughly via the Expires header, in HTTP 1.1 the Cache-Control header allows considerably more options. In addition, it is possible to send a conditional GET to the server via the If-Modified-Since header (plus Last-Modified) or If-None-Match header (plus ETag). If there are no newer resources available on the server, it responds with the status code 304 (Not Modified) and an empty body. Otherwise, the request is handled like a normal GET.
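A conditional GET based on ETag and If-None-Match can be sketched in a few lines. The example assumes that the ETag is derived from a hash of the JSON representation and that a fixed `max-age` of 60 seconds is acceptable; both choices, as well as the function names, are illustrative.

```python
import hashlib
import json


def make_etag(resource):
    """Derive a quoted ETag from the resource's JSON representation."""
    body = json.dumps(resource, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def conditional_get(resource, if_none_match=None):
    """Return (status, headers, body) for a GET with optional If-None-Match."""
    etag = make_etag(resource)
    if if_none_match == etag:
        # The client's cached copy is still current: no body needed.
        return 304, {"ETag": etag}, None
    return 200, {"ETag": etag, "Cache-Control": "max-age=60"}, resource


order = {"id": 123, "status": "open"}
status, headers, body = conditional_get(order)
print(status, headers)

# The client repeats the request and sends the ETag it received.
print(conditional_get(order, if_none_match=headers["ETag"])[0])

# After a change on the server, the old ETag no longer matches.
order["status"] = "paid"
print(conditional_get(order, if_none_match=headers["ETag"])[0])
```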
The topic of security must also be considered in the REST environment. This is especially true when systems that were previously only used internally are opened up to the outside via an API. Since a RESTful service should be stateless by definition and there is no session on the server side, the challenge arises as to how the information required for authentication and authorization can be provided within the requests without the RESTful service having to submit a new authentication request to an authentication service each time. One possible, well-established solution is for the client to authenticate once at an authentication server and receive a time-limited signed token from there (Fig. 1).
This token is then sent to the RESTful service within the Authorization header, where it can be verified for validity. For this to work, the authentication server and the RESTful service must have exchanged a public key in advance. If a JSON Web Token (JWT) is used as the token, additional data can be stored within the token as key-value pairs in the form of so-called claims (Fig. 2). This is important in the environment of REST-based microservices, since general data, such as user roles, can be passed easily and efficiently through the microservices involved in a request.
Academic vs. Pragmatic REST
As previously shown, it is not that hard to design a good RESTful API. The important thing is not to try to reinvent the wheel, but to rely on standards and established patterns and best practices so as to meet the user’s expectations. It is also important that once design decisions have been made, they are used consistently within the API. But why do heated discussions about the correct use of REST still occur time and again?
Few developers of RESTful APIs have ever taken a deeper look at the dissertation by Roy Thomas Fielding from the year 2000. In chapter 5, Fielding describes a “new” approach for network-based software architectures called REST (Representational State Transfer). The interesting thing is that there is little in the dissertation about endpoints, HTTP methods, or even JSON or XML. Instead, the thesis is about an architectural approach characterized by terms like client-server, stateless, caching, uniform interfaces, layered system, and code on demand. What most of us understand by REST, namely the identification of a resource by means of a unique URL as well as its manipulation by a representation of the resource (JSON or XML) in interaction with self-explanatory messages (a.k.a. HTTP methods) plays a subordinate role here and is only referred to there as one of several points with “the four interface constraints”. An attentive reader will pause and ask: “But I only see three interface constraints so far”:
- Identification via URL
- Manipulation via representation (e.g. JSON)
- Self-describing messages (HTTP methods)
This is precisely the crux of the matter. Fielding lists “Hypermedia as the engine of application state” (HATEOAS) as the fourth constraint. The following two quotes from Fielding show the importance he personally attaches to the terms hypermedia and hypertext in the context of REST:
- “If the engine of application state (and hence the API) is not driven by hypertext, then it cannot be RESTful and cannot be a REST API”.
- “A REST API should be entered with no prior knowledge beyond the initial URI … From that point on, all application state transitions must be driven by the client selection of server-provided choices …”
Therefore, Fielding says that a RESTful API should be usable without prior knowledge beyond the initial URL. Calling this URL returns a list of links (as part of the payload or a link header) with possible and meaningful operations.
HATEOAS
Admittedly, this scenario probably sounds a bit abstract at first. But it is exactly what we encounter on the Internet every day, isn’t it? REST is nothing more than an abstraction of the behavior we are familiar with on the World Wide Web. We call up a URI and receive – in addition to one or the other piece of information – a list of links that show us which operations are possible or permitted at this exact moment.
We have already become familiar with this procedure in the navigation via pagination outlined above. If you request a section of a larger data set from an API, link references allow you to navigate within the data set. The API user only knows the semantics of the references in advance – prev corresponds to back, next corresponds to forward – but not the concrete URLs. There is even a standard since 2010, RFC 5988, which specifies link names and their semantics accordingly. So please do not reinvent the wheel, but use one of the link names listed there if possible.
When navigating through a large amount of data, it is still conceivable that the API can provide generic links for the next, previous, first, or last page. But how will this work for the normal resources of a RESTful API? Let’s revisit our example for placing, modifying, deleting, and querying orders. The initial call to the only URI we know, http://api.myshops.com/, would give us an HTTP response with the code 204 (No Content), as well as a link reference with the name edit and the URL http://api.myshops.com/orders. Based on the above RFC, we know that we can place an order with this URL (via HTTP POST).
It is crucial that we already know that we can handle orders via the store API and that we already know the necessary JSON format for it. However, we do not know the specific URLs to use and it doesn’t really matter. If we now place an order using the returned URL, we receive a confirmation (204 No Content or 200 OK) and link references that tell us what we can do with the order we just placed.
One of the link references is typically self with a reference to the created resource. In our case, this would be http://api.myshops.com/orders/123. With the help of this link, we can query the order status at any time. Another link reference called payment – also part of RFC 5988 – could signal how to pay for the open order: http://api.myshops.com/payments/123. Depending on the action currently being performed and the associated server-side changes to the application state, the links guide us step by step through the application or, in our case, through possible use cases of the ordering process. The Hypertext Application Language (HAL) has become established in recent years for exchanging necessary reference information. If the whole thing is still a bit too abstract for you, you can try the HATEOAS demo from Heroku, including a generic HAL browser.
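What such a hypermedia response could look like is sketched below in the HAL style, where links live under a `_links` key. The URLs follow the example in the text; the rule that only open orders carry a payment link is an assumption made for illustration.

```python
import json


def order_representation(order, base_url="http://api.myshops.com"):
    """Return the order plus the links that are meaningful in its current state."""
    links = {"self": {"href": f"{base_url}/orders/{order['id']}"}}
    if order["status"] == "open":
        # Assumed business rule: only open orders can still be paid.
        links["payment"] = {"href": f"{base_url}/payments/{order['id']}"}
    return {**order, "_links": links}


print(json.dumps(order_representation({"id": 123, "status": "open"}), indent=2))
print(json.dumps(order_representation({"id": 123, "status": "paid"}), indent=2))
```

The client never constructs the payment URL itself; it only checks whether a `payment` link is present and follows it.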
If it’s not REST…
Let’s be honest: Who goes as far as the above scenarios in their RESTful APIs? I think very few of us. But how should we classify our own API on the REST scale? Can I even use the word REST in connection with my API without risking a shitstorm from the REST purists?
Leonard Richardson’s Maturity Model can be used as a small litmus test for classifying one’s own web service. In his model, Richardson classifies web services into three different REST maturity levels depending on their support of URIs, HTTP and hypermedia (Fig. 3). Actually, there are four, since below the three legitimate levels there is still a level 0, which only sends XML or JSON via HTTP POST over the line and corresponds more to the classic RPC model. This level is also often referred to as RESTless, since it has virtually nothing in common with REST.
In Level 1, Richardson introduces the resources known from REST and different endpoints per resource. In an application, there are usually several URIs in this model. However, only one HTTP method (usually POST) is still used and the extended possibilities of the HTTP protocol, such as headers, return codes, caching, etc., are also dispensed with. Level 2 starts exactly where level 1 ends. Operations on the resources are assigned to the different HTTP methods (query the resource via GET, create via POST or PUT, modify via PUT, partially modify via PATCH and delete via DELETE). The HTTP status codes are used sensibly to signal what happened on the server with the resources – or not. It is only in level 3 – aka “the glory of REST” – where Richardson introduces hypermedia and a self-explanatory system.
Conclusion
O.k., so let’s agree that, according to pure doctrine, probably very few of us have climbed the holy REST Olympus. But is that really such a bad thing? The important thing, after all, is that the end result is a self-explanatory API that is understandable to the user, i.e., the developer, and brings with it an appropriate level of stability. Vinay Sahni, founder of Enchant, wrote about this in his blog: “An API is a developer’s UI – just like any UI, it’s important to ensure the user’s experience is thought out carefully!” It’s hard to put it any more aptly than that.
But am I allowed to call my API RESTful at all, if I have only reached a level below 3 according to Richardson? Personally, I am more pragmatic than religiously inclined. If my API meets the conditions of Level 2, then I would definitely call it RESTful, but not if it meets Level 1.
But what if I encounter a self-confessed Level 3 representative who accepts no truth besides his own? Or to put it another way, should I insist on having designed a RESTful API come hell or high water in such a situation? Maybe we just lack a term for the formula RESTful minus HATEOAS? How about RESTalike?