RESTful Content

One of the stories from the early days of the web is how search engines wiped out entire websites. When dynamic web sites were still a new concept, developers didn't appreciate the difference between a GET and POST request. As a result, they created pages- accessed with the GET method- that would delete pages. When search engines started crawling these sites, they could wipe out all the content.

If these web developers had followed the HTTP spec properly, this would not have happened. A GET request is supposed to cause no side effects (you know, like wiping out a site). Recently, there has been a move in web development to properly embrace Representational State Transfer, also known as REST. This chapter describes the RESTful features in Yesod and how you can use them to create more robust web applications.

Request methods

In many web frameworks, you write one handler function per resource. In Yesod, the default is to have a separate handler function for each request method. The two most common request methods you will deal with in creating web sites are GET and POST. These are the most well-supported methods in HTML, since they are the only ones supported by web forms. However, when creating RESTful APIs, the other methods are very useful.

Technically speaking, you can create whichever request methods you like, but it is strongly recommended to stick to the ones spelled out in the HTTP spec. The most common of these are:

GET

Read-only requests. Assuming no other changes occur on the server, calling a GET request multiple times should result in the same response, barring such things as "current time" or randomly assigned results.

POST

A general mutating request. A POST request should never be submitted twice by the user. A common example of this would be to transfer funds from one bank account to another.

PUT

Create a new resource on the server, or replace an existing one. This method is safe to be called multiple times.

DELETE

Just like it sounds: wipe out a resource on the server. Calling multiple times should be OK.

To a certain extent, this fits in very well with Haskell philosophy: a GET request is similar to a pure function, which cannot have side effects. In practice, your GET functions will probably perform IO, such as reading information from a database, logging user actions, and so on.

See the routing and handlers chapter chapter for more information on the syntax of defining handler functions for each request method.

Representations

Suppose we have a Haskell datatype and value:

data Person = Person { name :: String, age :: Int }
michael = Person "Michael" 25

We could represent that data as HTML:

<table>
    <tr>
        <th>Name</th>
        <td>Michael</td>
    </tr>
    <tr>
        <th>Age</th>
        <td>25</td>
    </tr>
</table>

or we could represent it as JSON:

{"name":"Michael","age":25}

or as XML:

<person>
    <name>Michael</name>
    <age>25</age>
</person>

Often times, web applications will use a different URL to get each of these representations; perhaps /person/michael.html, /person/michael.json, etc. Yesod follows the RESTful principle of a single URL for each resource. So in Yesod, all of these would be accessed from /person/michael.

Then the question becomes how do we determine which representation to serve. The answer is the HTTP Accept header: it gives a prioritized list of content types the client is expecting. Yesod will automatically determine which representation to serve based upon this header.

Let's make that last sentence a bit more concrete with some code:

type ChooseRep = [ContentType] -> IO (ContentType, Content)
class HasReps a where
    chooseRep :: a -> ChooseRep

The chooseRep function takes two arguments: the value we are getting representations for, and a list of content types that the client will accept. We determine this by reading the Accept request header. chooseRep returns a tuple containing the content type of our response and the actual content.

This typeclass is the core of Yesod's RESTful approach to representations. Every handler function must return an instance of HasReps. When Yesod generates the dispatch function, it automatically applies chooseRep to each handler, essentially giving all functions the type Handler ChooseRep. After running the Handler and obtaining the ChooseRep result, it is applied to the list of content types parsed from the Accept header.

Yesod provides a number of instances of HasReps out of the box. When we use defaultLayout, for example, the return type is RepHtml, which looks like:

newtype RepHtml = RepHtml Content
instance HasReps RepHtml where
    chooseRep (RepHtml content) _ = return ("text/html", content)

Notice that we ignore entirely the list of expected content types. A number of the built in representations (RepHtml, RepPlain, RepJson, RepXml) in fact only support a single representation, and therefore what the client requests in the Accept header is irrelevant.

RepHtmlJson

An example to the contrary is RepHtmlJson, which provides either an HTML or JSON representation. This instance helps greatly in programming AJAX applications that degrade nicely. Here is an example that returns either HTML or JSON data, depending on what the client wants.

{-# LANGUAGE QuasiQuotes, TypeFamilies, OverloadedStrings #-}
{-# LANGUAGE MultiParamTypeClasses, TemplateHaskell #-}
import Yesod
data R = R
mkYesod "R" [parseRoutes|
/ RootR GET
/#String NameR GET
|]
instance Yesod R

getRootR = defaultLayout $ do
    setTitle "Homepage"
    addScriptRemote "http://ajax.googleapis.com/ajax/libs/jquery/1.4/jquery.min.js"
    toWidget [julius|
$(function(){
    $("#ajax a").click(function(){
        jQuery.getJSON($(this).attr("href"), function(o){
            $("div").text(o.name);
        });
        return false;
    });
});
|]
    let names = words "Larry Moe Curly"
    [whamlet|
<h2>AJAX Version
<div #results>
    AJAX results will be placed here when you click #
    the names below.
<ul #ajax>
    $forall name <- names
        <li>
            <a href=@{NameR name}>#{name}

<h2>HTML Version
<p>
    Clicking the names below will redirect the page #
    to an HTML version.
<ul #html>
    $forall name <- names
        <li>
            <a href=@{NameR name}>#{name}

|]

getNameR name = do
    let widget = do
            setTitle $ toHtml name
            [whamlet|Looks like you have Javascript off. Name: #{name}|]
    let json = object ["name" .= name]
    defaultLayoutJson widget json

main = warpDebug 4000 R

Our getRootR handler creates a page with three links and some Javascript which intercept clicks on the links and performs asynchronous requests. If the user has Javascript enabled, clicking on the link will cause a request to be sent with an Accept header of application/json. In that case, getNameR will return the JSON representation instead.

If the user disables Javascript, clicking on the link will send the user to the appropriate URL. A web browser places priority on an HTML representation of the data, and therefore the page defined by the widget will be returned.

We can of course extend this to work with XML, Atom feeds, or even binary representations of the data. A fun exercise could be writing a web application that serves data simply using the default Show instances of datatypes, and then writing a web client that parses the results using the default Read instances.

News Feeds

A great, practical example of multiple representations if the yesod-newsfeed package. There are two major formats for news feeds on the web: RSS and Atom. They contain almost exactly the same information, but are just packaged up differently.

The yesod-newsfeed package defines a Feed datatype which contains information like title, description, and last updated time. It then provides two separate sets of functions for displaying this data: one for RSS, one for Atom. They each define their own representation datatypes:

newtype RepAtom = RepAtom Content
instance HasReps RepAtom where
    chooseRep (RepAtom c) _ = return (typeAtom, c)
newtype RepRss = RepRss Content
instance HasReps RepRss where
    chooseRep (RepRss c) _ = return (typeRss, c)

But there's a third module which defines another datatype:

data RepAtomRss = RepAtomRss RepAtom RepRss
instance HasReps RepAtomRss where
    chooseRep (RepAtomRss (RepAtom a) (RepRss r)) = chooseRep
        [ (typeAtom, a)
        , (typeRss, r)
        ]

This datatype will automatically serve whichever representation the client prefers, defaulting to Atom. If a client connects that only understands RSS, assuming it provides the correct HTTP headers, Yesod will provide RSS output.

Other request headers

There are a great deal of other request headers available. Some of them only affect the transfer of data between the server and client, and should not affect the application at all. For example, Accept-Encoding informs the server which compression schemes the client understands, and Host informs the server which virtual host to serve up.

Other headers do affect the application, but are automatically read by Yesod. For example, the Accept-Language header specifies which human language (English, Spanish, German, Swiss-German) the client prefers. See the i18n chapter for details on how this header is used.

Stateless

I've saved this section for the last, not because it is less important, but rather because there are no specific features in Yesod to enforce this.

HTTP is a stateless protocol: each request is to be seen as the beginning of a conversation. This means, for instance, it doesn't matter to the server if you requested five pages previously, it will treat your sixth request as if it's your first one.

On the other hand, some features on websites won't work without some kind of state. For example, how can you implement a shopping cart without saving information about items in between requests?

The solution to this is cookies, and built on top of this, sessions. We have a whole section addressing the sessions features in Yesod. However, I cannot stress enough that this should be used sparingly.

Let me give you an example. There's a popular bug tracking system that I deal with on a daily basis which horribly abuses sessions. There's a little drop-down on every page to select the current project. Seems harmless, right? What that dropdown does is set the current project in your session.

The result of all this is that clicking on the "view issues" link is entirely dependent on the last project you selected. There's no way to create a bookmark to your "Yesod" issues and a separate link for your "Hamlet" issues.

The proper RESTful approach to this is to have one resource for all of the Yesod issues and a separate one for all the Hamlet issues. In Yesod, this is easily done with a route definition like:

/ ProjectsR GET
/projects/#ProjectID ProjectIssuesR GET
/issues/#IssueID IssueR GET

Be nice to your users: proper stateless architecture means that basic features like bookmarks, permalinks and the back/forward button will always work.

Summary

Yesod adheres to the following tenets of REST:

  • Use the correct request method.

  • Each resource should have precisely one URL.

  • Allow multiple representations of data on the same URL.

  • Inspect request headers to determine extra information about what the client wants.

This makes it easy to use Yesod not just for building websites, but for building APIs. In fact, using techniques such as RepHtmlJson, you can serve both a user-friendly, HTML page and a machine-friendly, JSON page from the same URL.

Note: You are looking at version 1.1 of the book, which is three versions behind

Chapters