Persistent

Forms deal with the boundary between the user and the application. Another boundary we need to deal with is between the application and the storage layer. Whether it be a SQL database, a YAML file, or a binary blob, odds are your storage layer does not natively understand your applications data types, and you’ll need to perform some marshaling. Persistent is Yesod’s answer to data storage- a type-safe, universal data store interface for Haskell.

Haskell has many different database bindings available. However, most of these have little knowledge of a schema and therefore do not provide useful static guarantees. They also force database-dependent APIs and data types on the programmer.

Some Haskellers have attempted a more revolutionary route: creating Haskell specific data stores that allow one to easily store any strongly typed Haskell data. These options are great for certain use cases, but they constrain one to the storage techniques provided by the library and do not interface well with other languages.

In contrast, Persistent allows us to choose among existing databases that are highly tuned for different data storage use cases, interoperate with other programming languages, and to use a safe and productive query interface, while still keeping the type safety of Haskell datatypes.

Persistent follows the guiding principles of type safety and concise, declarative syntax. Some other nice features are:

  • Database-agnostic. There is first class support for PostgreSQL, SQLite, MySQL and MongoDB, with experimental Redis support.

  • Convenient data modeling. Persistent lets you model relationships and use them in type-safe ways. The default type-safe persistent API does not support joins, allowing support for a wider number of storage layers. Joins and other SQL specific functionality can be achieved through using a raw SQL layer (with very little type safety). An additional library, Esqueleto, builds on top of the Persistent data model, adding type-safe joins and SQL functionality.

  • Automatically perform database migrations

Persistent works well with Yesod, but it is quite usable on its own as a standalone library. Most of this chapter will address Persistent on its own.

Synopsis

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
import           Control.Monad.IO.Class  (liftIO)
import           Database.Persist
import           Database.Persist.Sqlite
import           Database.Persist.TH

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    name String
    age Int Maybe
    deriving Show
BlogPost
    title String
    authorId PersonId
    deriving Show
|]

main :: IO ()
main = runSqlite ":memory:" $ do
    runMigration migrateAll

    johnId <- insert $ Person "John Doe" $ Just 35
    janeId <- insert $ Person "Jane Doe" Nothing

    insert $ BlogPost "My fr1st p0st" johnId
    insert $ BlogPost "One more for good measure" johnId

    oneJohnPost <- selectList [BlogPostAuthorId ==. johnId] [LimitTo 1]
    liftIO $ print (oneJohnPost :: [Entity BlogPost])

    john <- get johnId
    liftIO $ print (john :: Maybe Person)

    delete janeId
    deleteWhere [BlogPostAuthorId ==. johnId]

Solving the boundary issue

Suppose you are storing information on people in a SQL database. Your table might look something like:

CREATE TABLE person(id SERIAL PRIMARY KEY, name VARCHAR NOT NULL, age INTEGER)

And if you are using a database like PostgreSQL, you can be guaranteed that the database will never store some arbitrary text in your age field. (The same cannot be said of SQLite, but let’s forget about that for now.) To mirror this database table, you would likely create a Haskell datatype that looks something like:

data Person = Person
    { personName :: Text
    , personAge  :: Int
    }

It looks like everything is type safe: the database schema matches our Haskell datatypes, the database ensures that invalid data can never make it into our data store, and everything is generally awesome. Well, until:

  • You want to pull data from the database, and the database layer gives you the data in an untyped format.

  • You want to find everyone older than 32, and you accidentally write "thirtytwo" in your SQL statement. Guess what: that will compile just fine, and you won’t find out you have a problem until runtime.

  • You decide you want to find the first 10 people alphabetically. No problem… until you make a typo in your SQL. Once again, you don’t find out until runtime.

In dynamic languages, the answer to these issues is unit testing. For everything that can go wrong, make sure you write a test case. But as I am sure you are aware by now, that doesn’t jive well with the Yesod approach to things. We like to take advantage of Haskell’s strong typing to save us wherever possible, and data storage is no exception.

So the question remains: how can we use Haskell’s type system to save the day?

Types

Like routing, there is nothing intrinsically difficult about type-safe data access. It just requires a lot of monotonous, error prone, boiler plate code. As usual, this means we can use the type system to keep us honest. And to avoid some of the drudgery, we’ll use a sprinkling of Template Haskell.

PersistValue is the basic building block of Persistent. It is a sum type that can represent data that gets sent to and from a database. Its definition is:

data PersistValue = PersistText Text
                  | PersistByteString ByteString
                  | PersistInt64 Int64
                  | PersistDouble Double
                  | PersistRational Rational
                  | PersistBool Bool
                  | PersistDay Day
                  | PersistTimeOfDay TimeOfDay
                  | PersistUTCTime UTCTime
                  | PersistNull
                  | PersistList [PersistValue]
                  | PersistMap [(Text, PersistValue)]
                  | PersistObjectId ByteString -- ^ Intended especially for MongoDB backend
                  | PersistDbSpecific ByteString -- ^ Using 'PersistDbSpecific' allows you to use types specific to a particular backend

Each Persistent backend needs to know how to translate the relevant values into something the database can understand. However, it would be awkward to have to express all of our data simply in terms of these basic types. The next layer is the PersistField typeclass, which defines how an arbitrary Haskell datatype can be marshaled to and from a PersistValue. A PersistField correlates to a column in a SQL database. In our person example above, name and age would be our PersistFields.

To tie up the user side of the code, our last typeclass is PersistEntity. An instance of PersistEntity correlates with a table in a SQL database. This typeclass defines a number of functions and some associated types. To review, we have the following correspondence between Persistent and SQL:

SQL Persistent

Datatypes (VARCHAR, INTEGER, etc)

PersistValue

Column

PersistField

Table

PersistEntity

Code Generation

In order to ensure that the PersistEntity instances match up properly with your Haskell datatypes, Persistent takes responsibility for both. This is also good from a DRY (Don’t Repeat Yourself) perspective: you only need to define your entities once. Let’s see a quick example:

{-# LANGUAGE QuasiQuotes, TypeFamilies, GeneralizedNewtypeDeriving, TemplateHaskell, OverloadedStrings, GADTs #-}
import Database.Persist
import Database.Persist.TH
import Database.Persist.Sqlite
import Control.Monad.IO.Class (liftIO)

mkPersist sqlSettings [persistLowerCase|
Person
    name String
    age Int
    deriving Show
|]

We use a combination of Template Haskell and Quasi-Quotation (like when defining routes): persistLowerCase is a quasi-quoter which converts a whitespace-sensitive syntax into a list of entity definitions. "Lower case" refers to the format of the generated table names. In this scheme, an entity like SomeTable would become the SQL table some_table. You can also declare your entities in a separate file using persistFileWith. mkPersist takes that list of entities and declares:

  • One Haskell datatype for each entity.

  • A PersistEntity instance for each datatype defined.

The example above generates code that looks like the following:

{-# LANGUAGE TypeFamilies, GeneralizedNewtypeDeriving, OverloadedStrings, GADTs #-}
import Database.Persist
import Database.Persist.Sqlite
import Control.Monad.IO.Class (liftIO)
import Control.Applicative
import Data.Aeson (ToJSON, FromJSON)

data Person = Person
    { personName :: !String
    , personAge :: !Int
    }
  deriving (Show, Read, Eq)

type PersonId = Key Person

instance PersistEntity Person where
    newtype Key Person = PersonKey (BackendKey SqlBackend)
        deriving (ToJSON, FromJSON, PersistField, Show, Eq, Read, Ord)
    -- A Generalized Algebraic Datatype (GADT).
    -- This gives us a type-safe approach to matching fields with
    -- their datatypes.
    data EntityField Person typ where
        PersonId   :: EntityField Person PersonId
        PersonName :: EntityField Person String
        PersonAge  :: EntityField Person Int

    data Unique Person
    type PersistEntityBackend Person = SqlBackend

    toPersistFields (Person name age) =
        [ SomePersistField name
        , SomePersistField age
        ]

    fromPersistValues [nameValue, ageValue] = Person
        <$> fromPersistValue nameValue
        <*> fromPersistValue ageValue
    fromPersistValues _ = Left "Invalid fromPersistValues input"

    -- Information on each field, used internally to generate SQL statements
    persistFieldDef PersonId = FieldDef
        (HaskellName "Id")
        (DBName "id")
        (FTTypeCon Nothing "PersonId")
        SqlInt64
        []
        True
        NoReference
    persistFieldDef PersonName = FieldDef
        (HaskellName "name")
        (DBName "name")
        (FTTypeCon Nothing "String")
        SqlString
        []
        True
        NoReference
    persistFieldDef PersonAge = FieldDef
        (HaskellName "age")
        (DBName "age")
        (FTTypeCon Nothing "Int")
        SqlInt64
        []
        True
        NoReference

As you might expect, our Person datatype closely matches the definition we gave in the original Template Haskell version. We also have a Generalized Algebraic Datatype (GADT) which gives a separate constructor for each field. This GADT encodes both the type of the entity and the type of the field. We use its constructors throughout Persistent, such as to ensure that when we apply a filter, the types of the filtering value match the field. There’s another associated newtype for the database primary key of this entity.

We can use the generated Person type like any other Haskell type, and then pass it off to other Persistent functions.

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
import           Control.Monad.IO.Class  (liftIO)
import           Database.Persist
import           Database.Persist.Sqlite
import           Database.Persist.TH

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    name String
    age Int Maybe
    deriving Show
|]

main :: IO ()
main = runSqlite ":memory:" $ do
    michaelId <- insert $ Person "Michael" $ Just 26
    michael <- get michaelId
    liftIO $ print michael

We start off with some standard database connection code. In this case, we used the single-connection functions. Persistent also comes built in with connection pool functions, which we will generally want to use in production.

In this example, we have seen two functions: insert creates a new record in the database and returns its ID. Like everything else in Persistent, IDs are type safe. We’ll get into more details of how these IDs work later. So when you call insert $ Person "Michael" 26, it gives you a value back of type PersonId.

The next function we see is get, which attempts to load a value from the database using an Id. In Persistent, you never need to worry that you are using the key from the wrong table: trying to load up a different entity (like House) using a PersonId will never compile.

PersistStore

One last detail is left unexplained from the previous example: what exactly does runSqlite do, and what is that monad that our database actions are running in?

All database actions require a parameter which is an instance of PersistStore. As its name implies, every data store (PostgreSQL, SQLite, MongoDB) has an instance of PersistStore. This is where all the translations from PersistValue to database-specific values occur, where SQL query generation happens, and so on.

runSqlite creates a single connection to a database using its supplied connection string. For our test cases, we will use :memory:, which uses an in-memory database. All of the SQL backends share the same instance of PersistStore: SqlBackend. runSqlite then provides the SqlBackend value as an environment parameter to the action via runReaderT.

One important thing to note is that everything which occurs inside a single call to runSqlite runs in a single transaction. This has two important implications:

  • For many databases, committing a transaction can be a costly activity. By putting multiple steps into a single transaction, you can speed up code dramatically.

  • If an exception is thrown anywhere inside a single call to runSqlite, all actions will be rolled back (assuming your backend has rollback support).

Migrations

I’m sorry to tell you, but so far I have lied to you a bit: the example from the previous section does not actually work. If you try to run it, you will get an error message about a missing table.

For SQL databases, one of the major pains can be managing schema changes. Instead of leaving this to the user, Persistent steps in to help, but you have to ask it to help. Let’s see what this looks like:

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
import Database.Persist
import Database.Persist.TH
import Database.Persist.Sqlite
import Control.Monad.IO.Class (liftIO)

share [mkPersist sqlSettings, mkSave "entityDefs"] [persistLowerCase|
Person
    name String
    age Int
    deriving Show
|]

main :: IO ()
main = runSqlite ":memory:" $ do
    -- this line added: that's it!
    runMigration $ migrate entityDefs $ entityDef (Nothing :: Maybe Person)
    michaelId <- insert $ Person "Michael" 26
    michael <- get michaelId
    liftIO $ print michael

With this one little code change, Persistent will automatically create your Person table for you. This split between runMigration and migrate allows you to migrate multiple tables simultaneously.

This works when dealing with just a few entities, but can quickly get tiresome once we are dealing with a dozen entities. Instead of repeating yourself, Persistent provides a helper function, mkMigrate:

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    name String
    age Int
    deriving Show
Car
    color String
    make String
    model String
    deriving Show
|]

main :: IO ()
main = runSqlite ":memory:" $ do runMigration migrateAll

mkMigrate is a Template Haskell function which creates a new function that will automatically call migrate on all entities defined in the persist block. The share function is just a little helper that passes the information from the persist block to each Template Haskell function and concatenates the results.

Persistent has very conservative rules about what it will do during a migration. It starts by loading up table information from the database, complete with all defined SQL datatypes. It then compares that against the entity definition given in the code. For the following cases, it will automatically alter the schema:

  • The datatype of a field changed. However, the database may object to this modification if the data cannot be translated.

  • A field was added. However, if the field is not null, no default value is supplied (we’ll discuss defaults later) and there is already data in the database, the database will not allow this to happen.

  • A field is converted from not null to null. In the opposite case, Persistent will attempt the conversion, contingent upon the database’s approval.

  • A brand new entity is added.

However, there are some cases that Persistent will not handle:

  • Field or entity renames: Persistent has no way of knowing that "name" has now been renamed to "fullName": all it sees is an old field called name and a new field called fullName.

  • Field removals: since this can result in data loss, Persistent by default will refuse to perform the action (you can force the issue by using runMigrationUnsafe instead of runMigration, though it is not recommended).

runMigration will print out the migrations it is running on stderr (you can bypass this by using runMigrationSilent). Whenever possible, it uses ALTER TABLE calls. However, in SQLite, ALTER TABLE has very limited abilities, and therefore Persistent must resort to copying the data from one table to another.

Finally, if instead of performing a migration, you want Persistent to give you hints about what migrations are necessary, use the printMigration function. This function will print out the migrations which runMigration would perform for you. This may be useful for performing migrations that Persistent is not capable of, for adding arbitrary SQL to a migration, or just to log what migrations occurred.

Uniqueness

In addition to declaring fields within an entity, you can also declare uniqueness constraints. A typical example would be requiring that a username be unique.

User
    username Text
    UniqueUsername username

While each field name must begin with a lowercase letter, the uniqueness constraints must begin with an uppercase letter, since it will be represented in Haskell as a data constructor.

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Data.Time
import Control.Monad.IO.Class (liftIO)

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    firstName String
    lastName String
    age Int
    PersonName firstName lastName
    deriving Show
|]

main :: IO ()
main = runSqlite ":memory:" $ do
    runMigration migrateAll
    insert $ Person "Michael" "Snoyman" 26
    michael <- getBy $ PersonName "Michael" "Snoyman"
    liftIO $ print michael

To declare a unique combination of fields, we add an extra line to our declaration. Persistent knows that it is defining a unique constructor, since the line begins with a capital letter. Each following word must be a field in this entity.

The main restriction on uniqueness is that it can only be applied non-null fields. The reason for this is that the SQL standard is ambiguous on how uniqueness should be applied to NULL (e.g., is NULL=NULL true or false?). Besides that ambiguity, most SQL engines in fact implement rules which would be contrary to what the Haskell datatypes anticipate (e.g., PostgreSQL says that NULL=NULL is false, whereas Haskell says Nothing == Nothing is True).

In addition to providing nice guarantees at the database level about consistency of your data, uniqueness constraints can also be used to perform some specific queries within your Haskell code, like the getBy demonstrated above. This happens via the Unique associated type. In the example above, we end up with a new constructor:

PersonName :: String -> String -> Unique Person

Queries

Depending on what your goal is, there are different approaches to querying the database. Some commands query based on a numeric ID, while others will filter. Queries also differ in the number of results they return: some lookups should return no more than one result (if the lookup key is unique) while others can return many results.

Persistent therefore provides a few different query functions. As usual, we try to encode as many invariants in the types as possible. For example, a query that can return only 0 or 1 results will use a Maybe wrapper, whereas a query returning many results will return a list.

Fetching by ID

The simplest query you can perform in Persistent is getting based on an ID. Since this value may or may not exist, its return type is wrapped in a Maybe.

personId <- insert $ Person "Michael" "Snoyman" 26
maybePerson <- get personId
case maybePerson of
    Nothing -> liftIO $ putStrLn "Just kidding, not really there"
    Just person -> liftIO $ print person

This can be very useful for sites that provide URLs like /person/5. However, in such a case, we don’t usually care about the Maybe wrapper, and just want the value, returning a 404 message if it is not found. Fortunately, the get404 (provided by the yesod-persistent package) function helps us out here. We’ll go into more details when we see integration with Yesod.

Fetching by unique constraint

getBy is almost identical to get, except:

  1. it takes a uniqueness constraint; that is, instead of an ID it takes a Unique value.

  2. it returns an Entity instead of a value. An Entity is a combination of database ID and value.

personId <- insert $ Person "Michael" "Snoyman" 26
maybePerson <- getBy $ UniqueName "Michael" "Snoyman"
case maybePerson of
    Nothing -> liftIO $ putStrLn "Just kidding, not really there"
    Just (Entity personId person) -> liftIO $ print person

Like get404, there is also a getBy404 function.

Select functions

Most likely, you’re going to want more powerful queries. You’ll want to find everyone over a certain age; all cars available in blue; all users without a registered email address. For this, you need one of the select functions.

All the select functions use a similar interface, with slightly different outputs:

Function Returns

selectSource

A Source containing all the IDs and values from the database. This allows you to write streaming code.

NOTE: A Source is a stream of data, and is part of the conduit package. I recommend reading the School of Haskell conduit tutorial to get started.

selectList

A list containing all the IDs and values from the database. All records will be loaded into memory.

selectFirst

Takes just the first ID and value from the database, if available

selectKeys

Returns only the keys, without the values, as a Source.

selectList is the most commonly used, so we will cover it specifically. Understanding the others should be trivial after that.

selectList takes two arguments: a list of Filters, and a list of SelectOpts. The former is what limits your results based on characteristics; it allows for equals, less than, is member of, and such. SelectOpts provides for three different features: sorting, limiting output to a certain number of rows, and offsetting results by a certain number of rows.

Let’s jump straight into an example of filtering, and then analyze it.

people <- selectList [PersonAge >. 25, PersonAge <=. 30] []
liftIO $ print people

As simple as that example is, we really need to cover three points:

  1. PersonAge is a constructor for an associated phantom type. That might sound scary, but what’s important is that it uniquely identifies the "age" column of the "person" table, and that it knows that the age field is an Int. (That’s the phantom part.)

  2. We have a bunch of Persistent filtering operators. They’re all pretty straight-forward: just tack a period to the end of what you’d expect. There are three gotchas here, I’ll explain below.

  3. The list of filters is ANDed together, so that our constraint means "age is greater than 25 AND age is less than or equal to 30". We’ll describe ORing later.

The one operator that’s surprisingly named is "not equals." We use !=., since /=. is used for updates (for "divide-and-set", described later). Don’t worry: if you use the wrong one, the compiler will catch you. The other two surprising operators are the "is member" and "is not member". They are, respectively, ←. and /←. (both end with a period).

And regarding ORs, we use the ||. operator. For example:

people <- selectList
    (       [PersonAge >. 25, PersonAge <=. 30]
        ||. [PersonFirstName /<-. ["Adam", "Bonny"]]
        ||. ([PersonAge ==. 50] ||. [PersonAge ==. 60])
    )
    []
liftIO $ print people

This (completely nonsensical) example means: find people who are 26-30, inclusive, OR whose names are neither Adam or Bonny, OR whose age is either 50 or 60.

SelectOpt

All of our selectList calls have included an empty list as the second parameter. That specifies no options, meaning: sort however the database wants, return all results, and don’t skip any results. A SelectOpt has four constructors that can be used to change all that.

Asc

Sort by the given column in ascending order. This uses the same phantom type as filtering, such as PersonAge.

Desc

Same as Asc, in descending order.

LimitTo

Takes an Int argument. Only return up to the specified number of results.

OffsetBy

Takes an Int argument. Skip the specified number of results.

The following code defines a function that will break down results into pages. It returns all people aged 18 and over, and then sorts them by age (oldest person first). For people with the same age, they are sorted alphabetically by last name, then first name.

resultsForPage pageNumber = do
    let resultsPerPage = 10
    selectList
        [ PersonAge >=. 18
        ]
        [ Desc PersonAge
        , Asc PersonLastName
        , Asc PersonFirstName
        , LimitTo resultsPerPage
        , OffsetBy $ (pageNumber - 1) * resultsPerPage
        ]

Manipulation

Querying is only half the battle. We also need to be able to add data to and modify existing data in the database.

Insert

It’s all well and good to be able to play with data in the database, but how does it get there in the first place? The answer is the insert function. You just give it a value, and it gives back an ID.

At this point, it makes sense to explain a bit of the philosophy behind Persistent. In many other ORM solutions, the datatypes used to hold data are opaque: you need to go through their defined interfaces to get at and modify the data. That’s not the case with Persistent: we’re using plain old Algebraic Data Types for the whole thing. This means you still get all the great benefits of pattern matching, currying and everything else you’re used to.

However, there are a few things we can’t do. For one, there’s no way to automatically update values in the database every time the record is updated in Haskell. Of course, with Haskell’s normal stance of purity and immutability, this wouldn’t make much sense anyway, so I don’t shed any tears over it.

However, there is one issue that newcomers are often bothered by: why are IDs and values completely separate? It seems like it would be very logical to embed the ID inside the value. In other words, instead of having:

data Person = Person { name :: String }

have

data Person = Person { personId :: PersonId, name :: String }

Well, there’s one problem with this right off the bat: how do we do an insert? If a Person needs to have an ID, and we get the ID by inserting, and an insert needs a Person, we have an impossible loop. We could solve this with undefined, but that’s just asking for trouble.

OK, you say, let’s try something a bit safer:

data Person = Person { personId :: Maybe PersonId, name :: String }

I definitely prefer insert $ Person Nothing "Michael" to insert $ Person undefined "Michael". And now our types will be much simpler, right? For example, selectList could return a simple [Person] instead of that ugly [Entity SqlPersist Person].

The problem is that the "ugliness" is incredibly useful. Having Entity Person makes it obvious, at the type level, that we’re dealing with a value that exists in the database. Let’s say we want to create a link to another page that requires the PersonId (not an uncommon occurrence as we’ll discuss later). The Entity Person form gives us unambiguous access to that information; embedding PersonId within Person with a Maybe wrapper means an extra runtime check for Just, instead of a more error-proof compile time check.

Finally, there’s a semantic mismatch with embedding the ID within the value. The Person is the value. Two people are identical (in the context of Haskell) if all their fields are the same. By embedding the ID in the value, we’re no longer talking about a person, but about a row in the database. Equality is no longer really equality, it’s identity: is this the same person, as opposed to an equivalent person.

In other words, there are some annoyances with having the ID separated out, but overall, it’s the right approach, which in the grand scheme of things leads to better, less buggy code.

Update

Now, in the context of that discussion, let’s think about updating. The simplest way to update is:

let michael = Person "Michael" 26
    michaelAfterBirthday = michael { personAge = 27 }

But that’s not actually updating anything, it’s just creating a new Person value based on the old one. When we say update, we’re not talking about modifications to the values in Haskell. (We better not be of course, since data in Haskell is immutable.)

Instead, we’re looking at ways of modifying rows in a table. And the simplest way to do that is with the update function.

personId <- insert $ Person "Michael" "Snoyman" 26
update personId [PersonAge =. 27]

update takes two arguments: an ID, and a list of Updates. The simplest update is assignment, but it’s not always the best. What if you want to increase someone’s age by 1, but you don’t have their current age? Persistent has you covered:

haveBirthday personId = update personId [PersonAge +=. 1]

And as you might expect, we have all the basic mathematical operators: +=., -=., \*=., and /=. (full stop). These can be convenient for updating a single record, but they are also essential for proper ACID guarantees. Imagine the alternative: pull out a Person, increment the age, and update the new value. If you have two threads/processes working on this database at the same time, you’re in for a world of hurt (hint: race conditions).

Sometimes you’ll want to update many rows at once (give all your employees a 5% pay increase, for example). updateWhere takes two parameters: a list of filters, and a list of updates to apply.

updateWhere [PersonFirstName ==. "Michael"] [PersonAge *=. 2] -- it's been a long day

Occasionally, you’ll just want to completely replace the value in a database with a different value. For that, you use (surprise) the replace function.

personId <- insert $ Person "Michael" "Snoyman" 26
replace personId $ Person "John" "Doe" 20

Delete

As much as it pains us, sometimes we must part with our data. To do so, we have three functions:

delete

Delete based on an ID

deleteBy

Delete based on a unique constraint

deleteWhere

Delete based on a set of filters

personId <- insert $ Person "Michael" "Snoyman" 26
delete personId
deleteBy $ UniqueName "Michael" "Snoyman"
deleteWhere [PersonFirstName ==. "Michael"]

We can even use deleteWhere to wipe out all the records in a table, we just need to give some hints to GHC as to what table we’re interested in:

    deleteWhere ([] :: [Filter Person])

Attributes

So far, we have seen a basic syntax for our persistLowerCase blocks: a line for the name of our entities, and then an indented line for each field with two words: the name of the field and the datatype of the field. Persistent handles more than this: you can assign an arbitrary list of attributes after the first two words on a line.

Suppose we want to have a Person entity with an (optional) age and the timestamp of when he/she was added to the system. For entities already in the database, we want to just use the current date-time for that timestamp.

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Data.Time
import Control.Monad.IO.Class

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    name String
    age Int Maybe
    created UTCTime default=CURRENT_TIME
    deriving Show
|]

main :: IO ()
main = runSqlite ":memory:" $ do
    time <- liftIO getCurrentTime
    runMigration migrateAll
    insert $ Person "Michael" (Just 26) time
    insert $ Person "Greg" Nothing time
    return ()

Maybe is a built in, single word attribute. It makes the field optional. In Haskell, this means it is wrapped in a Maybe. In SQL, it makes the column nullable.

The default attribute is backend specific, and uses whatever syntax is understood by the database. In this case, it uses the database’s built-in CURRENT_TIME function. Suppose that we now want to add a field for a person’s favorite programming language:

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Data.Time

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    name String
    age Int Maybe
    created UTCTime default=CURRENT_TIME
    language String default='Haskell'
    deriving Show
|]

main :: IO ()
main = runSqlite ":memory:" $ do
    runMigration migrateAll

We need to surround the string with single quotes so that the database can properly interpret it. Finally, Persistent can use double quotes for containing white space, so if we want to set someone’s default home country to be El Salvador:

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Data.Time

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    name String
    age Int Maybe
    created UTCTime default=CURRENT_TIME
    language String default='Haskell'
    country String "default='El Salvador'"
    deriving Show
|]

main :: IO ()
main = runSqlite ":memory:" $ do
    runMigration migrateAll

One last trick you can do with attributes is to specify the names to be used for the SQL tables and columns. This can be convenient when interacting with existing databases.

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person sql=the-person-table id=numeric_id
    firstName String sql=first_name
    lastName String sql=fldLastName
    age Int "sql=The Age of the Person"
    UniqueName firstName lastName
    deriving Show
|]

There are a number of other features to the entity definition syntax. An up-to-date list is maintained on the Yesod wiki.

Relations

Persistent allows references between your data types in a manner that is consistent with supporting non-SQL databases. We do this by embedding an ID in the related entity. So if a person has many cars:

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Control.Monad.IO.Class (liftIO)
import Data.Time

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    name String
    deriving Show
Car
    ownerId PersonId
    name String
    deriving Show
|]

main :: IO ()
main = runSqlite ":memory:" $ do
    runMigration migrateAll
    bruce <- insert $ Person "Bruce Wayne"
    insert $ Car bruce "Bat Mobile"
    insert $ Car bruce "Porsche"
    -- this could go on a while
    cars <- selectList [CarOwnerId ==. bruce] []
    liftIO $ print cars

Using this technique, you can define one-to-many relationships. To define many-to-many relationships, we need a join entity, which has a one-to-many relationship with each of the original tables. It is also a good idea to use uniqueness constraints on these. For example, to model a situation where we want to track which people have shopped in which stores:

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Data.Time

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    name String
Store
    name String
PersonStore
    personId PersonId
    storeId StoreId
    UniquePersonStore personId storeId
|]

main :: IO ()
main = runSqlite ":memory:" $ do
    runMigration migrateAll

    bruce <- insert $ Person "Bruce Wayne"
    michael <- insert $ Person "Michael"

    target <- insert $ Store "Target"
    gucci <- insert $ Store "Gucci"
    sevenEleven <- insert $ Store "7-11"

    insert $ PersonStore bruce gucci
    insert $ PersonStore bruce sevenEleven

    insert $ PersonStore michael target
    insert $ PersonStore michael sevenEleven

    return ()

Closer look at types

So far, we’ve spoken about Person and PersonId without really explaining what they are. In the simplest sense, for a SQL-only system, the PersonId could just be type PersonId = Int64. However, that means there is nothing binding a PersonId at the type level to the Person entity. As a result, you could accidentally use a PersonId and get a Car. In order to model this relationship, we could use phantom types. So, our next naive step would be:

newtype Key entity = Key Int64
type PersonId = Key Person

And that works out really well, until you get to a backend that doesn’t use Int64 for its IDs. And that’s not just a theoretical question; MongoDB uses ByteStrings instead. So what we need is a key value that can contain an Int and a ByteString. Seems like a great time for a sum type:

data Key entity = KeyInt Int64 | KeyByteString ByteString

But that’s just asking for trouble. Next we’ll have a backend that uses timestamps, so we’ll need to add another constructor to Key. This could go on for a while. Fortunately, we already have a sum type intended for representing arbitrary data: PersistValue:

newtype Key entity = Key PersistValue

And this is (more or less) what Persistent did until version 2.0. However, this has a different problem: it throws away data. For example, when dealing with a SQL database, we know that the key type will be an Int64 (assuming defaults are being used). However, you can’t assert that at the type level with this construction. So instead, starting with Persistent 2.0, we now use an associated datatype inside the PersistEntity class:

class PersistEntity record where
    data Key record
    ...

When you’re working with a SQL backend, and aren’t using a custom key type, this becomes a newtype wrapper around an Int64, and the toSqlKey/fromSqlKey functions can perform that type-safe conversion for you. With MongoDB, on the other hand, it’s a wrapper around a ByteString.

More complicated, more generic

By default, Persistent will hard-code your datatypes to work with a specific database backend. When using sqlSettings, this is the SqlBackend type. But if you want to write Persistent code that can be used on multiple backends, you can enable more generic types by replacing sqlSettings with sqlSettings { mpsGeneric = True }.

To understand why this is necessary, consider relations. Let’s say we want to represent blogs and blog posts. We would use the entity definition:

Blog
    title Text
Post
    title Text
    blogId BlogId

We know that BlogId is just a type synonym for Key Blog, but how will Key Blog be defined? We can’t use an Int64, since that won’t work for MongoDB. And we can’t use ByteString, since that won’t work for SQL databases.

To allow for this, once mpsGeneric is set to True, out resulting datatypes have a type parameter to indicate the database backend they use, so that keys can be properly encoded. This looks like:

data BlogGeneric backend = Blog { blogTitle :: Text }
data PostGeneric backend = Post
    { postTitle  :: Text
    , postBlogId :: Key (BlogGeneric backend)
    }

Notice that we still keep the short names for the constructors and the records. Finally, to give a simple interface for normal code, we define some type synonyms:

type Blog   = BlogGeneric SqlBackend
type BlogId = Key Blog
type Post   = PostGeneric SqlBackend
type PostId = Key Post

And no, SqlBackend isn’t hard-coded into Persistent anywhere. That sqlSettings parameter you’ve been passing to mkPersist is what tells us to use SqlBackend. Mongo code will use mongoSettings instead.

This might be quite complicated under the surface, but user code hardly ever touches this. Look back through this whole chapter: not once did we need to deal with the Key or Generic stuff directly. The most common place for it to pop up is in compiler error messages. So it’s important to be aware that this exists, but it shouldn’t affect you on a day-to-day basis.

Custom Fields

Occasionally, you will want to define a custom field to be used in your datastore. The most common case is an enumeration, such as employment status. For this, Persistent provides a helper Template Haskell function:

-- @Employment.hs
{-# LANGUAGE TemplateHaskell #-}
module Employment where

import Database.Persist.TH

data Employment = Employed | Unemployed | Retired
    deriving (Show, Read, Eq)
derivePersistField "Employment"
{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
import Database.Persist.Sqlite
import Database.Persist.TH
import Employment

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    name String
    employment Employment
|]

main :: IO ()
main = runSqlite ":memory:" $ do
    runMigration migrateAll

    insert $ Person "Bruce Wayne" Retired
    insert $ Person "Peter Parker" Unemployed
    insert $ Person "Michael" Employed

    return ()

derivePersistField stores the data in the database using a string field, and performs marshaling using the Show and Read instances of the datatype. This may not be as efficient as storing via an integer, but it is much more future proof: even if you add extra constructors in the future, your data will still be valid.

Persistent: Raw SQL

The Persistent package provides a type safe interface to data stores. It tries to be backend-agnostic, such as not relying on relational features of SQL. My experience has been you can easily perform 95% of what you need to do with the high-level interface. (In fact, most of my web apps use the high level interface exclusively.)

But occassionally you’ll want to use a feature that’s specific to a backend. One feature I’ve used in the past is full text search. In this case, we’ll use the SQL "LIKE" operator, which is not modeled in Persistent. We’ll get all people with the last name "Snoyman" and print the records out.

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
import Database.Persist.TH
import Data.Text (Text)
import Database.Persist.Sqlite
import Control.Monad.IO.Class (liftIO)
import Data.Conduit
import qualified Data.Conduit.List as CL

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    name Text
|]

main :: IO ()
main = runSqlite ":memory:" $ do
    runMigration migrateAll
    insert $ Person "Michael Snoyman"
    insert $ Person "Miriam Snoyman"
    insert $ Person "Eliezer Snoyman"
    insert $ Person "Gavriella Snoyman"
    insert $ Person "Greg Weber"
    insert $ Person "Rick Richardson"

    -- Persistent does not provide the LIKE keyword, but we'd like to get the
    -- whole Snoyman family...
    let sql = "SELECT name FROM Person WHERE name LIKE '%Snoyman'"
    rawQuery sql [] $$ CL.mapM_ (liftIO . print)

There is also higher-level support that allows for automated data marshaling. Please see the Haddock API docs for more details.

Integration with Yesod

So you’ve been convinced of the power of Persistent. How do you integrate it with your Yesod application? If you use the scaffolding, most of the work is done for you already. But as we normally do, we’ll build up everything manually here to point out how it works under the surface.

The yesod-persistent package provides the meeting point between Persistent and Yesod. It provides the YesodPersist typeclass, which standardizes access to the database via the runDB method. Let’s see this in action.

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
{-# LANGUAGE ViewPatterns               #-}
import Yesod
import Database.Persist.Sqlite
import Control.Monad.Trans.Resource (runResourceT)
import Control.Monad.Logger (runStderrLoggingT)

-- Define our entities as usual
share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    firstName String
    lastName String
    age Int
    deriving Show
|]

-- We keep our connection pool in the foundation. At program initialization, we
-- create our initial pool, and each time we need to perform an action we check
-- out a single connection from the pool.
data PersistTest = PersistTest ConnectionPool

-- We'll create a single route, to access a person. It's a very common
-- occurrence to use an Id type in routes.
mkYesod "PersistTest" [parseRoutes|
/ HomeR GET
/person/#PersonId PersonR GET
|]

-- Nothing special here
instance Yesod PersistTest

-- Now we need to define a YesodPersist instance, which will keep track of
-- which backend we're using and how to run an action.
instance YesodPersist PersistTest where
    type YesodPersistBackend PersistTest = SqlBackend

    runDB action = do
        PersistTest pool <- getYesod
        runSqlPool action pool

-- List all people in the database
getHomeR :: Handler Html
getHomeR = do
    people <- runDB $ selectList [] [Asc PersonAge]
    defaultLayout
        [whamlet|
            <ul>
                $forall Entity personid person <- people
                    <li>
                        <a href=@{PersonR personid}>#{personFirstName person}
        |]

-- We'll just return the show value of a person, or a 404 if the Person doesn't
-- exist.
getPersonR :: PersonId -> Handler String
getPersonR personId = do
    person <- runDB $ get404 personId
    return $ show person

openConnectionCount :: Int
openConnectionCount = 10

main :: IO ()
main = runStderrLoggingT $ withSqlitePool "test.db3" openConnectionCount $ \pool -> liftIO $ do
    runResourceT $ flip runSqlPool pool $ do
        runMigration migrateAll
        insert $ Person "Michael" "Snoyman" 26
    warp 3000 $ PersistTest pool

There are two important pieces here for general use. runDB is used to run a DB action from within a Handler. Within the runDB, you can use any of the functions we’ve spoken about so far, such as insert and selectList.

The other new feature is get404. It works just like get, but instead of returning a Nothing when a result can’t be found, it returns a 404 message page. The getPersonR function is a very common approach used in real-world Yesod applications: get404 a value and then return a response based on it.

More complex SQL

Persistent strives to be backend-agnostic. The advantage of this approach is code which easily moves from different backend types. The downside is that you lose out on some backend-specific features. Probably the biggest casualty is SQL join support.

Fortunately, thanks to Felipe Lessa, you can have your cake and eat it too. The Esqueleto library provides support for writing type safe SQL queries, using the existing Persistent infrastructure. The Haddocks for that package provide a good introduction to its usage. And since it uses many Persistent concepts, most of your existing Persistent knowledge should transfer over easily.

For a simple example of using Esqueleto, please see the SQL Joins chapter.

Something besides SQLite

To keep the examples in this chapter simple, we’ve used the SQLite backend. Just to round things out, here’s our original synopsis rewritten to work with PostgreSQL:

{-# LANGUAGE EmptyDataDecls             #-}
{-# LANGUAGE FlexibleContexts           #-}
{-# LANGUAGE GADTs                      #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE MultiParamTypeClasses      #-}
{-# LANGUAGE OverloadedStrings          #-}
{-# LANGUAGE QuasiQuotes                #-}
{-# LANGUAGE TemplateHaskell            #-}
{-# LANGUAGE TypeFamilies               #-}
import           Control.Monad.IO.Class  (liftIO)
import           Control.Monad.Logger    (runStderrLoggingT)
import           Database.Persist
import           Database.Persist.Postgresql
import           Database.Persist.TH

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persistLowerCase|
Person
    name String
    age Int Maybe
    deriving Show
BlogPost
    title String
    authorId PersonId
    deriving Show
|]

connStr = "host=localhost dbname=test user=test password=test port=5432"

main :: IO ()
main = runStderrLoggingT $ withPostgresqlPool connStr 10 $ \pool -> liftIO $ do
    flip runSqlPersistMPool pool $ do
        runMigration migrateAll

        johnId <- insert $ Person "John Doe" $ Just 35
        janeId <- insert $ Person "Jane Doe" Nothing

        insert $ BlogPost "My fr1st p0st" johnId
        insert $ BlogPost "One more for good measure" johnId

        oneJohnPost <- selectList [BlogPostAuthorId ==. johnId] [LimitTo 1]
        liftIO $ print (oneJohnPost :: [Entity BlogPost])

        john <- get johnId
        liftIO $ print (john :: Maybe Person)

        delete janeId
        deleteWhere [BlogPostAuthorId ==. johnId]

Summary

Persistent brings the type safety of Haskell to your data access layer. Instead of writing error-prone, untyped data access, or manually writing boilerplate marshal code, you can rely on Persistent to automate the process for you.

The goal is to provide everything you need, most of the time. For the times when you need something a bit more powerful, Persistent gives you direct access to the underlying data store, so you can write whatever 5-way joins you want.

Persistent integrates directly into the general Yesod workflow. Not only do helper packages like yesod-persistent provide a nice layer, but packages like yesod-form and yesod-auth also leverage Persistent’s features as well.

Chapters