Persistent Progress

May 28, 2010

By Michael Snoyman

I just switched over a 2900-loc Yesod web application to use persistent, so I think now's a good time to give a bit of an update on the way things are going. For those interested, here's the most recent Haddocks.

Status

Amazingly, persistent is fully usable right now for certain functionalities. There's a high-level interface for defining tables, Template Haskell code to generate SQLite code, and the Persist typeclass. You can try it now if you like, though be warned that it will be changing.

Nothing is NULL

Pop quiz: does the following code return all the rows in a table?

SELECT * FROM MyTable WHERE my_column=5
UNION ALL
SELECT * FROM MyTable WHERE my_column<>5

It seems like the answer should be yes, but in fact, it's no. That's because both NULL=5 and NULL<>5 evaluate to FALSE. Technically, they evaluate to NULL, which gets interpretted in a boolean context as FALSE, but it's irrelevant to our point.

This makes it difficult to model a Maybe value as a nullable field. The solution isn't too difficult; if you want to filter for all rows where column my_column is equal to some Nothing value, you specify IS NULL instead of =? and binding the Nothing.

However, when dealing with unique keys and the getBy function, it gets a bit more complicated. So for now, I have a simple compromise: nullable columns cannot be included in unique keys.

Less type classes

I've decided to backtrack a bit on my multi-type-class idea in the previous post. Instead, I'm having the TH code generate a single type class instance for each table.

I've also decided to simplify the data families a bit. Here's the motiviation: I want to be able to have a PersonId datatype to go along with the Person table. The previous definition of Key was data family Key value m, where m is the monad for the backend we're dealing with (such as Sqlite).

Unfortunately, that means the definition of PersonId because type PersonId m = Key Person m, which is really irritating. So instead, Key is now simply data family Key value.

The upshot of this is that there can only be one instance for each value in scope at a time. However, this really makes perfect sense: the TH code generates the datatypes! So in order to switch backends, you'll need to change your TH deriving line.

YAML entity declarations

I'll be honest: I really don't like this solution, and find it ugly. However, it was the fastest thing to implement, and I needed something to work with immediately for my current project. Hopefully I can put some real thought into a better syntax later. In fact, I think an EDSL would be better here than quasi-quoted syntax.

Here's an example that would declare two datatypes and create appropriate instances for the Sqlite3 backend.

persistSqlite3 [$persist|
Author:
    columns:
        - name: name
          type: String
          order: [Asc] # can be sorted in ascending order
        - name: tagline
          type: String
          nullable: True
    uniques:
        UniqueAuthorName: [name]
Entry:
    columns:
        - name: author
          type: AuthorId # this is declared automatically
          filter: [Eq]   # you can select all entries by an author
        - name: date
          type: Day
          order: [Asc, Desc]
        - name: slug
          type: String
        - name: title
          type: String
        - name: content
          type: HtmlContent
    uniques:
        UniqueDateSlug: [date, slug]
|]

Using HDBC... for now

To get started quickly, I'm using HDBC. The obvious advantages are:

Automatic support for all HDBC backends.
No need to write against the FFI.
Well tested library.
SqlValue and Convertible are defined for us.

The downsides are:

Even with automatic support for the backends, we still need to be careful of writing compliant SQL. For example, INTEGER PRIMARY KEY versus INTEGER AUTO_INCREMENT versus SERIAL.
MySQL support is pretty spotty on HDBC if I'm not mistaken.
It still doesn't help us with the non-SQL backends at all.
We're probably incurring some some performance overhead.
HDBC isn't very good at deterministic operation. It's difficult to tell when a statement will be finalized, and if you use the lazy I/O, there's lots of opportunities for bugs.
It's not quite as bug-free as I'd like. Details available upon request.
Ideally I'd like a PersistValue instead of SqlValue datatype that takes into account non-SQL backends.
I don't like the design of Convertible: there's no distinction made between conversions which are guaranteed to succeed and those which might fail. This encourages errors ocurring in pure code. I proposed a different interface to John, and he seemed interested at the time, but nothing has come of it since.

I have a lot of experience with the Sqlite C API, and it's very easy to program against. I'm not so certain of MySQL and PostgreSQL, but frankly the hardest work in this project is the TH code, not the database code itself.

Of course, if we stick with HDBC it will be easier for people to drop down to straight SQL queries when persistent is too constraining.

Next steps

It would be really nice to be able to automatically generate web forms from this tables, especially on my current project. This isn't very difficult in theory, I just have to figure it out.

I'm not sure if for this first release I should focus more on a number of backends, or on perfecting a single one. If I do multiple backends, it might point out places where the API favors a certain backend too much, but I'd really like to get something working out there quickly so that Yesod can have a persistence layer already.

And I still have to choose an enumerator interface for the select, filter and order functions. For those afraid of enumerators, I intend to offer a strict list variant as well. I'm tempted to join everyone else with the iteratee package... I'd appreciate community feedback on this one.

If you want to get a better feel for this library, start by looking at the Persist typeclass. You can also look at a simple test program. Hopefully I'll put up a simple Yesod app that uses this soon.

Comments