May 28, 2010
I just switched over a 2900-loc Yesod web application to use persistent, so I think now's a good time to give a bit of an update on the way things are going. For those interested, here's the most recent Haddocks.
Amazingly, persistent is fully usable right now for certain functionalities. There's a high-level interface for defining tables, Template Haskell code to generate SQLite code, and the Persist typeclass. You can try it now if you like, though be warned that it will be changing.
Nothing is NULL
Pop quiz: does the following code return all the rows in a table?
SELECT * FROM MyTable WHERE my_column=5 UNION ALL SELECT * FROM MyTable WHERE my_column<>5
It seems like the answer should be yes, but in fact, it's no. That's because both
NULL<>5 evaluate to FALSE. Technically, they evaluate to NULL, which gets interpretted in a boolean context as FALSE, but it's irrelevant to our point.
This makes it difficult to model a
Maybe value as a nullable field. The solution isn't too difficult; if you want to filter for all rows where column my_column is equal to some Nothing value, you specify
IS NULL instead of
=? and binding the Nothing.
However, when dealing with unique keys and the getBy function, it gets a bit more complicated. So for now, I have a simple compromise: nullable columns cannot be included in unique keys.
Less type classes
I've decided to backtrack a bit on my multi-type-class idea in the previous post. Instead, I'm having the TH code generate a single type class instance for each table.
I've also decided to simplify the data families a bit. Here's the motiviation: I want to be able to have a
PersonId datatype to go along with the
Person table. The previous definition of
data family Key value m, where m is the monad for the backend we're dealing with (such as Sqlite).
Unfortunately, that means the definition of PersonId because
type PersonId m = Key Person m, which is really irritating. So instead, Key is now simply
data family Key value.
The upshot of this is that there can only be one instance for each value in scope at a time. However, this really makes perfect sense: the TH code generates the datatypes! So in order to switch backends, you'll need to change your TH deriving line.
YAML entity declarations
I'll be honest: I really don't like this solution, and find it ugly. However, it was the fastest thing to implement, and I needed something to work with immediately for my current project. Hopefully I can put some real thought into a better syntax later. In fact, I think an EDSL would be better here than quasi-quoted syntax.
Here's an example that would declare two datatypes and create appropriate instances for the Sqlite3 backend.
persistSqlite3 [$persist| Author: columns: - name: name type: String order: [Asc] # can be sorted in ascending order - name: tagline type: String nullable: True uniques: UniqueAuthorName: [name] Entry: columns: - name: author type: AuthorId # this is declared automatically filter: [Eq] # you can select all entries by an author - name: date type: Day order: [Asc, Desc] - name: slug type: String - name: title type: String - name: content type: HtmlContent uniques: UniqueDateSlug: [date, slug] |]
Using HDBC... for now
To get started quickly, I'm using HDBC. The obvious advantages are:
- Automatic support for all HDBC backends.
- No need to write against the FFI.
- Well tested library.
- SqlValue and Convertible are defined for us.
The downsides are:
- Even with automatic support for the backends, we still need to be careful of writing compliant SQL. For example,
INTEGER PRIMARY KEYversus
- MySQL support is pretty spotty on HDBC if I'm not mistaken.
- It still doesn't help us with the non-SQL backends at all.
- We're probably incurring some some performance overhead.
- HDBC isn't very good at deterministic operation. It's difficult to tell when a statement will be finalized, and if you use the lazy I/O, there's lots of opportunities for bugs.
- It's not quite as bug-free as I'd like. Details available upon request.
- Ideally I'd like a PersistValue instead of SqlValue datatype that takes into account non-SQL backends.
- I don't like the design of Convertible: there's no distinction made between conversions which are guaranteed to succeed and those which might fail. This encourages errors ocurring in pure code. I proposed a different interface to John, and he seemed interested at the time, but nothing has come of it since.
I have a lot of experience with the Sqlite C API, and it's very easy to program against. I'm not so certain of MySQL and PostgreSQL, but frankly the hardest work in this project is the TH code, not the database code itself.
Of course, if we stick with HDBC it will be easier for people to drop down to straight SQL queries when persistent is too constraining.
It would be really nice to be able to automatically generate web forms from this tables, especially on my current project. This isn't very difficult in theory, I just have to figure it out.
I'm not sure if for this first release I should focus more on a number of backends, or on perfecting a single one. If I do multiple backends, it might point out places where the API favors a certain backend too much, but I'd really like to get something working out there quickly so that Yesod can have a persistence layer already.
And I still have to choose an enumerator interface for the select, filter and order functions. For those afraid of enumerators, I intend to offer a strict list variant as well. I'm tempted to join everyone else with the iteratee package... I'd appreciate community feedback on this one.