The Resource monad transformer
The Resource transformer (ResourceT) plays a vital role in proper resource
management in the conduit project. It is included within the conduit package
itself. We'll explaining ResourceT as its own entity. While some of the design
decisions clearly are biased towards conduits, ResourceT should remain a usable
tool in its own right.
Goals
What's wrong with the following code?
import System.IO main = do output <- openFile "output.txt" WriteMode input <- openFile "input.txt" ReadMode hGetContents input >>= hPutStr output hClose input hClose output
If the file input.txt does not exist, then an exception will be thrown
when trying to open it. As a result, hClose output will never be called, and
we'll have leaked a scarce resource (a file descriptor). In our tiny program, this isn't a big
deal, but clearly we can't afford such waste in a long running, highly active server process.
Fortunately, solving the problem is easy:
import System.IO main = withFile "output.txt" WriteMode $ \output -> withFile "input.txt" ReadMode $ \input -> hGetContents input >>= hPutStr output
withFile makes sure that the Handle is always closed, even in
the presence of exceptions. It also handles asynchronous exceptions. Overall, it's a great
approach to use... when you can use it. While often withFile is easy to use,
sometimes it can require restructuring our programs. And this restructuring can range from mildly
tedious to wildly inefficient.
Let's take enumerators for example. If you look in the documentation, there is an
enumFile function (for reading contents from a file), but no
iterFile (for writing contents to a file). That's because the flow of control
in an iteratee doesn't allow proper allocation of the Handle. Instead, in order to write to a
file, you need to allocate the Handle before entering the Iteratee, e.g.:
import System.IO import Data.Enumerator import Data.Enumerator.Binary main = withFile "output.txt" WriteMode $ \output -> run_ $ enumFile "input.txt" $$ iterHandle output
This code works fine, but imagine that, instead of simply piping data directly to the file, there was a huge amount of computation that occurred before we need to use the output handle. We will have allocated a file descriptor long before we needed it, and thereby locked up a scarce resource in our application. Besides this, there are times when we can't allocate the file before hand, such as when we won't know which file to open until we've read from the input file.
One of the stated goals of conduits is to solve this problem, and it does so via
ResourceT. As a result, the above program can be written in conduit as:
{-# LANGUAGE OverloadedStrings #-} import Data.Conduit import Data.Conduit.Binary main = runResourceT $ sourceFile "input.txt" $$ sinkFile "output.txt"
How it Works
There are essentially three base functions on ResourceT, and then a bunch of
conveniences thrown on top. The first function is:
register :: IO () -> ResourceT IO ReleaseKey
This function registers a piece of code that it asserts must be run. It gives back a
ReleaseKey, which is used by the next function:
release :: ReleaseKey -> ResourceT IO ()
Calling release on a ReleaseKey immediately performs the
action you previously registered. You may call release on the same
ReleaseKey as many times as you like; the first time it is called, it
unregisters the action. This means you can safely register an action like a memory
free, and have no concerns that it will be called twice.
Eventually, we'll want to exit our special ResourceT. To do so, we use:
runResourceT :: ResourceT IO a -> IO a
This seemingly innocuous function is where all the magic happens. It runs through all of the
registered cleanup actions and performs them. It is fully exception safe, meaning the cleanups
will be performed in the presence of both synchronous and asynchronous exceptions. And as
mentioned before, calling release will unregister an action, so there is no
concern of double-freeing.
Finally, as a convenience, we provide one more function for the common case of allocating a resource and registering a release action:
with :: IO a -- ^ allocate -> (a -> IO ()) -- ^ free resource -> ResourceT IO (ReleaseKey, a)
So, to rework our first buggy example to use ResourceT, we would write:
import System.IO import Control.Monad.Trans.Resource import Control.Monad.Trans.Class (lift) main = runResourceT $ do (releaseO, output) <- with (openFile "output.txt" WriteMode) hClose (releaseI, input) <- with (openFile "input.txt" ReadMode) hClose lift $ hGetContents input >>= hPutStr output release releaseI release releaseO
Now there is no concern of any exceptions preventing the releasing of resources. We could skip
the release calls if we want to, and in an example this small, it would not make
any difference. But for larger applications, where we want processing to continue, this ensures
that the Handles are freed as early as possible, keeping our scarce resource
usage to a minimum.
Some Type Magic
As alluded to, there's a bit more to ResourceT than simply running in
IO. Let's cover some of the things we need from this underlying
Monad.
- Mutable references to keep track of the registered release actions. You might think we could
just use a
StateTtransformer, but then our state wouldn't survive exceptions. - We only want to register actions in the base monad. For example, if we have a
ResourceT (WriterT [Int] IO)stack, we only want to registerIOactions. This makes it easy to lift our stacks around (i.e., add an extra transformer to the middle of an existing stack), and avoids confusing issues about the threading of other monadic side-effects. - Some way to guarantee an action is performed, even in the presence of exceptions. This boils
down to needing a
bracket-like function.
For the first point, we define a new typeclass to represent monads that have mutable references:
class Monad m => HasRef m where type Ref m :: * -> * newRef' :: a -> m (Ref m a) readRef' :: Ref m a -> m a writeRef' :: Ref m a -> a -> m () modifyRef' :: Ref m a -> (a -> (a, b)) -> m b mask :: ((forall a. m a -> m a) -> m b) -> m b mask_ :: m a -> m a try :: m a -> m (Either SomeException a)
We have an associated type to signify what the reference type should be. (For fans of fundeps,
you'll see in the next section that this has to be an associated type.) Then we provide a
number of basic reference operations. Finally, there are some functions to help with exceptions,
which are needed to safely implement the functions described in the last section. The instance
for IO is very straight-forward:
instance HasRef IO where type Ref IO = I.IORef newRef' = I.newIORef modifyRef' = I.atomicModifyIORef readRef' = I.readIORef writeRef' = I.writeIORef mask = E.mask mask_ = E.mask_ try = E.try
However, we have a problem when it comes to implementing the ST instance:
there is no way to deal with exceptions in the ST monad. As a result,
mask, mask_ and try are given default
implementations that do no exception checking. This gives rise to the first word of warning:
operations in the ST monad are not exception safe. You should not be allocating scarce
resources in ST when using ResourceT. You might be wondering why bother with
ResourceT at all then for ST. The answer is that there is a
lot you can do with conduits without allocating scarce resources, and ST is a
great way to do this in a pure way. But more on this later.
Now onto point 2: we need some way to deal with this base monad concept. Again, we use an associated type (again explained in the next section). Our solution looks something like:
class (HasRef (Base m), Monad m) => Resource m where type Base m :: * -> * resourceLiftBase :: Base m a -> m a
But we forgot about point 3: some bracket-like function. So we need one more
method in this typeclass:
resourceBracket_ :: Base m a -> Base m b -> m c -> m c
The reason the first two arguments to resourceBracket_ (allocation and
cleanup) live in Base m instead of m is that, in
ResourceT, all allocation and cleanup lives in the base monad.
So on top of our HasRef instance for IO, we now need a
Resource instance as well. This is similarly straight-forward:
instance Resource IO where type Base IO = IO resourceLiftBase = id resourceBracket_ = E.bracket_
We have similar ST instances, with resourceBracket_ having no
exception safety. The final step is dealing with monad transformers. We don't need to provide a
HasRef instance, but we do need a Resource instance. The
tricky part is providing a valid implementation of resourceBracket_. For this,
we use some functions from monad-control:
instance (MonadTransControl t, Resource m, Monad (t m)) => Resource (t m) where type Base (t m) = Base m resourceLiftBase = lift . resourceLiftBase resourceBracket_ a b c = control' $ \run -> resourceBracket_ a b (run c) where control' f = liftWith f >>= restoreT . return
For any transformer, its base is the base of its inner monad. Similarly, we lift to the base by
lifting to the inner monad and then lifting to the base from there. The tricky part is the
implemetnation of resourceBracket_. I will not go into a detailed explanation,
as I would simply make a fool of myself.
Definition of ResourceT
We now have enough information to understand the definition of ResourceT:
newtype ReleaseKey = ReleaseKey Int type RefCount = Int type NextKey = Int data ReleaseMap base = ReleaseMap !NextKey !RefCount !(IntMap (base ())) newtype ResourceT m a = ResourceT (Ref (Base m) (ReleaseMap (Base m)) -> m a)
We see that ReleaseKey is simply an Int. If you skip a few
lines down, this will make sense, since we're using an IntMap to keep track of
the registered actions. We also define two type synonyms: RefCount and
NextKey. NextKey keeps track of the most recently assigned
value for a key, and is incremented each time register is called. We'll touch on
RefCount later.
The ReleaseMap is three pieces of information: the next key and the reference
count, and then the map of all registered actions. Notice that ReleaseMap takes
a type parameter base, which states which monad release actions must live
in.
Finally, a ResourceT is essentially a ReaderT that keeps a
mutable reference to a ReleaseMap. The reference type is determined by the base
of the monad in question, as is the cleanup monad. This is why we need to use associated
types.
The majority of the rest of the code in the Control.Monad.Trans.Resource
module is just providing instances for the ResourceT type.
Other Typeclasses
There are three other typeclasses provided by the module:
- ResourceUnsafeIO
- Any monad which can lift
IOactions into it, but that this may be considered unsafe. The prime candidate here isST. Care should be taken to only lift actions which do not acquire scarce resources and which don't "fire the missiles." In other words, all the normal warnings ofunsafeIOToSTapply. - ResourceThrow
- For actions that can throw exceptions. This automatically applies to all
IO-based monads. ForST-based monads, you can use the suppliedExceptionTtransformer to provide exception-throwing capabilities. Some functions in conduit, for example, will require this (e.g., text decoding). - ResourceIO
- A convenience class tying together a bunch of other classes, included the two mentioned above. This is purely for convenience; you could achieve the same effect without this type class, you'd just have to do a lot more typing.
Forking
It would seem that forking a thread would be inherently unsafe with ResourceT,
since the parent thread may call runResourceT while the child thread is still
accessing some of the allocated resources. This is indeed true, if you use the normal
forkIO function.
In order to solve this, ResourceT includes reference counting. When you fork a
new thread via resourceForkIO, the RefCount value of the
ReleaseMap is incremented. Every time runResourceT is called,
the value is decremented. Only when the value hits 0 are all the release actions called.
Convenience Exports
In addition to what's been listed so far, there are a few extra functions exported (mostly) for convenience.
newRef,writeRef, andreadRefwrap up theHasRefversions of the functions and allow them to run in anyResourceT.withIOis essentially a type-restricted version ofwith, but working around some of the nastiness with types you would otherwise run into. In general: you'll want to usewithIOwhen writingIOcode.transResourceTlet's you modify which monad your ResourceT is running in, assuming it keeps the same base.transResourceT :: (Base m ~ Base n) => (m a -> n a) -> ResourceT m a -> ResourceT n a transResourceT f (ResourceT mx) = ResourceT (\r -> f (mx r))