Transducers: Pipes versus Machines

In hindsite1, what I was trying to say in my last post was this:

Folds in the foldl library are Moore machines and what I often need is a Mealy machine.

data Fold a b = forall x . Fold (x -> a -> x) x (x -> b)
          ^ ^          ^        ^             ^ ^
      input output     state    step      begin done
                                

A user of the Fold type (scan in the pipes library is an example) has access to the computational results via done which is only hooked up to the state. A Moore machine consumes an input tape and provides a state (which Fold then provides access to). A Mealy machine, in contrast, has these components but is also an output tape producer. A Mealy version of Fold might look something like this:

data Fold' a b c = forall x . Fold (x -> a -> (x,b)) x (x -> c)
               ^                                 ^     ^
             ???                       output tape     done'
                                

or you could decide not to hook up state at all:

data Fold' a b = forall x . Fold (x -> a -> (x,b)) x
                                               ^
                             just an output tape

All of which gives me an excuse to take the machines library for a spin, make some comparisons with the pipes ecosystem, and opine on the metaphor of transducers.

Step (Reducer)

At the heart of my particular toy problem (summarizing an input tape of Ints by adding them up and producing an output tape value whenever they exceed 10) is a step function that I found to be common across libraries and typing decisions. Let’s give it a type that fits in with the Mealy argument ordering in machines.

(Code link 2 and imports3 at the end.)

-- an (explicit state) left step to kick things off
type Step x a b = x -> a -> (b,x)

myStep :: Step Int Int [Int]
myStep x a =
  let x' = x + a in
  if x' >= 10
     then ([x'],0)
     else ([],x')

In some circles this is referred to as a reducer, though I don’t see what is being “reduced”. More natural (if possible) is to see step busted into (a -> x), a morphism to state, and (x -> x -> x), a state Endo.

I can use this to build a machines library Mealy using unfoldMealy :: (s -> a -> (b, s)) -> s -> Mealy a b

myMealy :: Mealy Int [Int]
myMealy = unfoldMealy myStep 0

or I can roll my own (hiding state to avoid those dang skolems)

data Mealy' a b = forall x . Mealy' x (x -> a -> (b,x))

myMealy' :: Mealy' Int [Int]
myMealy' = Mealy' 0 myStep

Machines

To see Mealy in action you need an input tape and some machinery to run the tape.

mealy' :: [Int] -> [[Int]]
mealy' ns = runIdentity $ M.runT $ M.supply ns (M.auto myMealy)

I’m ignoring (loss of) final state to keep things simple.

λ> mealy' [4, 5, 2, 1, 8, 0, 4, 3]
[[],[],[11],[],[],[],[13],[]]

For a first time user of machines, that was easy!

Pipes

Pipes and machines are at least first-cousins when it comes to the problem domains they tackle. Here’s my Mealy inserted directly into a Pipe:

toPipe :: (Monad m, Ord a, Num a) => Mealy' a [a] -> Pipe a [a] m ()
toPipe (Mealy' begin step) =
  go begin
  where
    go x = do
      a <- await
      let (b,x') = step x a
      yield b
      go x'

pipe :: [Int] -> [[Int]]
pipe ns = Pipes.toList (each ns >-> toPipe myMealy') 

Transducers

I need a collective noun to describe pipes and machines, and I’m going to opt for transducers and then use the term transduction to describe the application of a specific transducer computation.

Inserting an effect into each transduction, rather than cheating by using ghci to show a pure result gives:

-- machines versus pipes
printMachine :: (Show a) => M.MachineT IO (M.Is [a]) ()
printMachine = M.repeatedly $ M.await >>= lift . print >> M.yield ()

machinePrint :: [Int] -> IO ()
machinePrint ns = M.runT_ $ M.supply ns (M.auto myMealy) M.~> printMachine

printPipe :: (Show a) => Consumer a IO ()
printPipe = forever $ await >>= lift . print

pipesPrint :: [Int] -> IO ()
pipesPrint ns = runEffect $ each ns >-> toPipe myMealy' >-> printPipe 

The print effects are almost identical and there’s clear correspondence in constructing each transduction. I couldn’t really find a consumer in machines (just one enigmatic note in the docs that consumers are just monoids - huh?), but consumers are over-rated and it’s often easier to avoid Consumers in pipes and just trail the transduction with a forever await discard.

Transducer transformers

Labeling them first-cousins might be conservative. Could the two libraries be isomorphic? I have no idea, but two examples to get you thinking:

You can turn a Producer into a Source (it came out as a Plan but same diff):

producerToSource :: Monad m => Producer b m r -> M.PlanT k b m r
producerToSource p =
  runEffect $
  hoist lift p >->
  forever (do
      a <- await
      lift $ M.yield a)

You can “tee” a machines Process, providing a Producer:

prodTee :: (Monad m) => M.MachineT (Producer a m) (M.Is a) ()
prodTee = void $ M.repeatedly $ do
  a <- M.await
  lift . yield $ a
  M.yield ()

fromProcess :: (Monad m) => Producer a m ()
fromProcess = M.runT_ prodTee

How’s that for the power of monad transformation! It’s a pain to work with them, but when you find the right hoist, it’s sheer magic.

Machines versus Pipes (versus Skolems versus State versus foldl’)

I have identical transductions so it’s time for a race!

First a dump of the results including a few alternatives, then some explaining:

 ~/git/sfold dev master*
 # ghc -O2 --make examples/mealy.hs -package-db .cabal-sandbox/*-ghc-7.8.2-packages.conf.d/ -rtsopts -auto -auto-all -prof
 [1 of 1] Compiling Main             ( examples/mealy.hs, examples/mealy.o )
 Linking examples/mealy ...
 
 # examples/mealy 1 1000000 +RTS -t
 func                   n	mutat	gc	speed
 machines        	1.00e6	2.77e-1	7.71e-2	3.54e-7
 pipes           	1.00e6	2.34e-1	5.34e-3	2.39e-7
 pipe - state    	1.00e6	2.89e-1	6.64e-3	2.95e-7
 pipe - bad state	1.00e6	6.93e0	1.37e-1	7.07e-6
 foldl           	1.00e6	1.35e-1	2.14e-1	3.49e-7
 pipe & skolems  	1.00e6	2.46e-1	5.71e-3	2.52e-7
 just a fold     	1.00e6	4.39e-2	4.94e-2	9.33e-8 

 # examples/mealy 100 10000 +RTS -t
 func                   n	mutat	gc      speed
 machines        	1.00e6	2.59e-1	7.52e-3	2.66e-7
 pipes           	1.00e6	2.34e-1	5.28e-3	2.39e-7
 pipe - state    	1.00e6	2.85e-1	6.29e-3	2.91e-7
 pipe - bad state	1.00e6	6.59e0	1.34e-1	6.73e-6
 foldl           	1.00e6	1.25e-1	9.68e-2	2.22e-7
 pipe & skolems  	1.00e6	2.48e-1	5.45e-3	2.54e-7
 just a fold     	1.00e6	4.58e-2	5.78e-2	1.04e-7

I use a bespoke criterion-based benchmark for most of my performance testing4. The 1.0 release saw an opening up of the library, and access to lower-level benchmarking measures, which is awesome. In particular, you can get time split into mutation (real work as the docs say) and garbage collection. Remember to use +RTS -t if you try this at home.

machines vs pipes

So the first set of numbers is 1 sample of a 1 million values tape. Not counting GC, Pipes wins with 0.234 secs per mill (234 nanos per tick), versus a very respectable 277 nanos for machines. The second set of numbers are 100 samples of a 10000 value tape. Pipes is rock-steady - no time and space leaks here! Machines has a slight improvement in mutation, suggesting some time leak within the transduction, and a difference in GC suggesting a space leak. The slowdown by a factor of 10 for a reduction of 100 in the tape size is indicative of a quadratic blowup somewhere (probably my bad).

tail recursion vs state

The translation of Mealy to a pipe could have been achieved by including state in the pipe, rather than using tail recursion:

-- pipes using state (lifting pipes)
toPipe'' :: (Monad m, Ord a, Num a) => Mealy' a [a] -> Pipe a [a] m ()
toPipe'' (Mealy' begin step) =
  flip evalStateT begin $ forever $ do
    a <- lift await
    x <- get
    let (b,x') = step x a
    lift $ yield b
    put x'

And this convenience slows things down a touch (289 vs 234 nanos).

Note that I lifted pipes’ish stuff. If you lift the non-pipes elements, bad stuff happens (289 versus 6930 nanos). Don’t lift pipes if you can help it.

-- pipes using state (lifting state)
toPipe' :: (Monad m, Ord a, Num a) => Mealy' a [a] -> Pipe a [a] m ()
toPipe' (Mealy' begin step) =
  flip evalStateT begin $ distribute $ forever $ do
    a <- await
    x <- lift get
    let (b,x') = step x a
    yield b
    lift $ put x'

foldl vs pipes

An obvious difference between pipes and machines is that machine integrates reducers (Moore and Mealy) whilst foldl is separated out from pipes.

-- just a Foldl.fold
toFoldl :: Mealy' a b -> Foldl.Fold a [b]
toFoldl (Mealy' begin step) = Foldl.Fold step' begin' done'
  where
    begin' = ([],begin)
    step' (output,acc) a = (\(b,acc') -> (b:output,acc')) $ step acc a
    done' = reverse . fst

foldl' :: [Int] -> [[Int]]
foldl' = Foldl.fold (toFoldl myMealy')

Squeezing Mealy into a Moore’ish Foldl.Fold clocked in at a blistering 135 nanos mutation, but with a GC problem, most likely in my step function specification.

skolems

-- escaping skolems
data Mealy'' a b x = Mealy'' x (a -> x -> (x,b))

myMealy'' :: Mealy'' Int [Int] Int
myMealy'' = Mealy'' 0 (\a x -> swap (myStep a x))

toPipeWithSkolem :: (Monad m, Ord a, Num a) => Mealy'' a [a] a-> Pipe a [a] m ()
toPipeWithSkolem (Mealy'' begin step) =
  go begin
  where
    go x = do
      a <- await
      let (x',b) = step a x
      yield b
      go x'

skolem :: [Int] -> [[Int]]
skolem ns = Pipes.toList (each ns >-> toPipeWithSkolem myMealy'')

Exposing state slowed things down a touch, 246 nanos versus 234 nanos - right on tekmo’s rule-of-thumb. But this is a simple step function, and given that’s it’s only 5% of the total cost, just maybe the cost is worthwhile to users tying themselves in knots over escaping skolems. Free skolem-kind I say!

just a fold

Of course, if you can do without the notion of transducers, reducers, machines and tapes, then you can just fold the values at 93 nanos versus 239 nanos with transducers hanging off the computation (and I might have introduced a space problem that shows up in GC).

-- foldl'
-- myStep without an output tape
myStep' :: ([[Int]],Int) -> Int -> ([[Int]],Int) 
myStep' (out,acc) a =
  let acc' = acc + a in
  if acc' >= 10
  then ([acc']:out,0)
  else ([]:out,acc')

fold :: [Int] -> [[Int]]
fold ns = fst $ F.foldl' myStep' ([],0) ns

just c

I wanted to include the machine-level limit. Low value runs were coming in at 9 nanos but I got a segment fault at higher numbers and reminded myself why I code in haskell.

And the winner is …

It was an interesting tussle in the battle of the transducers. Pipes/foldl wins the speed race but machines has very respectable numbers, especially given a certain lack of attention. Machines gets points for ease of use, and for having a Mealy machine at all. Pipes gets bonuses for separation of transducer and folder. Both libraries get kudos for incredible flexibility in monad transformation.

Haskell is the winner everywhere except for lacking a stable Step definition. I now know that Folds/Reducers/Steps are also (more or less) Moore machines, but it was a difficult factoid to locate. The pipes ecosystem is missing out on the benefits of the Mealy - that you can stick an output tape into your transduction and increase the expressive power compared with existing tools. But it’s taken a lot of digging to reach that conclusion.

Moore and Mealy and the design patterns around them deserve wide exposure and their own library, called Step or Reducer or whatever.

Afterword

A while back there was some commentary on clojure reducers (x -> a -> x) and transducers (x -> a -> x) -> (x -> b -> x), and it was pointed out that clojure reducers are Folds in the lens and foldl sense, without the finalization (and state initialization) goodness built in to enable composition.

It may well be that the clojure notion of transducers is, in part, an alternative to passing around a complete Fold with finalization and state; instead allowing reducer receivers to handle that aspect. But they also might be onto something, though haven’t gone far enough. In the world of pipes, it’s becoming idiomatic to pass around Producers with something like Producer a m () -> Producer b m () (and often easier than pull-based pipes), either directly or via lens like Lens' (Producer a m ()) (Producer b m ()).

Regardless, pipes and machines take signals and process them, conforming to the classical metaphor of a transducer in electrical circuits, and transducer is the right name for what they are.


  1. What I missed was this post from kmett. I did read it at the time, but failed to heed the scary part warning. Then, when I read that Moore machines were cofree comonads, my brain exploded.

  2. full code here

  3. import list

    import           Control.Applicative
    import           Control.DeepSeq
    import qualified Control.Foldl as Foldl
    import           Control.Monad.Identity
    import           Control.Monad.Trans.State.Strict
    import           Criterion
    import           Criterion.Measurement
    import           Criterion.Types
    import qualified Data.Foldable as F
    import qualified Data.Machine as M
    import           Data.Machine.Mealy
    import           Data.Text (Text)
    import qualified Data.Text.IO as Text
    import           Data.Tuple (swap)
    import           Formatting
    import           Pipes
    import           Pipes.Lift
    import qualified Pipes.Prelude as Pipes
    import           System.Environment
    
  4. criterion

    import           Criterion
    import           Criterion.Measurement
    import           Criterion.Types
    import           Formatting
    
    --criterion helpers
    data Speed =
        Speed
        { _speedMutator :: Double
        , _speedGc      :: Double
        } deriving (Show)
    
    speed' :: (Integral t) => Control.DeepSeq.NFData b => t -> (a -> b) -> a -> IO Speed
    speed' nSamples f a = do
        (m,_) <- Criterion.Measurement.measure (nf f a) (fromIntegral nSamples)
        return $ (\x -> Speed (measMutatorCpuSeconds x) (measGcCpuSeconds x)) m
    
    render' :: String -> Int -> Speed -> Text
    render' label n speed =
        sformat
        (string %"\t"% expt 2 %"\t"% expt 2 %"\t"% expt 2 %"\t"% expt 2 %"\n")
        label
        n
        (_speedMutator speed)
        (_speedGc speed)
        ((_speedGc speed + _speedMutator speed) / fromIntegral n)
    
    -- speed test
    main :: IO ()
    main = do
      (ns':n':_) <- getArgs
      let ns = read ns'
          n = read n'
          t = ns*n
          str = replicate n 1
      Text.putStrLn "func\t\t\tn\tmutat\tgc\tspeed"
      Text.putStr =<< render' "machines        " t <$> speed' ns mealy' str
      Text.putStr =<< render' "pipes           " t <$> speed' ns pipe   str
      Text.putStr =<< render' "pipe - state    " t <$> speed' ns pipe'' str
      Text.putStr =<< render' "pipe - bad state" t <$> speed' ns pipe'  str
      Text.putStr =<< render' "foldl           " t <$> speed' ns foldl' str
      Text.putStr =<< render' "pipe & skolems  " t <$> speed' ns skolem str
      Text.putStr =<< render' "just a fold     " t <$> speed' ns fold str
    
January 24, 2015
Tony Day