Reposted

Scalaz (17) - Monad: The Functional State Type - State Monad

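We often say that functional programming is all about F[T]. The F can be seen as a computation mode: we compute on T inside the shell of F. In principle, the running state of a functional program should also live inside that shell and be updated inside F[]. So, just as we compute on T values functionally, we should have a functional way of updating program state. Earlier we introduced the Writer Monad. Writer also maintains a log inside F[], which is arguably one form of state maintenance. But Writer's log is a Monoid-like type that only supports the Semigroup operation a |+| b, so all it can do is accumulate by appending one log segment to another. WriterT looks like this: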

 final case class WriterT[F[_], W, A](run: F[(W, A)]) { self => ...

Writer is the special case of WriterT where F[_] is Id, so its shape can be viewed as:

 final case class Writer[W, A](run: (W, A)) { self =>

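Note the (W, A) pair. It is typical of how state is maintained in functional programming. Because FP insists on immutable data, state is never updated in place: a computation works from the current state value and must return a new one. In Writer every computation returns a W alongside its result A, and since Writer is a Monad, flatMap carries W from one computation to the next. We can see this in WriterT's flatMap: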

    def flatMap[B](f: A => WriterT[F, W, B])(implicit F: Bind[F], s: Semigroup[W]): WriterT[F, W, B] =
      flatMapF(f.andThen(_.run))

    def flatMapF[B](f: A => F[(W, B)])(implicit F: Bind[F], s: Semigroup[W]): WriterT[F, W, B] =
      writerT(F.bind(run){wa =>
        val z = f(wa._2)
        F.map(z)(wb => (s.append(wa._1, wb._1), wb._2))
      })

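The flatMapF function above combines the W of the previous computation with the W of the next one through the Semigroup operation s.append(wa._1, wb._1). The hallmark of Writer is the (W, A) result type: the state is returned together with the computed value. The only thing we can do to that state, though, is append to it. As mentioned before, Writer can be understood as a special kind of state maintenance aimed purely at updating a log. So what does a real state type, the State Monad, look like? Let's first see how State is defined (scalaz/package.scala):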

    type StateT[F[_], S, A] = IndexedStateT[F, S, S, A]
    type IndexedState[-S1, S2, A] = IndexedStateT[Id, S1, S2, A]
    /** A state transition, representing a function `S => (S, A)`. */
    type State[S, A] = StateT[Id, S, A]

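State is the special case of StateT with F = Id, and StateT in turn is the special case of IndexedStateT with S1 = S2. So let's start from the most general type, IndexedStateT. Here is its definition (scalaz/StateT.scala):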

    trait IndexedStateT[F[_], -S1, S2, A] { self =>
      /** Run and return the final value and state in the context of `F` */
      def apply(initial: S1): F[(S2, A)]

      /** An alias for `apply` */
      def run(initial: S1): F[(S2, A)] = apply(initial)

      /** Calls `run` using `Monoid[S].zero` as the initial state */
      def runZero[S <: S1](implicit S: Monoid[S]): F[(S2, A)] =
        run(S.zero)

      /** Run, discard the final state, and return the final value in the context of `F` */
      def eval(initial: S1)(implicit F: Functor[F]): F[A] =
        F.map(apply(initial))(_._2)

      /** Calls `eval` using `Monoid[S].zero` as the initial state */
      def evalZero[S <: S1](implicit F: Functor[F], S: Monoid[S]): F[A] =
        eval(S.zero)

      /** Run, discard the final value, and return the final state in the context of `F` */
      def exec(initial: S1)(implicit F: Functor[F]): F[S2] =
        F.map(apply(initial))(_._1)

      /** Calls `exec` using `Monoid[S].zero` as the initial state */
      def execZero[S <: S1](implicit F: Functor[F], S: Monoid[S]): F[S2] =
        exec(S.zero)
    ...

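IndexedStateT's abstract function is apply(initial: S1): F[(S2, A)]. Its shape is S1 => F[(S2, A)]: take an S1 and return the result wrapped in F as F[(S2, A)]. If F[] = Id, this is simply S1 => (S2, A). The function run is just apply. It is a state transition function: given a state S1, it computes a value A and a new state S2 and returns both wrapped as F[(S2, A)]. The other functions extract either the computed value or the new state: eval returns F[A] and exec returns F[S2]. Note that F must be a Functor for these, because we need map to reach the value or the state inside F[]. With the State type, F is Id, so run gives (s, a), eval gives a, and exec gives s. Compared with Writer, the State Monad, built around a state transition function, is far more powerful and much more flexible to use.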
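The relationship between run, eval and exec for the F = Id case can be sketched in plain Scala. This is a minimal stand-in, not scalaz's code: `MiniState` and `tick` are hypothetical names introduced only for illustration.

```scala
// A hand-rolled stand-in for State[S, A] (F fixed to Id); not scalaz's implementation.
object RunEvalExecSketch {
  final case class MiniState[S, A](run: S => (S, A)) {
    // Run and keep only the computed value.
    def eval(initial: S): A = run(initial)._2
    // Run and keep only the final state.
    def exec(initial: S): S = run(initial)._1
  }

  // Returns the current counter as the value and increments the state.
  val tick: MiniState[Int, Int] = MiniState[Int, Int](s => (s + 1, s))

  val both: (Int, Int) = tick.run(41)  // (new state, value)
  val value: Int       = tick.eval(41) // value only
  val state: Int       = tick.exec(41) // new state only
}
```

Here `both` pairs the new state with the value, while `eval` and `exec` each project one side of that pair.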

Now let's look at IndexedStateT's map and flatMap:

    def map[B](f: A => B)(implicit F: Functor[F]): IndexedStateT[F, S1, S2, B] = IndexedStateT(s => F.map(apply(s)) {
      case (s1, a) => (s1, f(a))
    })

    def flatMap[S3, B](f: A => IndexedStateT[F, S2, S3, B])(implicit F: Bind[F]): IndexedStateT[F, S1, S3, B] = IndexedStateT(s => F.bind(apply(s)) {
      case (s1, a) => f(a)(s1)
    })

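Pay special attention to flatMap: F must have a Bind instance (every Monad does), so that when two IndexedStateT values are chained, their state transition functions S1 => F[(S2, A)] run one after the other: first apply(s), then f(a)(s1).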
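For the F = Id case, the way flatMap threads state from one step to the next can be sketched as follows. Again this is a hand-rolled illustration rather than scalaz's code; `MiniState` and `append` are hypothetical names.

```scala
// Sketch of state threading through map/flatMap (F fixed to Id); not scalaz's code.
object FlatMapSketch {
  final case class MiniState[S, A](run: S => (S, A)) {
    def map[B](f: A => B): MiniState[S, B] =
      MiniState[S, B] { s =>
        val (s1, a) = run(s)
        (s1, f(a))
      }
    // Run this step, then pass its value to f and its state to the step f returns.
    def flatMap[B](f: A => MiniState[S, B]): MiniState[S, B] =
      MiniState[S, B] { s =>
        val (s1, a) = run(s)
        f(a).run(s1)
      }
  }

  // Appends a word to the log (the state) and returns its length (the value).
  def append(word: String): MiniState[List[String], Int] =
    MiniState[List[String], Int](log => (log :+ word, word.length))

  val program: MiniState[List[String], Int] = for {
    a <- append("state")
    b <- append("monad")
  } yield a + b

  val result: (List[String], Int) = program.run(Nil)
}
```

The second `append` sees the log already extended by the first one, even though no state is passed explicitly inside the for-comprehension.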

As you might expect, an IndexedStateT is built by passing in a state transition function S1 => F[(S2, A)]:

    object IndexedStateT extends StateTInstances with StateTFunctions {
      def apply[F[_], S1, S2, A](f: S1 => F[(S2, A)]): IndexedStateT[F, S1, S2, A] = new IndexedStateT[F, S1, S2, A] {
        def apply(s: S1) = f(s)
      }
    }

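The function f passed in implements the abstract function apply (and therefore run, its alias), which makes the IndexedStateT concrete.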

A State Monad needs a set of methods for reading, writing and passing state. They can be found in the MonadState trait (scalaz/MonadState.scala):

    trait MonadState[F[_,_],S] extends Monad[({type f[x]=F[S,x]})#f] {
      def state[A](a: A): F[S, A] = bind(init)(s => point(a))
      def constantState[A](a: A, s: => S): F[S, A] = bind(put(s))(_ => point(a))
      def init: F[S, S]
      def get: F[S, S]
      def gets[A](f: S => A): F[S, A] = bind(init)(s => point(f(s)))
      def put(s: S): F[S, Unit]
      def modify(f: S => S): F[S, Unit] = bind(init)(s => put(f(s)))
    }

    object MonadState {
      def apply[F[_,_],S](implicit F: MonadState[F, S]) = F
    }

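MonadState is abstract: it extends Monad but does not implement Monad's abstract functions point and bind. These state functions can therefore only be used when a concrete MonadState instance is available, as the implicit parameter F of the apply function in object MonadState shows. scalaz provides such an instance for StateT, so applying the state functions to StateT values poses no problem. Here is how the operations are implemented: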

    private trait StateTMonadState[S, F[_]] extends MonadState[({type f[s, a] = StateT[F, s, a]})#f, S] {
      implicit def F: Monad[F]

      def bind[A, B](fa: StateT[F, S, A])(f: A => StateT[F, S, B]): StateT[F, S, B] = fa.flatMap(f)

      def point[A](a: => A): StateT[F, S, A] = {
        lazy val aa = a
        StateT(s => F.point(s, aa))
      }

      def init: StateT[F, S, S] = StateT(s => F.point((s, s)))

      def get = init

      def put(s: S): StateT[F, S, Unit] = StateT(_ => F.point((s, ())))

      override def modify(f: S => S): StateT[F, S, Unit] = StateT(s => F.point((f(s), ())))

      override def gets[A](f: S => A): StateT[F, S, A] = StateT(s => F.point((s, f(s))))
    }
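With F fixed to Id, the four operations reduce to one-line state transition functions. The sketch below mirrors them on a hand-rolled `MiniState` (a hypothetical stand-in, not scalaz's StateT) to show what each one does to the (state, value) pair.

```scala
// get / put / modify / gets on a hand-rolled state type; not scalaz's code.
object StateOpsSketch {
  final case class MiniState[S, A](run: S => (S, A)) {
    def map[B](f: A => B): MiniState[S, B] =
      MiniState[S, B] { s => val (s1, a) = run(s); (s1, f(a)) }
    def flatMap[B](f: A => MiniState[S, B]): MiniState[S, B] =
      MiniState[S, B] { s => val (s1, a) = run(s); f(a).run(s1) }
  }

  // Read the state: it becomes the value, the state is unchanged.
  def get[S]: MiniState[S, S] = MiniState[S, S](s => (s, s))
  // Overwrite the state, ignoring the old one.
  def put[S](s: S): MiniState[S, Unit] = MiniState[S, Unit](_ => (s, ()))
  // Replace the state with f(state).
  def modify[S](f: S => S): MiniState[S, Unit] = MiniState[S, Unit](s => (f(s), ()))
  // Read a projection of the state without changing it.
  def gets[S, A](f: S => A): MiniState[S, A] = MiniState[S, A](s => (s, f(s)))

  val program: MiniState[Int, String] = for {
    before <- get[Int]
    _      <- put(before + 10)
    _      <- modify[Int](_ * 2)
    shown  <- gets[Int, String](n => s"state is $n")
  } yield s"$before -> $shown"

  val result: (Int, String) = program.run(1)
}
```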

Now we can try some simple State Monad use cases. Let's first imitate operations on an integer stack:

    type Stack = List[Int]
    // note: pop is partial and throws a MatchError on an empty stack
    def pop: State[Stack, Int] = State { case h::t => (t,h) }
                                                    //> pop: => scalaz.State[Exercises.stateT.Stack,Int]
    def push(a: Int): State[Stack, Unit] = State { xs => (a :: xs, ()) }
                                                    //> push: (a: Int)scalaz.State[Exercises.stateT.Stack,Unit]

Both pop and push return a State, and State is a Monad, so we can demonstrate concrete operations with a for-comprehension:

    val prg = for {
      _ <- push(1)
      _ <- push(2)
      _ <- push(3)
      a <- pop
      b <- get
      _ <- pop
      _ <- put(List(9))
    } yield b
        //> prg  : scalaz.IndexedStateT[scalaz.Id.Id,Exercises.stateT.Stack,List[Int],Exercises.stateT.Stack] = scalaz.IndexedStateT$$anon$10@72d1ad2e
    prg.run(List())
        //> res2: scalaz.Id.Id[(List[Int], Exercises.stateT.Stack)] = (List(9),List(2, 1))

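prg is merely a description of what to do, because the state transition function is a lambda s => (s, a). The s here is an unknown that is threaded through each step of the for-comprehension. The result only becomes available once we call run and supply the initial state List(); in other words, the real computation starts only when run is executed. We call run the interpreter of the program prg. This is a typical functional programming pattern: it defers the actual computation to the very end.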
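Because the program is only a description, the same value can be interpreted several times from different initial states. The sketch below illustrates this with a hand-rolled `MiniState` (hypothetical, not scalaz's State); it also uses a total variant of pop that returns an Option, which is an assumption of this sketch rather than the definition used above.

```scala
// One program description, interpreted twice from different initial stacks.
object DeferredRunSketch {
  final case class MiniState[S, A](run: S => (S, A)) {
    def map[B](f: A => B): MiniState[S, B] =
      MiniState[S, B] { s => val (s1, a) = run(s); (s1, f(a)) }
    def flatMap[B](f: A => MiniState[S, B]): MiniState[S, B] =
      MiniState[S, B] { s => val (s1, a) = run(s); f(a).run(s1) }
  }

  type Stack = List[Int]

  def push(a: Int): MiniState[Stack, Unit] =
    MiniState[Stack, Unit](xs => (a :: xs, ()))

  // Total variant of pop: yields None on an empty stack instead of failing.
  val pop: MiniState[Stack, Option[Int]] = MiniState[Stack, Option[Int]] {
    case h :: t => (t, Some(h))
    case Nil    => (Nil, None)
  }

  // Nothing is executed here; prg is just a composed function.
  val prg: MiniState[Stack, Option[Int]] = for {
    _   <- push(1)
    _   <- push(2)
    top <- pop
  } yield top

  // The computation happens only when run is given an initial state.
  val fromEmpty    = prg.run(Nil)
  val fromExisting = prg.run(List(7, 8))
}
```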

Now let's look at how to read and write state:

    val prg = for {
      _ <- push(1)
      _ <- push(2)
      _ <- push(3)
      a <- pop
      b <- get     //(s,s)
      c <- gets { s:Stack => s.length} //(s,s.length)
      _ <- pop
      _ <- put(List(9))  //(List(9),())
      _ <- modify {s:Stack => s ++ List(10) } //(List(9,10),())
    } yield c
        //> prg  : scalaz.IndexedStateT[scalaz.Id.Id,Exercises.stateT.Stack,List[Int],Int] = scalaz.IndexedStateT$$anon$10@72d1ad2e
    prg.run(List())
        //> res2: scalaz.Id.Id[(List[Int], Int)] = (List(9, 10),2)

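StateT in fact also supports filter. Consider the following example (note that the example itself branches with a plain if/else expression on the right-hand side of a generator, not with a guard):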

    val prg1 = for {
      _ <- push(1)
      _ <- push(2)
      _ <- push(3)
      a <- pop
      b <- if (a == 3 ) put(List(1,2,3)) else put(List(2,3,4))
    } yield b
        //> prg1  : scalaz.IndexedStateT[scalaz.Id.Id,Exercises.stateT.Stack,List[Int],Unit] = scalaz.IndexedStateT$$anon$10@3349e9bb
    prg1.run(List())
        //> res4: scalaz.Id.Id[(List[Int], Unit)] = (List(1, 2, 3),())

This is because StateT has a MonadPlus instance (scalaz/StateT.scala):

    private trait StateTMonadStateMonadPlus[S, F[_]] extends StateTMonadState[S, F] with StateTHoist[S] with MonadPlus[({type λ[α] = StateT[F, S, α]})#λ] {
      implicit def F: MonadPlus[F]

      def empty[A]: StateT[F, S, A] = liftM[F, A](F.empty[A])

      def plus[A](a: StateT[F, S, A], b: => StateT[F, S, A]): StateT[F, S, A] = StateT(s => F.plus(a.run(s), b.run(s)))
    }

Of course, the F of this StateT must itself be a MonadPlus instance. liftM lifts a monadic value G[A] into a StateT:

    def liftM[G[_], A](ga: G[A])(implicit G: Monad[G]): StateT[G, S, A] =
      StateT(s => G.map(ga)(a => (s, a)))

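IndexedStateT also has a rather interesting function, lift. In FP style, lift typically acts as a bridge that carries a value from one context into a richer one. Let's start with an example: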

    def incr: State[Int,Int] = State { s => (s+1,s)}  //> incr: => scalaz.State[Int,Int]
    incr.replicateM(10).evalZero[Int]                 //> res3: List[Int] = List(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

The result looks fine. But what if I use it like this:

   incr.replicateM(10000).runZero[Int]             //> java.lang.StackOverflowError

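Ouch, a StackOverflowError. One way to deal with stack overflow is the Trampoline structure, which trades heap for stack. Trampoline is a special case of the Free Monad, which we will cover in detail later. For now, we can use lift to turn F[(S2, A)] into M[F[(S2, A)]]: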

    def lift[M[_]: Applicative]: IndexedStateT[({type λ[α]=M[F[α]]})#λ, S1, S2, A] = new IndexedStateT[({type λ[α]=M[F[α]]})#λ, S1, S2, A] {
      def apply(initial: S1): M[F[(S2, A)]] = Applicative[M].point(self(initial))
    }

We can lift State's result type into Trampoline like this:

    import scalaz.Free.Trampoline
    incr.lift[Trampoline].replicateM(10).evalZero[Int]
                                                    //> res4: scalaz.Free[Function0,List[Int]] = Gosub()

Now let's see whether the StackOverflowError is gone:

    import scalaz.Free.Trampoline
    incr.lift[Trampoline].replicateM(10000).evalZero[Int].run.take(5)
                                                    //> res4: List[Int] = List(0, 1, 2, 3, 4)

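Problem solved. Note the extra run at the end of the expression: the result type is now a Trampoline, so it has to be run. Here is another example, in which we use State to number the elements of a List: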

    def zipIndex[A](xs: List[A]): List[(A, Int)] =
      xs.foldLeft(State.state[Int,List[(A,Int)]](List()))(
        (acc, a) => for {
          xn <- acc
          n <- get[Int]
          _ <- put[Int](n+1)
        } yield (a,n) :: xn).evalZero.reverse       //> zipIndex: [A](xs: List[A])List[(A, Int)]

    zipIndex(1 |-> 5)                               //> res5: List[(Int, Int)] = List((1,0), (2,1), (3,2), (4,3), (5,4))

Likewise, I can lift the result type into Trampoline:

    def zipIndex[A](xs: List[A]): List[(A, Int)] =
      xs.foldLeft(State.state[Int,List[(A,Int)]](List()))(
        (acc, a) => for {
          xn <- acc
          n <- get[Int]
          _ <- put[Int](n+1)
        } yield (a,n) :: xn).lift[Trampoline].evalZero.run.reverse.take(10)
                                                    //> zipIndex: [A](xs: List[A])List[(A, Int)]

    zipIndex(1 |-> 1000)
        //> res5: List[(Int, Int)] = List((1,0), (2,1), (3,2), (4,3), (5,4), (6,5), (7,6), (8,7), (9,8), (10,9))

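It looks as though we can lift into Trampoline, but in practice this still does not solve the StackOverflowError problem. We will leave that detail until we discuss the Free Monad.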
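The heap-for-stack idea behind a trampoline can be illustrated without scalaz, using the Scala standard library's scala.util.control.TailCalls. This is only a sketch of the general technique, not of scalaz's Free-based Trampoline, and it does not address the zipIndex overflow above; `isEven` and `isOdd` are illustrative names.

```scala
import scala.util.control.TailCalls.{TailRec, done, tailcall}

// Mutual recursion that would overflow the stack if written as direct calls.
object TrampolineSketch {
  // Each step returns a heap-allocated description of the next call
  // instead of pushing a new stack frame.
  def isEven(n: Int): TailRec[Boolean] =
    if (n == 0) done(true) else tailcall(isOdd(n - 1))

  def isOdd(n: Int): TailRec[Boolean] =
    if (n == 0) done(false) else tailcall(isEven(n - 1))

  // result interprets the chain of steps in a loop, using constant stack depth.
  val answer: Boolean = isEven(1000000).result
}
```

As with the State programs above, building the `TailRec` value does not run anything; the work happens when `result` interprets it.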

As usual, let's see whether the usage example shipped with scalaz has anything noteworthy (scalaz-example/StateTUsage.scala):

    object StateTUsage extends App {
      import StateT._

      def f[M[_]: Functor] {
        Functor[({type l[a] = StateT[M, Int, a]})#l]
      }

      def m[M[_]: Monad] {
        Applicative[({type l[a] = StateT[M, Int, a]})#l]
        Monad[({type l[a] = StateT[M, Int, a]})#l]
        MonadState[({type f[s, a] = StateT[M, s, a]})#f, Int]
      }

      def state() {
        val state: State[String, Int] = State((x: String) => (x + 1, 0))
        val eval: Int = state.eval("")
        state.flatMap(_ => state)
      }

    }

Wow, what on earth is this doing? All I can tell you, with a shrug, is that it does nothing at all. We can check in the worksheet:

    import Scalaz._
    import scala.language.higherKinds

    def f[M[_]: Functor] {
      Functor[({type l[a] = StateT[M, Int, a]})#l]
    }                                               //> f: [M[_]](implicit evidence$2: scalaz.Functor[M])Unit

    def m[M[_]: Monad] {
      Applicative[({type l[a] = StateT[M, Int, a]})#l]
      Monad[({type l[a] = StateT[M, Int, a]})#l]
      MonadState[({type f[s, a] = StateT[M, s, a]})#f, Int]
    }                                               //> m: [M[_]](implicit evidence$3: scalaz.Monad[M])Unit

    def state() {
      val state: State[String, Int] = State((x: String) => (x + 1, 0))
      val eval: Int = state.eval("")
      state.flatMap(_ => state)
    }                                               //> state: ()Unit

    f[List]
    m[List]
    state

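They all return Unit. I suppose the example merely demonstrates how to obtain StateT instances of some type classes. As we know, once we have the StateT instance of a type class, we can apply that type class's functions to StateT. Here is how to obtain these instances, together with some simple calls to the type class functions: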

    //Functor instance
    val fs = Functor[({type l[a] = StateT[List, Int, a]})#l]
        //> fs  : scalaz.Functor[[a]scalaz.IndexedStateT[[+A]List[A],Int,Int,a]] = scalaz.StateTInstances1$$anon$1@12468a38
    State[Int,Int] {s => (s+1,s)}
        //> res0: scalaz.State[Int,Int] = scalaz.package$State$$anon$3@1aa7ecca
    val st = StateT[List, Int, Int](s => List((s,s)))
        //> st  : scalaz.StateT[List,Int,Int] = scalaz.package$StateT$$anon$1@6572421
    fs.map(st){a => a + 1}.run(0)                   //> res1: List[(Int, Int)] = List((0,1))
    //MonadState instance
    val ms = MonadState[({type f[s, a] = StateT[List, s, a]})#f, Int]
        //> ms  : scalaz.MonadState[[s, a]scalaz.IndexedStateT[[+A]List[A],s,s,a],Int] = scalaz.StateTInstances1$$anon$1@3c19aaa5
    ms.state(1).run(0)                              //> res2: List[(Int, Int)] = List((0,1))
    //Monad instance
    val monad = Monad[({type l[a] = StateT[List, Int, a]})#l]
        //> monad  : scalaz.Monad[[a]scalaz.IndexedStateT[[+A]List[A],Int,Int,a]] = scalaz.StateTInstances1$$anon$1@689604d9
    monad.bind(st){a => StateT(a1 => List((a1,a)))}.run(0)
                                                    //> res3: List[(Int, Int)] = List((0,0))
    //Applicative instance
    val ap = Applicative[({type l[a] = StateT[List, Int, a]})#l]
        //> ap  : scalaz.Applicative[[a]scalaz.IndexedStateT[[+A]List[A],Int,Int,a]] = scalaz.StateTInstances1$$anon$1@18078bef
    ap.point(0).run(0)                              //> res4: List[(Int, Int)] = List((0,0))

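What about the state() function? It is even more puzzling; perhaps it is purely a demonstration of how the types line up. Let's see what it does inside: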

    // def state() {
    // build a State instance; each run appends a ! to the state
    val state: State[String, Int] = State((x: String) => (x + "!", 0))
        //> state  : scalaz.State[String,Int] = scalaz.package$State$$anon$3@1e67b872
    // the computed value never changes
    val eval: Int = state.eval("")                  //> eval  : Int = 0
    // run the state transition function twice in a row: two ! are appended
    state.flatMap(_ => state).run("haha")           //> res0: scalaz.Id.Id[(String, Int)] = (haha!!,0)
    // }

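What about the other examples in StateTUsage.scala? They never get away from List, Tree, ADT and the like, which is too far removed from reality. Let's look at something more practical instead, ideally showing the thought process behind choosing State in a real application. I once saw an example like this: look up information about the followers of a web page; maintain a cache holding whatever was queried within the last 5 minutes; if an entry is not in the cache, read it from the database and update the cache at the same time. We will illustrate with pseudocode. Since we choose an immutable cache, we follow the usual FP practice of passing in the current cache and returning a new one: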

    trait Cache
    trait FollowerState

    def followerState(user: String, cache: Cache): (Cache, FollowerState) = {
        val (c1,ofs) = checkCache(user,cache)  // check whether the cache holds data for user;
                                               // c1 is the new cache with updated hit or miss count
        ofs match {  // found in the cache?
            case Some(fs) => (c1,fs)  // found: return fs and the new cache c1
            case None => retrieve(user,c1) // not found: read it again from the database
        }
    }
    // check the cache and update the cache hit/miss count
    def checkCache(user: String, cache: Cache): (Cache, Option[FollowerState]) = ...
    // read the user's data from the database and add it to the cache
    def retrieve(user: String, cache: Cache): (Cache, FollowerState) = ...

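Isn't this cache just a kind of state? We now need to work out how to use the State Monad to maintain the cache in the functions above. Let's start with a little trick, a transformation of the function signature: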

    def followerState(user: String, cache: Cache): (Cache, FollowerState)
    def followerState(user: String)(cache: Cache): (Cache, FollowerState)
    def followerState(user: String): Cache => (Cache, FollowerState)

First we curry to separate the parameters, then we partially apply, and the new function shape emerges. The same goes for the other two functions:

    def checkCache(user: String): Cache => (Cache, Option[FollowerState]) = ...
    def retrieve(user: String): Cache => (Cache, FollowerState) = ...
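The three-step signature transformation can be tried out on a toy function. In the sketch below, `addLen` and its variants are hypothetical stand-ins for followerState, with an Int playing the role of the cache.

```scala
// From (input, state) => (state, value) to input => (state => (state, value)).
object CurrySketch {
  // Step 1: both parameters in one list. The Int is the "state":
  // a running total of word lengths. Returns (new total, this word's length).
  def addLen(word: String, total: Int): (Int, Int) =
    (total + word.length, word.length)

  // Step 2: the same logic with the parameters split into two lists.
  def addLenCurried(word: String)(total: Int): (Int, Int) =
    addLen(word, total)

  // Step 3: supplying only the first list leaves a function of the state.
  def addLenStep(word: String): Int => (Int, Int) =
    addLenCurried(word)

  val step: Int => (Int, Int) = addLenStep("cache")
  val out: (Int, Int) = step(10)
}
```

The value returned by `addLenStep` has exactly the S => (S, A) shape that a State wraps.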

Now followerState can be written like this:

    def followerState(user: String): Cache => (Cache, FollowerState) = cache => {
        // checkCache and retrieve are now curried, so the cache is applied separately
        val (c1,ofs) = checkCache(user)(cache)
        ofs match {
            case Some(fs) => (c1,fs)
            case None => retrieve(user)(c1)
        }
    }

Isn't Cache => (Cache, FollowerState) now exactly a state transition function? Cache is the state and FollowerState is the computed value. Let's wrap it in a State:

    def followerState(user: String): State[Cache,FollowerState] = State {
      cache => {
          // checkCache and retrieve still have the curried shape at this point
          val (c1,ofs) = checkCache(user)(cache)
          ofs match {
              case Some(fs) => (c1,fs)
              case None => retrieve(user)(c1)
          }
        }
    }

If we adjust the other functions so that they return State as well:

    def checkCache(user: String): State[Cache,Option[FollowerState]] = ...
    def retrieve(user: String): State[Cache,FollowerState] = ...

then we can use a for-comprehension:

    def followerState(user: String): State[Cache,FollowerState] = for {
        optfs <- checkCache(user)
        fs <- optfs match {
            case Some(fs) => State{ s => (s, fs) }
            case None => retrieve(user)
        }
    } yield fs

The program now reads much more clearly. We can call it like this to get the query result:

 followerState("Johny Depp").eval(emptyCache)
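To see the whole flow run end to end, here is a self-contained sketch under stated assumptions: `MiniState` is a hand-rolled stand-in for scalaz's State, `FollowerState` is reduced to an Int, the cache is a Map with hit and miss counters, the database is an in-memory Map, and the 5-minute expiry is left out. None of these details come from the original example.

```scala
// Runnable sketch of the cache lookup; all concrete types here are assumptions.
object CacheSketch {
  final case class MiniState[S, A](run: S => (S, A)) {
    def map[B](f: A => B): MiniState[S, B] =
      MiniState[S, B] { s => val (s1, a) = run(s); (s1, f(a)) }
    def flatMap[B](f: A => MiniState[S, B]): MiniState[S, B] =
      MiniState[S, B] { s => val (s1, a) = run(s); f(a).run(s1) }
  }

  type FollowerState = Int // hypothetical: just a follower count
  final case class Cache(entries: Map[String, FollowerState], hits: Int, misses: Int)
  val emptyCache: Cache = Cache(Map.empty, 0, 0)

  // Hypothetical stand-in for the database.
  val database: Map[String, FollowerState] = Map("alice" -> 3, "bob" -> 5)

  // Look the user up and update the hit/miss counters.
  def checkCache(user: String): MiniState[Cache, Option[FollowerState]] =
    MiniState[Cache, Option[FollowerState]] { c =>
      c.entries.get(user) match {
        case Some(fs) => (c.copy(hits = c.hits + 1), Some(fs))
        case None     => (c.copy(misses = c.misses + 1), None)
      }
    }

  // Read from the "database" (0 for unknown users) and add the entry to the cache.
  def retrieve(user: String): MiniState[Cache, FollowerState] =
    MiniState[Cache, FollowerState] { c =>
      val fs = database.getOrElse(user, 0)
      (c.copy(entries = c.entries + (user -> fs)), fs)
    }

  def followerState(user: String): MiniState[Cache, FollowerState] = for {
    optfs <- checkCache(user)
    fs <- optfs match {
      case Some(found) => MiniState[Cache, FollowerState](c => (c, found))
      case None        => retrieve(user)
    }
  } yield fs

  // Two lookups for the same user: the first misses, the second hits.
  val twoLookups: MiniState[Cache, (FollowerState, FollowerState)] = for {
    a <- followerState("alice")
    b <- followerState("alice")
  } yield (a, b)

  val (finalCache, values) = twoLookups.run(emptyCache)
}
```

The cache is never passed explicitly between the two lookups; flatMap threads it, which is the point of moving from Cache => (Cache, A) functions to State.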
