Skip to content

Parsley and dynamic parsers based on the state

In this blog post we will make some improvements to our indent parser. In particular we want to remove the unsettling ???. We will meet some difficulties along the way and we will see a solution to it 😃

Naive attempt

To remind ourselves what the problem was, let's take a look at this:

scala
// It's a little bit different than the _exact_ code from last time
// but please bear with me
val scope: Parsley[Scope] = many(
    state.update((
        many(' ').map(_.length) <~> stmt <~ '\n'
    ).map { res => (s: State) =>
        val sc = s.scope
        val (i, sm) = res
        if i == s.level then
            s.push(sm)
        else if i == s.level + 4 then
            s.newScope(sm)
        else if i == s.level - 4 then
            s.prevScope(sm)
        else
            ??? // or fail()
        s
    }
)) ~> state.get.map { s => s.scope.last }

It works perfectly... when given valid input. It's easy enough to see that if we went into the last case, instead of returning a parsley Failure, we would just throw our hands in the air and give up. Remember that if parsley fails to parse the input, it would do nice error handling for us, so what we really want is probably:

scala
def spaces(x: Int) = string(" ".repeat(x))
val scope: Parsley[Scope] = many(
    state.update((
        (spaces(???) <|> spaces(??? + 4) <|> spaces(??? - 4))
        <~> stmt <~ '\n'
    ).map { res => (s: State) =>
        val sc = s.scope
        val (i, sm) = res
        if i == s.level then
            s.push(sm)
        else if i == s.level + 4 then
            s.newScope(sm)
        else if i == s.level - 4 then
            s.prevScope(sm)
        else
            ??? // safe, never reachable
        s
    }
)) ~> state.get.map { s => s.scope.last }

??? depends on our current indent. A natural thought would be using state.gets, so we can access the indent in a limited scope:

scala
val scope: Parsley[Scope] = many(
    state.update((
        state.gets { s =>
            spaces(s.level) <|> spaces(s.level + 4) <|> spaces(s.level - 4)
        }
        <~> stmt <~ '\n'
    ).map { ... } // omitted
)) ~> ... // omitted

Ah, but the typing wouldn't work out, def gets[B](f: State => B): Parsley[B], we returned a Parsley[String], and Parsley[Parsley[String]] is pretty garbage.

So... Apparently .map and spaces has to be both inside the gets call:

scala
val scope: Parsley[Scope] = many(
    state.update((
        state.gets { s =>
            (spaces(s.level) <|> spaces(s.level + 4) <|> spaces(s.level - 4))
            <~> stmt <~ '\n'
        ).map { ... }
    }
)) ~> ...

No, state.update expects a State, but we are now offering Parsley[State].

If you try some other combinations, like putting update inside gets, you will see that the types would prevent you as well (in this case, update is still given a Parsley[State]).

quote

The reason monads are not supported is because of the Staging: Parsley parsers are compiled ahead of time to produce fast code, but this means the entirely of the parser must be known before any input is provided, ruling out dynamic monadic operations.

OK, this is from parsley haskell, so it might not be true in parsley, and I did not present any proof that you cannot create the parser that we wanted using update and gets. \_ ツ _/

What I can say is that it's not idiomatic, and probably can't be optimized by parsley internally. And I do believe that parsley's type system will prevent you from creating a dynamic parser this way.

But this kind of parser is really easy to write if you go imperative, you get the read from the global state and you can do whatever with it. Do we have to lose this power for the functional interface?

The solution

Well, it turns out the builtin combinator forP fits this use case perfectly.

scala
val scope: Parsley[Scope] = many(
    state.update((
        ((atomic(
            forP[Int](state.gets(_.level + 4), pure(_ > 0), pure(_ - 1))
        ) <|>
        atomic(
          forP[Int](state.gets(_.level), pure(_ > 0), pure(_ - 1))
        ) <|>
        atmoic(
         forP[Int](state.gets(_.level - 4), pure(_ > 0), pure(_ - 1))
        ))
        <~> stmt <~ '\n'
        ).map { ... }
    }
)) ~> ...

So how was it implemented? Is it just that I am lame?

scala
// forP => forP_
def forP_[A](...)(body: Parsley[A] => Parsley[_]): Parsley[Unit] = {
    init.fillRef { ref =>
      lazy val _cond = ref.gets(cond)
      lazy val _step = ref.update(step)
      whenS(_cond, whileS(body(ref.get) *> _step *> _cond))
    }
}
// whenS
def whenS(condP: Parsley[Boolean], thenP: =>Parsley[Unit]): Parsley[Unit] = ifS(condP, thenP, unit)
// ifS
def ifS[A](condP: Parsley[Boolean], thenP: =>Parsley[A], elseP: =>Parsley[A]): Parsley[A] = {
    new Parsley(new frontend.If(condP.internal, thenP.internal, elseP.internal))
}

So that injects an inner instruction, I consider that cheating 😃

Oh well this chapter seems disappointingly short.

Afterthoughts

I wanted to write this post not to show how to do x like in the last one, it's entirely possible that we can implement forP with some kind of recursion with parsley api, and our solution to the initial problem seems pretty superficial: here use this handy api.

It just occurred to me that the fact that the type system prevented me to just hack is intriguing. We all know in theory that types => constraints => finds bugs => good, but this demo in particular is a great lesson for me (hopefully for you too!) how good api guides you.

I probably wouldn't say that had forP doesn't solve my problem perfectly, damn

OK, that's all for this time.