Arc Forum

2 points by Pauan 4409 days ago

I would like to point out that although it's a Pratt parser, it's been specifically modified to work better with Lisps, meaning it operates on lists of symbols rather than on a single expression. I have not seen another parser like it.

This makes it much more powerful and also much easier to use. With the Nulan parser, adding new syntax is as easy as writing a macro!

For instance, in Nulan, the "->" syntax is used for functions:

  foo 1 2 3 -> a b c
    a + b + c

The above is equivalent to this Arc code:

  (foo 1 2 3 (fn (a b c)
    (+ a b c)))

And here's how you can implement the "->" syntax in Nulan:

  $syntax-rule "->" [
    priority 10
    order "right"
    parse -> l s {@args body}
      ',@l (s args body)
  ]

As you can see, it's very short, though it might seem cryptic if you don't understand Nulan. Translating it into Arc, it might look like this:

  (syntax-rule "->"
    priority 10
    order "right"
    parse (fn (l s r)
            (with (args (cut r 0 -1)
                   body (last r))
              `(,@l (,s ,args ,body)))))

Here's how it works: the parser starts with a flat list of symbols. It then traverses the list looking for symbols that have a "parse" function.

It then calls the "parse" function with three arguments: everything to the left of the symbol, the symbol, and everything to the right of the symbol. It then continues parsing with the list that the function returns.
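
To make that step concrete, here is a minimal sketch in Python rather than Nulan. It is not Nulan's actual implementation: the names parse_once, show_args, and rules are made up for illustration, symbols are modeled as Python strings, and priority, order, and the "continue parsing" step are left out on purpose.

```python
def parse_once(tokens, rules):
    """Find the first token that has a rule and let that rule rewrite the list.

    `rules` is a hypothetical table mapping a symbol (a string) to a
    function taking (left, symbol, right) and returning the new list.
    """
    for i, tok in enumerate(tokens):
        if isinstance(tok, str) and tok in rules:
            # Everything to the left, the symbol itself, everything to the right.
            return rules[tok](tokens[:i], tok, tokens[i + 1:])
    return tokens  # no syntax symbols found: the list is returned unchanged


def show_args(left, symbol, right):
    # A toy rule that only exposes the three arguments it was called with.
    return [left, symbol, right]


rules = {"->": show_args}
tokens = ["foo", 1, 2, 3, "->", "a", "b", "c", ["a", "+", "b", "+", "c"]]
result = parse_once(tokens, rules)
```

A real rule would return a rewritten list instead of echoing its arguments, and the parser would keep going with that list.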

So basically the parser is all about manipulating a list of symbols, which is uh... pretty much exactly what macros do.

Going back to the first example, the "parse" function for the "->" syntax would be called with these three arguments:

  1  (foo 1 2 3)
  2  ->
  3  (a b c (a + b + c))

It then destructures the 3rd argument so that everything but the last element is in the variable "args", and the last element is in the variable "body":

  args  (a b c)
  body  (a + b + c)

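It then builds the new list with "quote" (quasiquote in the Arc translation): "l" is spliced in, the symbol, "args", and "body" are wrapped in a nested list, and the result is returned.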

Basically, it transforms this:

  foo 1 2 3 -> a b c (a + b + c)

Into this:

  foo 1 2 3 (-> (a b c) (a + b + c))
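
The same rewrite can be sketched in Python, as a stand-in for the Nulan rule rather than the real thing. The name arrow_rule is hypothetical, symbols are plain strings, and the token list is split by hand instead of by a parser.

```python
def arrow_rule(left, symbol, right):
    # Everything but the last element is the argument list; the last is the body.
    *args, body = right
    # Splice the left side back in and append the nested (symbol args body) list.
    return [*left, [symbol, args, body]]


tokens = ["foo", 1, 2, 3, "->", "a", "b", "c", ["a", "+", "b", "+", "c"]]
i = tokens.index("->")
result = arrow_rule(tokens[:i], tokens[i], tokens[i + 1:])
```

Here result is the nested-list form of foo 1 2 3 (-> (a b c) (a + b + c)).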

As another example, this implements Arc's ":" ssyntax, but at the parser level:

  $syntax-rule ":" [
    priority 100
    order "right"
    delimiter %t
    parse -> l _ r
      ',@l r
  ]

So now this code here:

  foo:bar:qux 1 2 3

Will get transformed into this code here:

  foo (bar (qux 1 2 3))
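
A rough Python sketch of that transformation, under two assumptions that are not spelled out above: that the delimiter setting means foo:bar:qux has already been split into separate tokens, and that order "right" can be modeled by handling the right-hand side first. The names colon_rule and expand_colons are made up for illustration.

```python
def colon_rule(left, _symbol, right):
    # Splice the left side, then append the right side as one nested list.
    return [*left, right]


def expand_colons(tokens):
    """Apply colon_rule at the first ":" after expanding everything to its right."""
    if ":" not in tokens:
        return tokens
    i = tokens.index(":")
    # Assumption: right-to-left order is modeled by recursing into the right side.
    return colon_rule(tokens[:i], ":", expand_colons(tokens[i + 1:]))


result = expand_colons(["foo", ":", "bar", ":", "qux", 1, 2, 3])
```

Here result is the nested-list form of foo (bar (qux 1 2 3)).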

I've never seen another syntax system that's as easy and as powerful as this.

Oh yeah, and there are also two convenience macros:

  $syntax-unary "foo" 20
  $syntax-infix "bar" 10

The above defines "foo" to be a unary operator with priority 20, and "bar" to be an infix operator with priority 10.

Which basically means that...

  1 2 foo 3 4  =>  1 2 (foo 3) 4
  1 2 bar 3 4  =>  1 (bar 2 3) 4
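
Both rewrites can be written as ordinary (left, symbol, right) functions. The Python sketch below is only an illustration: unary_rule, infix_rule, and apply_rule are hypothetical names, and priorities are ignored because each example contains a single operator.

```python
def unary_rule(left, symbol, right):
    # Wrap the operator together with the single item that follows it.
    return [*left, [symbol, right[0]], *right[1:]]


def infix_rule(left, symbol, right):
    # Wrap the operator together with the item before it and the item after it.
    return [*left[:-1], [symbol, left[-1], right[0]], *right[1:]]


def apply_rule(tokens, symbol, rule):
    # Split the flat list around the first occurrence of the symbol.
    i = tokens.index(symbol)
    return rule(tokens[:i], symbol, tokens[i + 1:])


unary_result = apply_rule([1, 2, "foo", 3, 4], "foo", unary_rule)
infix_result = apply_rule([1, 2, "bar", 3, 4], "bar", infix_rule)
```

The two results are the nested-list forms of 1 2 (foo 3) 4 and 1 (bar 2 3) 4.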

Here's a more in-depth explanation of the parser:

https://github.com/Pauan/nulan/blob/780a8f46cb4ff90e849c03ea...

Nulan's parser is powerful enough that almost all of Nulan's syntax can be written in Nulan itself. The only thing that can't be is significant whitespace.

Even the string syntax (using "), whitespace ( ), and the various braces ({[]}) can be changed from within Nulan.