Arc Forumnew | comments | leaders | submitlogin
1 point by akkartik 3560 days ago | link | parent

"I've always used use bracket nesting to determine where the syntax's body begins and ends."

I have no idea what you mean by bracket nesting, or by staging levels. It also wasn't clear what the escape sequence is in the third example.



3 points by rocketnia 3560 days ago | link

"bracket nesting"

Whoops, I can't believe I didn't use the phrase "balanced brackets" instead. ^_^

The following two pieces of text may be similar, but I'd give them significantly different behavior as code:

  (foo a b (bar c d) (baz e) f)
  (foo a b bar c d) (baz e f)
My systems don't provide any way to pass the string "a b bar c d) (baz e f" to an operator.

---

"staging levels"

Staged programming is where a program generates some code to run later on, perhaps as a second program in some sense--especially if that boundary is enforced by a need to serialize, transmit, or sandbox the second program rather than executing it here and now. Staged programming has some implications for syntax, since it's valuable to be able to see the code we're generating.

Most languages use " to denote the beginning and end of a string, so they can't also use " to represent the character " inside the string. This can makes it frustrating to nest code within code. I'll use a JavaScript example.

  > eval( "eval( \"1 + 2\" )" )
  3
  > eval( "eval( \"eval( \\\"eval( \\\\\\\"1 + 2\\\\\\\" )\\\" )\" )" )
  3
While all these stages are JavaScript code, they all effectively use different syntax. It's not easy to copy and paste code from one stage to another.

Suppose we identify the end of the string by looking for a matching bracket, possibly with other pairs of matched brackets in between. I'll use ~< and >~ as example string brackets.

  > eval( ~<eval( ~<1 + 2>~ )>~ )
  3
  > eval( ~<eval( ~<eval( ~<eval( ~<1 + 2>~ )>~ )>~ )>~ )
  3
This fixes the issue. The same code is used at every level.

In JavaScript, technically we can implement delimiters like these if we're open-minded about what a delimiter looks like. We just need a function str() that turns a first-class string into a string literal.

  > str( "abc" )
  "\"abc\""

   Open string:  " + str( "
  Close string:  " ) + "

  > eval( "eval( " + str( "1 + 2" ) + " )" )
  3
  > eval( "eval( " + str( "eval( " + str( "eval( " + str( "1 + 2" ) + " )" ) + " )" ) + " )" )
  3
Now the code is consistent! Consistently infuriating. :-p

In Arc, we delimit code using balanced ( ). The code isn't a string this time, but the use of balanced delimiters has the same advantage.

  > (eval '(eval '(eval '(eval '(+ 1 2)))))
  3
This advantage is crucial in Arc, because any code that uses macros already runs across this issue. A macro call takes an s-expression, which contains a macro call, which takes an s-expression....

Since we're now talking about macros that take strings as input, let's see what happens if Arc syntax is based on strings instead of lists.

  > (let total 0 (each x (list 1 2 3) (++ total x)) total)
  6

  > "let total 0 \"each x \\\"list 1 2 3\\\" \\\"++ total x\\\"\" total"
  6
If we use balanced ( ) to delimit strings, we're back where we started, at least as long as we don't look behind the curtain.

  > (let total 0 (each x (list 1 2 3) (++ total x)) total)
  6
If you want working code for a language like this, look no further than Penknife. :)

---

"It also wasn't clear what the escape sequence is in the third example."

Are you talking about this one?

  (re /\>*/)
The original code would be broken in my approach because it uses ) in the middle of the regex, so the macro's input would stop at "/\". This fix addresses the issue by using a hypothetical escape sequence \> to match a right parenthesis, rather than using the standard escape sequence \).

If you're talking about my Penknife code sample, the "qq." part is quasiquote, and the \, part is unquote. Quasiquotation is relatively complicated here due to the fact that it generates soup, which is like a string with little pieces floating in it. :-p Penknife has no s-expressions, so it was either this foundational kludge or the hard-to-read use of manual AST constructors.

It's hard to count these examples with a whole number. XD Let me know if you were talking about my Blade code sample (the third code block in the post) or my Jisp code sample (third if you count the two example regex fixes separately).

-----

1 point by akkartik 3559 days ago | link

Super useful, thanks. Yes, you correctly picked the code I was referring to :)

That issue with backslashes you're referring to, I've heard it called leaning toothpick syndrome.

-----

2 points by rocketnia 3559 days ago | link

"Yes, you correctly picked the code I was referring to :) "

Which of my guesses was correct? :-p

---

"That issue with backslashes you're referring to, I've heard it called leaning toothpick syndrome."

Nice, I hadn't heard of that! http://en.wikipedia.org/wiki/Leaning_toothpick_syndrome

It might be worth pointing out that the LTS appearing in my examples is more pronounced than it needs to be. The usual escape sequence \\ for backslashes creates ridiculous results like \\\\\\\". If we use \- to escape \ we get the more reasonable \--" instead, and then we can see the nonuniform nesting problem without that other distraction:

  > eval( "eval( \"1 + 2\" )" )
  3
  > eval( "eval( \"eval( \-"eval( \--"1 + 2\--" )\-" )\" )" )
  3
Here's another time I talked about this: http://arclanguage.org/item?id=14915

-----