Arc Forum

2 points by rocketnia 5060 days ago | link | parent

Sorry in advance for getting carried away....

---

"I opted for the "var" notation because I wanted to emphasize that the bindings are to _variables_; not to _values_ as is the case in (some) other lisps. I wanted to leave "let" for the case when you really want a constant."

In that case, why not have let but call it "var"? :-p

It's not that there aren't good reasons to go your way, it's that I don't know them! If it helps, Racket's another lisp where programmers are encouraged to use internal (define ...) forms when possible rather than let forms.

My understanding fails in the same way when it comes to your treatment of looping. In the article you mentioned having a for loop as a primitive, and I'm fine with that up to the point that I understand defining it in terms of lambda would raise issues regarding the JVM's lack of tail calls. But why is 'for a primitive, when 'while can be the primitive and 'for can be a macro that expands to it?

You also said you wanted to give 'for support for break and continue. If you do, I kinda recommend supporting labeled break and continue (like Java does) so that they don't arbitrarily go out of scope in inner loops. But then, with labeled break and continue... why not just have 'goto as the primitive?

Er, I'll table that suggestion for now. ^^; Maybe the primitive could be an expression that supports nonlocal return from within its body. Something like Arc's 'point:

  (fn ()
    (+ "I "
       (point my-return
         (while foo
           (when pop.foo
             (my-return "found it!")))
         "didn't find it!")))
  
  
  ; Translates to JavaScript something like this:
  ; (The Java output would be slightly more verbose.)
  
  (function () {
      var temp_0 = "I ";
      var temp_1;
      point_1: while ( true ) {
          while ( truthy( v_foo ) ) {
              if ( truthy( v_pop( v_foo ) ) ) {
                  temp_1 = "found it!";
                  break point_1;
              }
          }
          temp_1 = "didn't find it!";
          break;
      }
      return plus( temp_0, temp_1 );
  })

In Arc, 'my-return is a continuation. For bullet purposes, I'm thinking of it as a locally scoped special form or macro. If that's too out-of-this-world for anyone, the break syntax could just be (goto my-return "found it!") instead, with 'goto being an additional primitive dedicated to breaking. That's right, I've actually been talking about a form of 'goto this whole time.

With 'while and 'point in the language, 'for could be implemented like so:

  (mac for (init condition step . body)
    `(do ,init
         (point break           ; anaphoric
           (while ,condition
             (point continue    ; anaphoric
               ,@body)
             ,step))))          ; outside 'continue, so continuing still steps
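
For comparison, here is a hand-written JavaScript sketch of what a loop like this might compile to, in the same spirit as the 'point translation above. The function name sumOdds and the labels break_ and continue_ are made up for illustration. The step is placed after the inner labeled block so that a "continue" still runs it, as Java's for does.

```javascript
// Hypothetical hand-expansion of a (for init condition step . body) loop.
// "break_" plays the role of (point break ...), "continue_" of (point continue ...).
function sumOdds(limit, cap) {
    let total = 0;
    let i = 0;                                  // init
    break_: while (i < limit) {                 // condition
        continue_: {
            if (i % 2 === 0) break continue_;   // like (continue nil)
            if (i > cap) break break_;          // like (break nil)
            total += i;
        }
        i += 1;                                 // step runs even after a "continue"
    }
    return total;
}

console.log(sumOdds(100, 6));   // 1 + 3 + 5
```

Since JavaScript allows a label on a plain block, the inner point needs no loop of its own.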

But 'while doesn't need to be a primitive either. The full 'while can be defined in terms of (while t ...), which I call 'forever:

  (mac while (condition . body)
    (w/uniq break
      `(point ,break
         (forever
           (unless ,condition
             (,break nil))     ; or (goto ,break nil)
           ,@body))))

In fact, both 'point and 'forever can be defined in terms of a single axiom, if it comes right down to it:

  (mac point (break . body)
    `(pointforever ,break
       (,break (do ,@body))))  ; or (goto ,break (do ,@body))
  
  (mac forever body
    (w/uniq break
      `(pointforever ,break
         (do ,@body))))
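
To make the layering concrete, here is a rough JavaScript sketch of the same three definitions as ordinary functions. It assumes exceptions as the escape mechanism, which is only one possible strategy and not how the compiled output above works. The names pointforever, point, forever and while_ mirror the macros, and the thunk-passing style is my own stand-in for macro bodies.

```javascript
// The single axiom: run body over and over until someone calls escape(value).
function pointforever(body) {
    const token = {};                           // unique identity for this point
    const escape = (value) => { throw { token, value }; };
    try {
        for (;;) body(escape);
    } catch (e) {
        if (e !== null && typeof e === "object" && e.token === token) return e.value;
        throw e;                                // someone else's escape, or a real error
    }
}

// (point break . body): run body once, escaping with its result.
const point = (body) => pointforever((escape) => escape(body(escape)));

// (forever . body): loop with a label nobody can name.
const forever = (body) => pointforever(() => { body(); });

// (while condition . body) in terms of point and forever.
const while_ = (condition, body) =>
    point((brk) => forever(() => {
        if (!condition()) brk(null);
        body();
    }));

// The "found it!" example from earlier in this post.
function search(items) {
    return "I " + point((myReturn) => {
        while_(() => items.length > 0, () => {
            if (items.pop()) myReturn("found it!");
        });
        return "didn't find it!";
    });
}

console.log(search([7, 0, 0]));
console.log(search([0, 0]));
```

An escape aimed at an outer point passes through inner ones because each catch checks its own token before swallowing anything.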

By the way, everyone feel free to use any of the code in this post for yourself, lol.

1 point by seertaak 5060 days ago | link

> But why is 'for a primitive, when 'while can be the primitive and 'for can be a macro that expands to it?

For no better reason than I wrote the 'for primitive before I added macros!

I agree about labeled break and continue -- I specifically mention them in the article BTW. And to be honest, it has also occurred to me that this is just a local goto. So I have been contemplating adding them.

The "point" approach is very cool. I like that it can translate into JVM features directly while remaining quite general (e.g. it's trivial to implement "return" using point). I think it's pretty easy to implement in the interpreter also, using exceptions (it's not going to win any speed contests), though you'd use labels and gotos when compiling.

Thanks for offering use of the code, I'm going to try it out. When I've got something worth reporting on I'll do so :)

-----

4 points by rocketnia 5059 days ago | link

You haven't responded to the "why not just have let but call it 'var'" part, so I'd like to get another idea out of the way. It's actually a reason I might like having 'var around.

For some background... in JavaScript, every variable declaration's scope is the full range of the function () { ... } block surrounding it. For instance, you can use variables "before" you declare them (in which case they hold the default value, undefined). In certain cases this is pretty annoying, since it means a var inside a loop actually declares one variable that's in scope for all iterations, rather than a variable per iteration, which matters for closures that capture the variable. I just mention this to clarify what's going on.
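
A small sketch of both effects, using only plain var and function scope. The function names are invented for this example.

```javascript
// Using a variable "before" its declaration: it exists, but holds undefined.
function readBeforeDeclare() {
    var seen = later;       // no ReferenceError here
    var later = 1;
    return seen;
}

// A var inside a loop is one variable shared by every iteration.
function sharedAcrossIterations() {
    var getters = [];
    for (var i = 0; i < 3; i++) {
        var current = i;
        getters.push(function () { return current; });
    }
    return getters.map(function (g) { return g(); });
}

// Wrapping the body in a function gives each iteration its own variable.
function freshPerIteration() {
    var getters = [];
    for (var i = 0; i < 3; i++) {
        (function (current) {
            getters.push(function () { return current; });
        })(i);
    }
    return getters.map(function (g) { return g(); });
}

console.log(readBeforeDeclare());        // undefined
console.log(sharedAcrossIterations());   // [ 2, 2, 2 ]
console.log(freshPerIteration());        // [ 0, 1, 2 ]
```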

Suppose there were a special form (varscope <var> <...body...>) which scoped (<var> <name> <val>) as a special form in the body, which both defines and assigns to a variable and returns the new value. One can say (var foo foo) to define a variable without modifying it.

(As with 'point, I'm using a locally scoped label since I have no intention for nesting to put the var form out of scope. Just as with 'point, instead of having a locally scoped special form, we could have two special forms (varscope <env-name> <...body...>) and (var <env-name> <name> <val>).)

With this kind of var form, we have the ability to define a variable closer to where it actually matters to our code, even when we want its scope to be farther away:

  (varscope var
    (var gui (window "Alert!"
               (var cries (label "I'm a dialog. Log! Dialog-log!"))
               (var ok (button "Okay..."))))
    (on click ok
      (if okay-already
        dismiss.gui
        (do (var okay-already t)
            (set-text cries "Dialog? Dial? Diiial!"))))
    show.gui)
  
  ==>
  
  (with (gui nil cries nil ok nil okay-already nil)
    (= gui (window "Alert!"
             (= cries (label "I'm a dialog. Log! Dialog-log!"))
             (= ok (button "Okay..."))))
    (on click ok
      (if okay-already
        dismiss.gui
        (do (= okay-already t)
            (set-text cries "Dialog? Dial? Diiial!"))))
    show.gui)

In a language with unobtrusive table lookup syntax, like Arc or JavaScript, programmers can resort to tables to accomplish the same kind of expressiveness, probably at the cost of static analyzability:

  (let o (obj)
    (= o!gui (window "Alert!"
               (= o!cries (label "I'm a dialog. Log! Dialog-log!"))
               (= o!ok (button "Okay..."))))
    (on click o!ok
      (if o!okay-already
        (dismiss o!gui)
        (do (= o!okay-already t)
            (set-text o!cries "Dialog? Dial? Diiial!"))))
    (show o!gui))
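
The JavaScript flavor of the table version might look something like the sketch below. The widget constructors (makeWindow, makeLabel, makeButton) and the onClick/visible fields are hypothetical stand-ins so that the example runs by itself; no real GUI toolkit is implied.

```javascript
// Hypothetical stand-ins for a GUI toolkit, just enough to run the sketch.
const makeLabel = (text) => ({ kind: "label", text });
const makeButton = (text) => ({ kind: "button", text, onClick: null });
const makeWindow = (title, ...children) =>
    ({ kind: "window", title, children, visible: false });

function buildDialog() {
    const o = {};
    // Each widget is named at the point where it is created, even though
    // the names stay reachable through o for the rest of the function.
    o.gui = makeWindow("Alert!",
        o.cries = makeLabel("I'm a dialog. Log! Dialog-log!"),
        o.ok = makeButton("Okay..."));
    o.ok.onClick = () => {
        if (o.okayAlready) {
            o.gui.visible = false;              // dismiss
        } else {
            o.okayAlready = true;
            o.cries.text = "Dialog? Dial? Diiial!";
        }
    };
    o.gui.visible = true;                       // show
    return o;
}

const dialog = buildDialog();
dialog.ok.onClick();
console.log(dialog.cries.text);
```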

-----

2 points by seertaak 5059 days ago | link

When I first read your earlier question, I thought the question was about the names rather than how you specify the scoping. In any case, I opted for a binding declaration ("var") which consists of a single binding or a series of bindings, rather than the "let" alternative where the bindings are over an explicitly specified body. To my mind, the former approach is simpler, and avoids extra nesting merely to define a new set of variables. It's the approach of e.g. arc and python.

I don't see the gain of the varscope concept over just having "var" and "set" operate on the scope that surrounds the declarations and child scopes. Maybe I'm not understanding something? For example, in the situation that you describe, where you want to define a variable over a "large" scope, but don't actually know what to bind it to until later: what's wrong with just setting the variable to null until you know what its true value should be?

Here's how bullet handles things:

* A variable comes into existence either through a "var" or a "set" declaration. It is an error to have consecutive "var" calls in the same scope. Assignments are much more permissive: obviously you can use any number of them in any scope, and you don't need a "var" to precede the first instance of a "set".

* If you write "var x ..." in a scope, and again in a child scope, then there is a new binding for x in the child scope that shadows the binding in the parent scope. On the other hand, "set x ..." does not create a new binding if one already exists, so it will set x in the parent scope.

* All of this is handled by the Environment class, using a rather pedestrian HashMap<Symbol, Variable>, where Variable is nothing more than a box. Obviously we capture variables that escape from functions when we analyze lambda definitions.

* The Environment functionality is reused to provide python-like modules. There is a special evaluation syntax (basically, it's the same syntax as for array/map access in clojure), which coexists quite nicely with the "." syntax:

    set y 5
    var x 42

    module m
      var x 1
      func dblx ()
        set x: + y: * x 2
      func printX ()
        print "x = " x

    print "x = " x
    print "m.dblx = ": m.dblx
    m.printX
    set m.x 50
    print "m.x = " m.x

    // prints:
    // x = 42
    // m.dblx = 7
    // x = 7
    // m.x = 50
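
The rules in the list above can be sketched in a few lines of JavaScript. This is only my reading of the description, not bullet's actual Environment class (which is Java); the method names declare, assign and get are invented, and a module is modeled simply as a child environment.

```javascript
class Environment {
    constructor(parent = null) {
        this.parent = parent;
        this.vars = new Map();              // name -> { value } box
    }
    // Find the nearest box for name, searching outward through parents.
    find(name) {
        for (let env = this; env !== null; env = env.parent) {
            if (env.vars.has(name)) return env.vars.get(name);
        }
        return null;
    }
    // "var": always binds in this scope; a repeat in the same scope is an error.
    declare(name, value) {
        if (this.vars.has(name)) throw new Error("duplicate var: " + name);
        this.vars.set(name, { value });
        return value;
    }
    // "set": reuse an existing binding (possibly a parent's), else create one here.
    assign(name, value) {
        const box = this.find(name);
        if (box) box.value = value;
        else this.vars.set(name, { value });
        return value;
    }
    get(name) {
        const box = this.find(name);
        if (!box) throw new Error("unbound variable: " + name);
        return box.value;
    }
}

// Mirrors the module example: set y 5, var x 42, then module m with var x 1.
const globalEnv = new Environment();
globalEnv.assign("y", 5);
globalEnv.declare("x", 42);
const m = new Environment(globalEnv);
m.declare("x", 1);
const dblx = () => m.assign("x", m.get("y") + m.get("x") * 2);

console.log(dblx());                // 7: y is found in the parent, x in m
console.log(globalEnv.get("x"));    // 42: the outer x is untouched
```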

An example of its use is in the bullet.bt, where I define a mini module for string functions:

    print: str.joinOn " " "modules" "are" "cool"

By the way, I toyed with removing "var" altogether, but I think I want to keep it; it'll come in handy when I build in an object system. Basically, within an object definition, variables will be member variables and you'll be able to refer to them within member functions ("mfunc" or "meth", not sure yet) either through "this.varname" or just "varname".

Also, in bullet "with" resembles pascal's "with" except that it returns the object it operated on:

    print: with (new ArrayList)
      add 1
      add "foo"
      ...

I'm not sure if I've answered your question. Thanks again for your input, it's awesome getting feedback on design decisions (there are so many decisions to make!).

EDIT: Added an example of "nested" set.

-----

2 points by rocketnia 5059 days ago | link

"and avoids extra nesting merely to define a new set of variables. It's the approach of e.g. arc and python."

Arc uses an extra level of nesting, just like most Schemes and MLs do. Arc's let is different from Scheme's let, but only because it supports destructuring, does only one binding, and uses fewer parentheses.

As for Python's approach where all plain variable assignments define locals except when declared otherwise[1], it rubs me the wrong way. I'm not sure why. (Side note: Now that I've looked up Python scope, it sounds an awful lot like Kernel's mutation policy, where nonlocal variables can't be rebound without using a captured environment. I'm hopeful that this kind of scoping can make fexpr code more efficient, so... it might have similar ramifications for interpreted Python...? Fexprs in Python, scary.)

[1] http://stackoverflow.com/questions/7935966/python-overwritin...

---

"I don't see the gain of the varscope concept over just having "var" and "set" operate on the scope that surrounds the declarations and child scopes. [...] what's wrong with just setting the variable to null [in the outer scope] until you know what its true value should be?"

My one reason for liking var is that you don't have to trek up to the top of a variable's scope to define it. If you do still have to trek outside a child varscope, that's not a full expression of the feature.

In practice we might use the same name for every varscope, having decided that shadowing is an acceptable compromise for being able to copy a variable declaration from one part of the code to another. Given that we're choosing the same names everywhere, we might even have macros that choose them for us.

---

"It is an error to have consecutive "var" calls in the same scope."

I assume you mean multiple vars of the same name in the same scope. Otherwise I'd expect your "module m" example to fall apart. (If func doesn't expand into var, I'd rather it did.)

Whether or not that's what you mean, I don't like that error. In JavaScript, while multiple declarations of a single variable are allowed and commonly discouraged, I occasionally find myself preferring to have two "constant" variables that just so happen to have the same name. It can help emphasize that the code that uses them is almost exactly the same. (What? JS has no macros. :-p )

One consistent example of my rebellion is with loops, where I have no problem using "for ( var i = ..." for multiple loops in a single function.

---

"The Environment functionality is reused to provide python-like modules."

I like it, but I have more extreme recommendations.

I believe it's possible to implement (varscope ...) in a Scheme that has manual access to the compiler. In that language, (varscope foo ...) would compile its body in a local environment where 'foo is a macro that expands to '= and pushes the variable name to a list, and then it would embed the compiled body in a (let ...) form that bound all the listed variables to nil. A similar technique would work to implement your 'module form. (Manual access to the compiler might not be necessary if the Scheme has a suitable concept of locally scoped macros instead.)

While this in itself is nice, it would be cleaner to be able to do this without using mutation. This could be easily accomplished if there were a special compiler utility that took a tuple (varscope-label, how-to-expand-the-varscope-label, code) and returned a tuple (compiled-code, set-of-variable-names).
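
As a rough illustration of the shape of that utility, here is a JavaScript sketch over a toy AST of nested arrays with strings as symbols. It rewrites (label name val) forms into assignments and returns the rewritten code together with the set of names. It skips the how-to-expand argument and knows nothing about macros, quoting or nested scopes, and the names collectVars and expandVarscope are made up.

```javascript
// Returns [rewrittenExpr, namesDeclared] without touching any outer state.
function collectVars(label, expr) {
    if (!Array.isArray(expr)) return [expr, new Set()];
    if (expr[0] === label && expr.length === 3) {
        const [, name, value] = expr;
        const [newValue, inner] = collectVars(label, value);
        return [["=", name, newValue], new Set([name, ...inner])];
    }
    const names = new Set();
    const rewritten = expr.map((sub) => {
        const [newSub, found] = collectVars(label, sub);
        found.forEach((n) => names.add(n));
        return newSub;
    });
    return [rewritten, names];
}

// (varscope label . body) ==> (with (name nil ...) (do . rewritten-body))
function expandVarscope(label, ...body) {
    const [newBody, names] = collectVars(label, ["do", ...body]);
    const bindings = [...names].flatMap((n) => [n, "nil"]);
    return ["with", bindings, newBody];
}

console.log(JSON.stringify(expandVarscope("var",
    ["var", "gui", ["window", ["var", "ok", ["button"]]]],
    ["show", "gui"])));
```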

Keep in mind that by "compiler" I mean whatever handles the phase that expands macros. If you don't expand macros until execution time (in your "interpreter" perhaps), whoops, you've got fexprs. :-p

-----

1 point by seertaak 5059 days ago | link

> My one reason for liking var is that you don't have to trek up to the top of a variable's scope to define it. If you do still have to trek outside a child varscope, that's not a full expression of the feature.

I'm afraid I don't understand what you mean by this. Maybe you could give a short example? As I see it, one way or another you need to mark out the "outer" scope, where you declare the lifetime of the variable (even though you don't know its value yet). Then somewhere in two or more branches of that scope the variable is set and used. IIUC you propose to write "varscope" at the top, and "var" further down. Doesn't that mean that whenever you define a variable, you need to write "varscope" before it? I somehow can't believe that, which makes me think I'm still not grokking!

> I assume you mean multiple vars of the same name in the same scope.

Yes, otherwise you quickly run into trouble with the parent-scope scheme :)

> One consistent example of my rebellion is with loops, where I have no problem using "for ( var i = ..." for multiple loops in a single function.

The way that's handled in bullet is that when we encounter the for, we push a new environment, run the initialization, then push (and subsequently pop -- the interpreter holds a stack of environments representing runtime frames) an environment for each iteration of the loop. So it's ok to use the same variable name in subsequent for loops: the previous instance is not alive by the time the current one is reached.

> If func doesn't expand into var, I'd rather it did.

Here's the definition of "func" in bullet:

    macro func (name args :rest exprs)
      qquote
        set ,name
          fn ,args ,@exprs

So, yes, it's just a var binding. Note that this allows definition of "module" functions.

    func m.foo (): print "m.foo" 
    ==>
    set m.foo: fn (): print "m.foo"

By the way, even macros are defined as a macro:

    var macro
      tfm (name args :rest body)
        qquote: set ,name: tfm ,args ,@body

Macros don't have any functionality analogous to Lambdas to capture variables from enclosing scopes.

As should be clear by now, my implementation is a sort of illegitimate child of fexprs and macros. Basically, I've introduced all the weaknesses of fexprs in return for only some of the gains :)

In bullet, macros are values represented by the Transform class. Their definition is almost identical to Lambdas: they hold their formal parameters and their body as an AST. The only difference to Lambdas is how they're treated for evaluation purposes by the interpreter. Transforms (like Primitives which are basically fsubrs) receive their operands unevaluated. They are expanded at runtime, and the result of the expansion is then evaluated in the lexical environment of the call site. That means you can use macros in higher-order functions; they truly are first class.
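
A toy evaluator along these lines, sketched in JavaScript rather than bullet's Java. Everything here is an assumption made for illustration: expressions are nested arrays, strings are symbols, the environment is a flat Map, and the Transform class only stores an expansion function.

```javascript
// A Transform receives its operands unevaluated and returns new code.
class Transform {
    constructor(expand) { this.expand = expand; }
}

function evaluate(expr, env) {
    if (typeof expr === "string") {                 // symbol lookup
        if (!env.has(expr)) throw new Error("unbound: " + expr);
        return env.get(expr);
    }
    if (!Array.isArray(expr)) return expr;          // numbers, booleans, null
    const [head, ...operands] = expr;
    if (head === "if") {                            // the one built-in special form
        const [test, thenExpr, elseExpr] = operands;
        return evaluate(test, env) ? evaluate(thenExpr, env) : evaluate(elseExpr, env);
    }
    const op = evaluate(head, env);                 // the operator is itself evaluated
    if (op instanceof Transform) {
        // Expand at run time, then evaluate the result at the call site.
        return evaluate(op.expand(operands), env);
    }
    return op(...operands.map((o) => evaluate(o, env)));
}

const env = new Map();
env.set("+", (a, b) => a + b);
env.set("do", (...values) => values[values.length - 1]);
env.set("boom", () => { throw new Error("operand was evaluated"); });
env.set("when", new Transform(([test, ...body]) => ["if", test, ["do", ...body], null]));
// An ordinary function that returns a Transform as a value.
env.set("pick", (flag) => (flag ? env.get("when") : env.get("do")));

console.log(evaluate(["when", false, ["boom"]], env));          // null, boom never runs
console.log(evaluate([["pick", true], true, ["+", 2, 2]], env)); // 4
```

The last line is the first-class part: the operator position holds a call that yields a Transform, which only works because expansion is delayed until run time.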

Until now, this was an artifact of my interpreter design (emphasizing getting something up and running quickly). My intention had been to double back and fix the discrepancy with "real" lisps by doing the standard initial macroexpand traversal of the AST before evaluating. Either that or ditch the interpreter and write a compiler.

However, the material you've presented me with regarding fexprs is truly fascinating (no, I didn't know what they were before I read your post). I've just got to try these fexpr style macros; the idea of just controlling evaluation of operands, but otherwise being just like a regular function is very appealing.

In conclusion, I would appreciate it if you could explain the varscope concept further. Again, what I don't get is whether you need to write "varscope.." before you can bind a variable using "var..". I bang on about that because I'm loath to introduce a feature that imposes such a high overhead for a single variable use. Or do you also have "lets" that work as in scheme?

Also, you kind of lose me in the last two paragraphs. It would help if you could in some sense "sell" your concept to me (please!): what extra bit of power is now available that I can't express in my implementation? Maybe seeing an extra-cool use will help me understand the tradeoff in terms of extra typing for a variable declaration.

-----

3 points by rocketnia 5058 days ago | link

Turns out 'varscope has an inconsistent corner case, the way I was originally thinking about it.

  (mac foo () "macro")
  (mac id-mac (x) x)
  (varscope v
    (id-mac (v foo (fn () "function")))
    (foo))

Should this result in "macro" or "function"? What if we change it up like this?

  (mac foo () "macro")
  (mac id-mac (x) x)
  (varscope v
    (id-mac (v id-mac (fn (x) nil)))
    (id-mac (v foo (fn () "function")))
    (foo))

I'd rather have 'varscope work in a compilation phase (no dependence on fexprs), and I'd rather not make the order of compilation matter, so I'm going to make a very hackish decision: The body of a (varscope ...) form should be compiled as though it put no variables in scope. Macros from the surrounding scope will work even if the local scope shadows them at run time.

By no coincidence, this design compromise is compatible with the hypothetical implementation below. I actually only realized this flaw once I was documenting that implementation. :-p

(By complete coincidence(?), this is similar to Arc 3.1's bug where local variables don't shadow macros. In the hypothetical language(s) I'm talking about, function parameters and let-bound variables would hopefully still shadow macros, so it wouldn't be quite the same.)

---

"IIUC you propose to write "varscope" at the top, and "var" further down. Doesn't that mean that whenever you define a variable, you need to write "varscope" before it?"

Close. You can have more than one variable per varscope. But I'm guessing you knew that. :-p

That means when you define a variable, you don't necessarily need to define a varscope if a suitable one already exists. But I don't expect even a single 'varscope to appear very often in code; instead I expect convenience macros to take care of it.

---

"The way that's handled in bullet ... it's ok to use the same variable name in subsequent for loops: the previous instance is not alive by the time the current one is reached."

That's a good example of when a macro could take care of establishing a varscope.

  (for <init> <condition> <step>
    <...body...>)
  ==>
  (varscope var
    <init>
    (while <condition>
      <...body...>
      <step>))

For the analogous case in JS (or rather a hypothetical JS-like language whose semantics are based on 'varscope), the only thing that establishes a varscope is the "function () {}" syntax. Sibling loops of the form "for ( var i = ..." use the same i because they don't establish a new scope for themselves.

---

"Or do you also have "lets" that work as in scheme?"

Yes, I would have them both. It's hard not to have 'let since it can just be defined as a macro over 'fn. Whether one would be emphasized over the other, I'm not sure.

---

"Also, you kind of lose me in the last two paragraphs."

I was talking about a compiler for varscope bodies. It'll help to back up a bit....

In a language where macros return compiled code rather than code to compile, traditional macros are simple to implement as sugar:

  (mac when (condition . body)
    `(if ,condition (do ,@body)))
  ==>
  (def-syntax when (condition . body) gensym123_static-env
    (compile `(if ,condition (do ,@body)) gensym123_static-env))

If you find this shockingly similar to Kernel-style fexprs, hey, me too. :-p

  (mac when (condition . body)
    `(if ,condition (do ,@body)))
  ==>
  (def-fexpr when (condition . body) gensym123_dynamic-env
    (eval `(if ,condition (do ,@body)) gensym123_dynamic-env))

IMO, the compile phase is just an fexpr eval phase whose result is used as code. Arc macros, which don't have access to the environment, are limited in the same way as pre-Kernel fexprs.

So what I'm suggesting is that in addition to 'compile, we have a second compilation function that lets us compile the body of a (varscope ...) or (module ...) form.

In fact, here's exactly how I'd use it. I'll call it 'compile-w/vars.

  ; I'm assuming 'compile-w/vars, 'def-syntax, and 'compile exist in
  ; Arc.
  ;
  ; I'm also assuming 'mc exists in Arc as an anonymous macro syntax (so
  ; that 'mc is to 'mac as 'fn is to 'def).
  ;
  ; Last but not least, I'm assuming (nocompile <code>) exists in Arc as
  ; a way to embed compiled code inside uncompiled code. When
  ; (nocompile <code>) is compiled in a static environment <env>, it
  ; should associate any free variables in <code> with variables bound
  ; in <env>. To make this happen, both 'compile-w/vars and 'compile
  ; should accept code even if it has free variables, and compiled code
  ; should be internally managed in a format that allows for this kind
  ; of augmentation.
  
  (def-syntax varscope (label . body) env
    ; In case you're not familiar, this is a destructuring use of 'let.
    (let (new-body vars)
           ; NOTE: We compile the body in the *outer* environment, not
           ; the local environment the varscope establishes.
           (compile-w/vars
             label (mc (var val)
                     `(= ,var ,val))
             `(do ,@body) env)
      (compile `(with ,(mappend [do `(,_ nil)] vars)
                  (nocompile ,new-body))
               env)))
  
  (def-syntax anon-module-w/var (label . body) env
    (w/uniq g-table
      (let (new-body vars)
             ; NOTE: We compile the body in the *outer* environment, not
             ; the local environment the module establishes.
             (compile-w/vars
               label (mc (var val)
                       ; Set both a variable and a table entry.
                       `(= (,g-table ',var) (= ,var ,val)))
               `(do ,@body) env)
        (compile `(with (,g-table (obj) ,@(mappend [do `(,_ nil)] vars))
                    (nocompile ,new-body)
                    ,g-table)
                 env))))
  
  ; This anaphorically binds 'var as the module's variable declaration
  ; form.
  (mac module (name . body)
    `(= ,name (anon-module-w/var var ,@body)))

As before, I release this code for anyone to use--or rather to derive actual working code from. :-p No need for credit.

---

"So, yes, it's just a var binding."

That's a var binding even though it uses 'set? I'm confused.

---

"no, I didn't know what [fexprs] were before I read your post"

What post is that?

I had a half-written reply that started with "Come to think of it, you probably do have fexprs," and went on to explain why I suspected it, what they were, and what you might get if you embraced it or rejected it. Should I still post it? It sounds like you understand it already, but it wouldn't do to have a time paradox. :-p

Anyway, since you're an fexpr fan now, I would like to emphasize the other side: The translation of my (point ...) example into imperative code is straightforward to do during a compilation phase, and fexprs get in the way of compilation phases. :)

It may be possible to force one's way through fexprs during a compilation phase too, but I expect that algorithm to look like a cross between a) constant-folding and b) static type inference with dependent types (since eval's return type depends on the input values). Rather than simply using recursion to compile subexpressions, the algorithm would do something more like using recursion together with concurrency, so that some subgoals could wait for information from other subgoals. To complicate things further, if the program uses a lot of mutable variables, the algorithm might not be able to treat them as constants, and it might not get very far unless you run it at run time as a kind of JIT.

I find this pretty intimidating myself. I've made steps toward at least the constant-folding part of this (which I expect will be sufficient for almost all fexpr programs written in a reasonable style), but I've gotten bogged down in not only the difficulty but also my own apathy about fexprs.

-----

1 point by Pauan 5058 days ago | link

"The translation of my (point ...) example into imperative code is straightforward to do during a compilation phase, and fexprs get in the way of compilation phases. :)"

Why not have both? As in, have a way of saying "this should all be done at compile-time" that doesn't involve fexprs at all, or involves a special variant of fexprs. Reminds me of a discussion somewhere about micros (as opposed to macros)...

-----

3 points by rocketnia 5058 days ago | link

"Reminds me of a discussion somewhere about micros (as opposed to macros)..."

This probably isn't what you mean, but...

One of my oldest toy languages (Jisp) had syntactic abstractions I called "micros," and I discuss them here: http://arclanguage.org/item?id=10719

tl;dr: My micros are fexprs that not only leave their arguments unevaluated but also leave them unparsed. The input to a micro is a string.

Much like how I just said macroexpansion in the compilation phase was like fexpr evaluation, what I've pursued with Penknife and Chops is like a compilation phase based on micro evaluation.

---

"Why not have both? As in, have a way of saying "this should all be done at compile-time" that doesn't involve fexprs at all, or involves a special variant of fexprs."

One way to have both is to have two fexpr evaluation phases, one of which we call the compile phase. This is staged programming straightforwardly applied to an fexpr language... and it's as easy as wrapping every top-level expression in (eval ... (current-environment)).

However, that means explicitly building all the code. If you want to call foo at the repl, you can't just say (foo a b c), you have to say (list foo a b c).

With quasiquote it's much easier for the code that builds the code to look readable. So suppose the REPL automatically wraps all your code in (eval `... (current-environment)). Entering (foo a b c) will do the expected thing, and we can say (foo a ,(bar q) c) if we want (bar q) to evaluate at compile time.

Now let's fall into an abyss. Say the REPL automatically detects the number of unquote levels we use in a command, and for each level, it wraps our code in (eval `... (current-environment)) to balance it. Now (foo a b c) will do the expected thing because it's wrapped 0 times, (foo a ,(bar q) c) will do the expected thing because it's wrapped once, and so on. We have as many compile phases as we need.
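
The wrapping rule is easy to state in code. Here is a JavaScript sketch over a toy AST (nested arrays, strings as symbols) in which the reader is assumed to have already turned ,x into ["unquote", x]. The function names unquoteDepth and wrapForRepl are invented, and the sketch ignores the question of nested quasiquotes entirely.

```javascript
// Deepest nesting of unquote forms anywhere in the expression.
function unquoteDepth(expr) {
    if (!Array.isArray(expr)) return 0;
    const inner = Math.max(0, ...expr.map((sub) => unquoteDepth(sub)));
    return expr[0] === "unquote" ? inner + 1 : inner;
}

// Wrap the command in one (eval `... (current-environment)) per unquote level.
function wrapForRepl(expr) {
    let wrapped = expr;
    for (let level = unquoteDepth(expr); level > 0; level--) {
        wrapped = ["eval", ["quasiquote", wrapped], ["current-environment"]];
    }
    return wrapped;
}

console.log(JSON.stringify(wrapForRepl(["foo", "a", "b", "c"])));
console.log(JSON.stringify(wrapForRepl(["foo", "a", ["unquote", ["bar", "q"]], "c"])));
```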

The price is one reserved word: unquote. This would be the one "special variant of fexprs."

-----

1 point by rocketnia 5056 days ago | link

"The price is one reserved word: unquote. This would be the one "special variant of fexprs.""

Possible correction: If any kind of 'quasiquote is ever going to be in the language, it should probably have special treatment so that its own unquotes nest properly with the REPL's meaning of unquote. An alternative is to use a different syntax for non-REPL unquotes (e.g. ~ instead of ,).

Also note that ,foo could be a built-in syntax that doesn't desugar to anything at all (not even using the "unquote" name), instead just causing phase separation in a way that's easy to explain to people who understand 'quasiquote.

-----

1 point by Pauan 5056 days ago | link

"Also note that ,foo could be a built-in syntax that doesn't desugar to anything at all (not even using the "unquote" name), instead just causing phase separation in a way that's easy to explain to people who understand 'quasiquote."

Yes, I currently think that all syntax should be at the reader level, rather than trying to use macros to define syntax. As an example of what I'm talking about, in Nu, [a b c] expands into (square-brackets a b c) letting you easily change the meaning of [...] by redefining the square-brackets macro.

Or the fact that 'foo expands into (quote foo) letting you change the meaning of the quote operator... or the fact that `(foo ,bar) expands into (quasiquote (foo (unquote bar))), etc.

I used to think that was great: hey look I can easily change the meaning of the square bracket syntax! But now I think it's bad. I have both conceptual and practical reasons for thinking this.

---

I'll start with the conceptual problems. In Lisp, there are essentially three major "phases": read-time, compile-time, and run-time. At read-time Lisp will take a stream of characters and convert it into a data structure (often a cons cell or symbol); compile-time is where macros live; and run-time is where eval happens.

Okay, so, when people try to treat macros as the same as functions, it causes problems because they operate at different phase levels, and I think the same exact thing happens when you try to mix read-time and compile-time phases.

---

To discuss those problems, let's talk about practicality. quasiquote in particular is egregiously bad, so I'll be focusing primarily on it, though quasisyntax also suffers from the exact same problems. Consider this:

  `(,foo . ,bar)

You would expect that to be the same as (cons foo bar), but instead it's equivalent to (list foo 'unquote 'bar). And here's why. The above expression is changed into the following at read-time:

  (quasiquote ((unquote foo) . (unquote bar)))

And as you should know, the . indicates a cons cell, which means that the above is equivalent to this:

  (quasiquote ((unquote foo) unquote bar))

Oops. This has caused practical problems for me when writing macros in Arc.
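
The collapse is purely a matter of cons cells, which a few lines of JavaScript can show. The cons, list and show helpers below are throwaway definitions for this example, with symbols represented as strings.

```javascript
const cons = (car, cdr) => ({ car, cdr });
const list = (...items) => items.reduceRight((tail, item) => cons(item, tail), null);

// Print a structure the way a Lisp printer would, using dot notation
// only when the final cdr is not the empty list.
function show(x) {
    if (x === null) return "()";
    if (typeof x !== "object") return String(x);
    const parts = [];
    let cell = x;
    while (cell !== null && typeof cell === "object") {
        parts.push(show(cell.car));
        cell = cell.cdr;
    }
    if (cell !== null) parts.push(".", show(cell));
    return "(" + parts.join(" ") + ")";
}

// ((unquote foo) . (unquote bar)), built exactly as the reader would build it.
const dotted = cons(list("unquote", "foo"), list("unquote", "bar"));
// ((unquote foo) unquote bar), a three-element list.
const flat = list(list("unquote", "foo"), "unquote", "bar");

console.log(show(dotted));
console.log(show(flat));
```

Both lines print the same text because the two values have the same shape; once the reader is done, nothing records that the second unquote came from a comma.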

---

Another problem with this approach is that you're hardcoding symbols, which is inherently unhygienic and creates inconsistent situations that can trip up programmers. Consider this:

  `(,foo (unquote ,bar))

You might expect that to result in the list (list foo (list 'unquote bar)) but instead it results in the list (list foo bar), because the symbol unquote is hardcoded.

---

Yet another problem is that it requires you to memorize all the hard-coded names for all the syntax. You have to remember to never define a function/macro called quote, to never define a function/macro called unquote, to never define a function/macro called square-brackets, etc... which means this will break:

  ; oops, redefined the meaning of the quote syntax
  (let quote ...
    'foo)

When the amount of syntax is small, that's not really a high price to pay, but it is still a price.

---

This whole "read syntax expands into macros" thing is also inconsistent with other syntax. For instance, (1 2 3) isn't expanded by the reader into (list 1 2 3). That is, if you redefine the list function, the meaning of the syntax (1 2 3) doesn't change. But if you redefine the quote macro, then suddenly the syntax 'foo is different.

The same goes for strings. Arc doesn't expand "foo" into (string #\f #\o #\o) either. So redefining the string function doesn't change the meaning of the string syntax. So why are we doing this for only some syntax but not others?

---

All of the above problems go away completely when you just realize that read-time is a separate phase from compile-time. So if you want to change the meaning of the syntax 'foo the solution isn't to redefine the quote macro. The solution is to use a facility designed for dealing with syntax (such as reader macros).

This is just like how we separate compile-time from run-time: you use functions to define run-time stuff, macros to define compile-time stuff, and reader macros to define read-time stuff.

This also means that because the only way to change the syntax is via reader macros (or similar), the language designer is encouraged to provide a really slick, simple, easy-to-use system for extending the syntax, rather than awful kludgy reader macros.

-----

1 point by Pauan 5058 days ago | link

"then push (and subsequently pop -- the interpreter holds a stack of environments representing runtime frames)"

Uh oh, my warning bells went off. If I were you, I'd add some unit tests to verify that closures work properly. In particular, this might very well break in bullet (though I won't know without testing it):

  (def foo (x)
    (fn () x))

  ((foo 4)) -> 4

---

"They are expanded at runtime, and the result of the expansion is then evaluated in the lexical environment of the call site. That means you can use macros in higher-order functions; they truly are first class."

Ewww, runtime macros. I do not like. They combine all the awfulness of macros[1] without any of the benefits of fexprs[1], while also giving up the only benefit macros have[1]. The worst of all worlds, in my opinion.

---

"My intention had been to double back and fix the discrepancy with "real" lisps by doing the standard initial macroexpand traversal of the AST before evaluating."

Good. I think Lisps should either embrace macros (warts and non-first-classness included), or embrace fexprs and dump macros since they're not needed and just get in the way. Naturally, I'm in favor of fexprs unless speed is critical, and even then I'd prefer to just make the interpreter faster rather than dump the elegance of fexprs.

---

"I've just got to try these fexpr style macros; the idea of just controlling evaluation of operands, but otherwise being just like a regular function is very appealing."

It sure is! An example of a very beautiful Lisp that uses fexprs at its very core is Kernel (though it calls them operatives and uses the $vau form to create them):

http://web.cs.wpi.edu/~jshutt/kernel.html

http://www.wpi.edu/Pubs/ETD/Available/etd-090110-124904/unre...

ftp://ftp.cs.wpi.edu/pub/techreports/pdf/05-07.pdf

There are other Lisps that use fexprs (or at least things similar to fexprs) as well, such as Picolisp and newLISP (which erroneously calls them macros), but I'm especially fond of Kernel for many reasons, in part due to its static (lexical) scope.
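
To make "controlling evaluation of operands" concrete, here is a rough sketch in Python. It is not Kernel's $vau and not how bullet works; the evaluator, the Operative class, and the names my_if, noisy and pick are all assumptions invented for this illustration. An operative receives its operands unevaluated along with the caller's environment, and is otherwise an ordinary first-class value.

```python
class Operative:
    """Wraps a Python function that gets raw operands plus the caller's env."""
    def __init__(self, fn):
        self.fn = fn

def evaluate(expr, env):
    if isinstance(expr, str):            # symbols are modelled as strings
        return env[expr]
    if not isinstance(expr, list):       # anything else is a literal
        return expr
    op = evaluate(expr[0], env)
    operands = expr[1:]
    if isinstance(op, Operative):
        return op.fn(operands, env)      # operands are NOT evaluated here
    args = [evaluate(o, env) for o in operands]
    return op(*args)                     # ordinary function: evaluate first

def my_if(operands, env):
    test, then, otherwise = operands
    branch = then if evaluate(test, env) else otherwise
    return evaluate(branch, env)         # only the chosen branch ever runs

calls = []
def noisy(tag):
    calls.append(tag)                    # record which branches were evaluated
    return tag

env = {"if": Operative(my_if), "noisy": noisy, "pick": lambda op: op}

first = evaluate(["if", True, ["noisy", 1], ["noisy", 2]], env)
# The operator position can itself be a computed value:
second = evaluate([["pick", "if"], False, ["noisy", 1], ["noisy", 2]], env)
print(first, second, calls)  # 1 2 [1, 2]
```

In the second call the operative is passed through an ordinary function before being applied, which is the kind of use a preprocessed macro cannot support.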

---

* [1]: I'm only slightly exaggerating... but in all seriousness, first-classness is only one of the (multiple) benefits of fexprs, and even with first-class macros, you still need to worry about hygiene, which is basically a non-issue in Kernel (that is to say, in Kernel, hygiene is so incredibly easy to achieve that it naturally happens, because the language is so incredibly well designed, so I consider this a mostly "solved problem" in Kernel).

Plus, I suspect if you're basically macro-expanding macros at runtime, you'd actually get slightly faster speed with fexprs (not that speed is a huge issue, but it can be an issue, depending on what you want to do, so I mention it for completeness and because I have a personal interest in making powerful things go fast).

As far as I can tell, the only real benefit of macros is that they're always preprocessed, so they only need to macro-expand once. That is also why they're non-first-class.

I suppose a minor benefit is that it allows you to treat macros as basically a template facility, but I find that benefit to be dubious at best, especially since it's so easy to use templating facilities with fexprs (or define your own).

Another minor benefit is that you can macro-expand a macro to do things like code walkers, but... I feel that should be part of a debugger/inspection suite or something.

---

Just to make sure you don't feel like I'm railing on you: it seems to me that you were unaware of fexprs when you designed bullet, which is why bullet has macros rather than fexprs. That's totally fine, I understand. I'm mentioning all these things not only for your benefit, but also for anybody else who might stumble along and read this post.

-----

2 points by seertaak 5058 days ago | link

> In particular, this might very well break in bullet

It works:

    func foo (x): fn () x
    print ((foo 4)) // prints 4

I explained incorrectly: the interpreter env stack is basically a stack of bindings representing both "true" locals (i.e. locals on JVM) and environments representing lexical scopes. The latter are held for instance by functions, macros, modules explicitly, and also get implicitly created as required in e.g. looping primitives.

I'll reply to your other points tomorrow morning! (basically, I agree :))

-----

1 point by Pauan 5058 days ago | link

Nice! So lexical environments do form a proper tree and persist even after the outer function has returned? If so, then that shouldn't be a problem.
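
What "a proper tree that persists" means can be sketched in Python. This is a guess at the general shape, not bullet's actual implementation; Env, Closure and their methods are hypothetical names. Each scope points at its parent, and a closure keeps a reference to the scope it was created in, so that scope stays alive after the outer call returns.

```python
class Env:
    """A lexical scope: local bindings plus a link to the enclosing scope."""
    def __init__(self, parent=None):
        self.vars = {}
        self.parent = parent

    def lookup(self, name):
        env = self
        while env is not None:           # walk up the tree toward the root
            if name in env.vars:
                return env.vars[name]
            env = env.parent
        raise NameError(name)

class Closure:
    """A function value that remembers the Env it was created in."""
    def __init__(self, params, body, env):
        self.params, self.body, self.env = params, body, env

    def call(self, *args):
        frame = Env(parent=self.env)     # child of the defining scope, not the caller's
        frame.vars.update(zip(self.params, args))
        return self.body(frame)

global_env = Env()

# Rough equivalent of (def foo (x) (fn () x))
foo = Closure(
    ["x"],
    lambda env: Closure([], lambda inner: inner.lookup("x"), env),
    global_env,
)

inner_fn = foo.call(4)     # foo has returned, but its frame is still referenced
result = inner_fn.call()
print(result)  # 4
```

If the frame for foo were simply popped and discarded when foo returned, the lookup of x in the inner function would have nowhere to go.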

-----

1 point by Pauan 5059 days ago | link

"whoops, you've got fexprs. :-p"

Not if said "fexprs" don't have access to the dynamic environment, though!

-----

1 point by rocketnia 5059 days ago | link

"I agree about labeled break and continue -- I specifically mention them in the article BTW."

Oh, so you do! XD

I'm surprised and glad you understood the 'point stuff the first time I described it, lol.

-----