Arc Forumnew | comments | leaders | submit | rocketnia's commentslogin

"An implementation of intermediate splicing would be something like[...]"

Which of these interpretations do you mean?

  (list 1 2 (splice 3 4) 5)
  -->
  (list 1 2 3 4 5)
  
  
  (list 1 2 (splice (reverse (list 4 3))) 5)
  -->
  (list 1 2 3 4 5)
I wrote the rest of this post thinking you were talking about the first one, but right at the end I realized I wasn't so sure. :)

---

"Namely, making the lists that form the code/ast have reverse links, so you can mutate above the macro call level, instead of just insert arbitrary code in place."

I'll make an observation so you can see if it agrees with what you're thinking of: The expression "above the macro call level" will always be a function call or a special form, never a macro call. If it were a macro call, we'd be expanding that call instead of this one.

---

For these purposes, it would be fun to have a cons-cell-like data structure with three accessors: (car x), (cdr x), and (parent x). The parent of x is the most recent cons-with-parent to have been constructed or mutated to have x as its car. If this construction or mutation has never happened, the parent is nil.

Then we can have macros take cons-with-parent values as their argument lists, and your macro would look like this:

  (mac splice list
    (= (cdr list) (cdr (parent list)))
    (= (cdr (parent list)) list))
Unfortunately, if we call (list 1 2 (splice 3 4) 5), then when the splice macro calls (parent list), it'll only see ((splice 3 4) 5). If it calls (parent (parent list)), it'll see nil.

---

Suppose we have a more comprehensive alternative that lets us manipulate the entire surrounding expression. I'll formulate it without the need to use conses-with-parents or mutation:

  ; We're defining a macro called "splice".
  ; The original code we're replacing is expr.
  ; We affect 1 level of code, and our macro call is at location (i).
  (mac-deep splice expr (i)
    (let (before ((_ . args) . after)) (cut expr i)
      (join before args after)))
If I were to implement an Arc-like language that supported this, it would have some amusingly disappointing consequences:

  (mac-deep subquote expr (i)
    `(quote ,expr))
  
  
  (list (subquote))
  -->
  (quote (list (subquote)))
  
  
  (do (subquote))
  -->
  ((fn () (subquote)))
  -->
  ((quote (fn () (subquote))))
  
  
  (fn (subquote) (+ subquote subquote))
  -->
  (fn (subquote) (+ subquote subquote))

reply


"The main difference to note in the box formulation is that there is no such thing as an environment."

Doesn't step 1 need to use environment(s)? How else would it turn text into a structure that contains references to previously defined values?

---

"The 'hyperstatic scope' that Pauan keeps mentioning is another way of saying, 'there is no scope', only variables."

The hyper-static global environment (http://c2.com/cgi/wiki?HyperStaticGlobalEnvironment) is essentially a chain of local scopes, each one starting at a variable declaration and continuing for the rest of the commands in the program (or just the file). I think the statement "there is no scope" does a better job of describing languages like Arc, where all global variable references using the same name refer to the same variable.

---

"Flexible scoping becomes possible by specifically referencing the variable in the environment instead of using the default boxed values. They would still be boxes, just referenced by symbol in the environment table."

I don't understand. If we're generating an s-expression and we insert a symbol, that's because we want that symbol to be looked up in the evaluation environment. If we insert a box, that's because we want to look up the box's element during evaluation. Are you suggesting a third thing we could insert here?

Perhaps if the goal is to make this as dynamic as possible, the inserted value should be an object that takes the evaluation environment as a parameter, so that it can do either of the other behaviors as special cases. I did something as generalized as this during Penknife's compilation phase (involving the compilation environment), and this kind of environment-passing is also used by Kernel-like fexprs (vau-calculus) and Christiansen grammars.

---

"Now what I'm wondering is whether there is any useful equivalence to be found between a box and an environment."

I would say yes, but I don't think this is going to be as profound as you're expecting, and it depends on what we mean by this terminology.

I call something an environment when it's commonly used with operations that look vaguely like this:

  String -> VariableName
  (Environment, VariableName) -> Value
  (Environment, Ast) -> Program
Meanwhile, I call something a box when it's primarily used with an operation that looks vaguely like this:

  Box -> Value
This might look meaningless, but it provides a clean slate so we can isolate some impurity in the notion of "operation" itself. When a mutable box is unboxed, it may return different values at different times, depending on the most recent value assigned to that box. Other kinds of boxes include Racket parameters, Racket continuation marks, Racket promises, JVM ThreadLocals, and JVM WeakReferences, each with its own impure interface.

When each entry of an environment needs to have its own self-contained impure behavior, I typically model the environment as a table which maps variable names to individual boxes. The table's get operation is pure (or at least impure in a simpler way), and the boxes' get operation is is impure.

You were wondering about equivalences, and I have two in mind: For any environment, (Environment, VariableName) is a box. For any box, we can consider that box to be a degenerate environment where the notion of "variable name" includes only a single name.

reply

1 point by Pauan 13 days ago | link

"Doesn't step 1 need to use environment(s)?"

I think he's talking about run-time environments a la Kernel, Emacs Lisp, etc.

Nulan and Arc/Nu use a compile-time environment to replace symbols with boxes at compile-time. But that feels quite a bit different in practice from run-time environments (it's faster too).

---

"Are you suggesting a third thing we could insert here?"

Once again, I think he's referring to run-time environments. Basically, what he's saying is that you would use boxes at compile-time (like Nulan), but you would also have first-class environments with vau. The benefit of this system is that it's faster than a naive Kernel implementation ('cause of boxes), but you still have the full dynamicism of first-class run-time environments. I suspect there'll be all kinds of strange interactions and corner cases though.

---

Slightly off-topic, but... I would like to point out that if the run-time environments are immutable hash tables of boxes, you effectively create hyper-static scope, even if everything runs at run-time (no compile-time).

On the other hand, if you create the boxes at compile-time, then you can create hyper-static scope even if the hash table is mutable (the hash table in Nulan is mutable, for instance).

reply


"I'm just lazy enough to wish that someone else had already done so :P"

It's funny, I've done exactly that recently, with the same goal of getting away from JavaScript with as little cruft as possible. My goal was to make a hackish little language (nevertheless better than JS) and then make a better language on top of it.

Awkwardly, by the time I had a nice reader, I realized I didn't have any particular semantics in mind for the little language! So I decided to work on the semantics I really cared about instead. :-p Although this semantics is still incomplete, the reader is there whenever I'm ready for it. Maybe it could come in handy for Wat.

I've put up a demo so you can see the reader in action: http://rocketnia.github.io/era/demos/reader.html

It's part of Era (https://github.com/rocketnia/era), and specifically it depends on the two files era-misc.js and era-reader.js.

It's yet another read-table-based lisp reader. Every syntax, including lists, symbols, and whitespace, is a reader macro. My implementation of the symbol syntax is a bit nontraditional, because I also intend to use it for string literals.

(For future reference, this is the most recent Era commit as of this posting: https://github.com/rocketnia/era/tree/ab4bf206c442ecbc645b38...)

-----

2 points by Pauan 14 days ago | link

You could also use Nulan's parser:

https://github.com/Pauan/nulan/blob/javascript/src/NULAN.par...

I specifically designed it so it can run stand-alone, without any dependencies on the rest of Nulan. It takes a string and returns a JS array of strings, numbers, and Symbols. You can easily write a simple function to transform the output so that it's compatible with wat.

You're also free to modify it, though I'm not sure what license to use for Nulan. It'll probably be either the BSD/ISC license or Public Domain.

-----

3 points by rocketnia 16 days ago | link | parent | on: Regular expressions in Arc

Your train of thought is very similar to a factor in several of my designs over time: Jisp[1], Blade[2], Penknife[3], Chops[4], and now the syntax I'm planning to use with Era[5].

I've always used use bracket nesting to determine where the syntax's body begins and ends. This way, most code requires no escape sequences, and the few times escape sequences are necessary, at least they stand a chance of being consistent when cutting-and-pasting across staging levels.

  (re /\)*/)       ; Broken.
  (re /(?#()\)*/)  ; Fixed with a regex comment. (Does this work?)
  (re /\>*/)       ; Fixed by redesigning the escape sequence.
Reader macros are a different approach, where the entire rest of the file/stream has unknown syntax until the reader reaches that point. That's simple in its own way, but I prefer not to make my code that lopsided in its semantics I guess. :)

(EDIT: Half an hour after I posted this, I realized I had a big mess of previous drafts tacked onto the end. I've deleted those now.)

---

[1] Jisp was one of the first toy languages I made, and it was before I programmed in Arc (or any other lisp). When it encounters (foo a b c), it resolves "foo" to an operator and sends "a b c" to that operator.

  > (if (eq "string (doublequote string)) (exit) 'whoops)
  [The REPL terminates.]
[2] Blade didn't get far off the ground, but it was meant to have a similar parser, with the explicit goal of making it easy to combine several languages into a single compiled program, with an extra emphasis on having no accidental code-order-dependent semantics at the top level. I switched to square brackets-- [foo a b c] --since these didn't require the shift key.

  [groovy
      import com.rocketnia.blade.*
      
      define( [ "out", "sample-var" ], BladeString.of( "sample-val" ) )
  ]
[3] Penknife was meant to be a REPL companion to Blade, and I developed the syntax into a more complicated, Arc-like combination of infix and prefix syntaxes (http://www.arclanguage.org/item?id=13071). Penknife was complete enough that I used it as a static site generator for a while. However, at this point I realized all the custom syntax processing I was doing was really killing the compile-time performance.

  arc.prn.q[Defining fn-iferr.]
  [fun fn-iferr [body catch]
    [defuse body [hf1:if-maybe err meta-error.it
                   catch.err
                   demeta.it]]]
  
  arc.prn.q[Defining iferr.]
  [mac iferr [var body catch]
    qq.[fn-iferr tf0.\,body [tf [\,var] \,catch]]]
[4] Chops is a JavaScript library that achieves Blade-like parsing without any goals for infix treatment or general-purpose programming. I use it as a markup language and a JavaScript preprocessor now that my static site generator runs in the browser.

  $rc.rcPage( "/", $cg.parseIn( [str RocketN[i I]A.com] ),
      "19-Nov-2012", $cg.parseIn( [str 2005[en]2010, 2012] ),
      { "title": "RocketNIA.com, Virtual Index of Ross Angle",
          "breadcrumbs": $cg.parseIn(
              [str RocketN[i I]A.com: Virtual Index of Ross Angle] ) },
      $cg.parse( [str
  
  ((This is the open source version of my site.
  [out http://www.rocketnia.com/ The online version] has a bit more
  content.))
  
  ...
  
      ] ) )
[5] Era is a module system I'm making, and I intend to compile to those modules from a lisp-like language. I've switched back to parentheses-- (foo a b c) --because smartphone keyboards tend to omit square brackets. The code (foo a b c) parses as a four-element list of symbols in the hope of more efficient processing, but the code foo( a b c) parses as a single symbol named "foo( a b c)".

-----

1 point by akkartik 16 days ago | link

The bouncing between parens and square brackets is interesting ^_^ I weakly feel it's not worth optimizing for what's easy to type because things can change (or be changed, with keybindings, etc.) so easily. Better to optimize for how things look. But even there, parens vs brackets is in the eye of the beholder.

-----

1 point by akkartik 16 days ago | link

"I've always used use bracket nesting to determine where the syntax's body begins and ends."

I have no idea what you mean by bracket nesting, or by staging levels. It also wasn't clear what the escape sequence is in the third example.

-----

3 points by rocketnia 16 days ago | link

"bracket nesting"

Whoops, I can't believe I didn't use the phrase "balanced brackets" instead. ^_^

The following two pieces of text may be similar, but I'd give them significantly different behavior as code:

  (foo a b (bar c d) (baz e) f)
  (foo a b bar c d) (baz e f)
My systems don't provide any way to pass the string "a b bar c d) (baz e f" to an operator.

---

"staging levels"

Staged programming is where a program generates some code to run later on, perhaps as a second program in some sense--especially if that boundary is enforced by a need to serialize, transmit, or sandbox the second program rather than executing it here and now. Staged programming has some implications for syntax, since it's valuable to be able to see the code we're generating.

Most languages use " to denote the beginning and end of a string, so they can't also use " to represent the character " inside the string. This can makes it frustrating to nest code within code. I'll use a JavaScript example.

  > eval( "eval( \"1 + 2\" )" )
  3
  > eval( "eval( \"eval( \\\"eval( \\\\\\\"1 + 2\\\\\\\" )\\\" )\" )" )
  3
While all these stages are JavaScript code, they all effectively use different syntax. It's not easy to copy and paste code from one stage to another.

Suppose we identify the end of the string by looking for a matching bracket, possibly with other pairs of matched brackets in between. I'll use ~< and >~ as example string brackets.

  > eval( ~<eval( ~<1 + 2>~ )>~ )
  3
  > eval( ~<eval( ~<eval( ~<eval( ~<1 + 2>~ )>~ )>~ )>~ )
  3
This fixes the issue. The same code is used at every level.

In JavaScript, technically we can implement delimiters like these if we're open-minded about what a delimiter looks like. We just need a function str() that turns a first-class string into a string literal.

  > str( "abc" )
  "\"abc\""

   Open string:  " + str( "
  Close string:  " ) + "

  > eval( "eval( " + str( "1 + 2" ) + " )" )
  3
  > eval( "eval( " + str( "eval( " + str( "eval( " + str( "1 + 2" ) + " )" ) + " )" ) + " )" )
  3
Now the code is consistent! Consistently infuriating. :-p

In Arc, we delimit code using balanced ( ). The code isn't a string this time, but the use of balanced delimiters has the same advantage.

  > (eval '(eval '(eval '(eval '(+ 1 2)))))
  3
This advantage is crucial in Arc, because any code that uses macros already runs across this issue. A macro call takes an s-expression, which contains a macro call, which takes an s-expression....

Since we're now talking about macros that take strings as input, let's see what happens if Arc syntax is based on strings instead of lists.

  > (let total 0 (each x (list 1 2 3) (++ total x)) total)
  6

  > "let total 0 \"each x \\\"list 1 2 3\\\" \\\"++ total x\\\"\" total"
  6
If we use balanced ( ) to delimit strings, we're back where we started, at least as long as we don't look behind the curtain.

  > (let total 0 (each x (list 1 2 3) (++ total x)) total)
  6
If you want working code for a language like this, look no further than Penknife. :)

---

"It also wasn't clear what the escape sequence is in the third example."

Are you talking about this one?

  (re /\>*/)
The original code would be broken in my approach because it uses ) in the middle of the regex, so the macro's input would stop at "/\". This fix addresses the issue by using a hypothetical escape sequence \> to match a right parenthesis, rather than using the standard escape sequence \).

If you're talking about my Penknife code sample, the "qq." part is quasiquote, and the \, part is unquote. Quasiquotation is relatively complicated here due to the fact that it generates soup, which is like a string with little pieces floating in it. :-p Penknife has no s-expressions, so it was either this foundational kludge or the hard-to-read use of manual AST constructors.

It's hard to count these examples with a whole number. XD Let me know if you were talking about my Blade code sample (the third code block in the post) or my Jisp code sample (third if you count the two example regex fixes separately).

-----

1 point by akkartik 16 days ago | link

Super useful, thanks. Yes, you correctly picked the code I was referring to :)

That issue with backslashes you're referring to, I've heard it called leaning toothpick syndrome.

-----

2 points by rocketnia 16 days ago | link

"Yes, you correctly picked the code I was referring to :) "

Which of my guesses was correct? :-p

---

"That issue with backslashes you're referring to, I've heard it called leaning toothpick syndrome."

Nice, I hadn't heard of that! http://en.wikipedia.org/wiki/Leaning_toothpick_syndrome

It might be worth pointing out that the LTS appearing in my examples is more pronounced than it needs to be. The usual escape sequence \\ for backslashes creates ridiculous results like \\\\\\\". If we use \- to escape \ we get the more reasonable \--" instead, and then we can see the nonuniform nesting problem without that other distraction:

  > eval( "eval( \"1 + 2\" )" )
  3
  > eval( "eval( \"eval( \-"eval( \--"1 + 2\--" )\-" )\" )" )
  3
Here's another time I talked about this: http://arclanguage.org/item?id=14915

-----


I assume by "s-expressions" we mean nested lists whose first elements are often symbols which represent operators. Apparently, Ruby already has built-in support for that code representation: http://stackoverflow.com/questions/6894763/extract-the-ast-f...

-----

1 point by Pauan 32 days ago | link

Well then, using that it'd be possible to make Ruby macros.

-----

2 points by rocketnia 31 days ago | link

Hmm, even if the package for doing this comes with Ruby MRI, I'm not sure there's a corresponding out-of-the-box way to take these s-expressions and get running code. But here's a five-year-old article talking about Ruby2Ruby, a library that converts these s-expressions to strings: http://www.igvita.com/2008/12/11/ruby-ast-for-fun-and-profit...

I don't think a third-party tool should count as language support, unless (and inasmuch as) the people maintaining the language regularly go out of their way to avoid breaking that tool.[1][2] This is where I draw the line, because of how it affects backwards compatibility.

Everyday language users tend to think of language updates as backwards-compatible as long as their existing code has the same behavior. But tools like eval(), JavaScript minifiers, and Ruby2Ruby place the bar higher, since their input is arbitrary language code, and it might be written in a newer language version than the tool itself. Incidentally, even a program that calls these tools will continue to work after a language update, as long as the input of these calls is all generated programmatically, rather than directly taken from the program's input.

[1] Well, I guess the term "third-party" might not apply in that case!

[2] I have no idea how much I'm actually talking about Ruby2Ruby, and this Ruby s-expression format in general, when I make this comment.

-----

2 points by rocketnia 36 days ago | link | parent | on: New version of Arc/Nu

Sounds like a good topic, and I'd be interested in hearing to hear your thoughts on this. I've been thinking about the same kind of thing, but I'm looking to use it in a weakly typed subset of a statically typed language. My motivation is mostly to see if we can make it convenient to externally manage the memory needs of otherwise encapsulated programs.

This is probably a discussion for another thread. ^_^;

-----

1 point by rocketnia 36 days ago | link | parent | on: New version of Arc/Nu

I wonder if the compiler can't see through a 'procedure-rename call to realize it can still apply optimizations to the code. Like, maybe it can optimize ((lambda (x) ...) 2) but not ((procedure-rename (lambda (x) ...) 'foo) 2).

-----

2 points by rocketnia 38 days ago | link | parent | on: What is the arity of [uniq]?

"What do people think of this? Basically I want all list operations to treat atoms as degenerate dotted lists. Does this have any adverse implications for other aggregates (tables)?"

I don't see a problem, but I may not be the right person to ask. If it came down to it, I'd be willing to use 'table-pushnew, 'alist-pushnew, 'maxheap-pushnew, and so on. :-p

What other list operations do you have in mind? While 'pushnew makes sense to use with degenerate dotted lists, it's a very special case: It only clobbers the list at a finite depth, and that depth is actually zero. (Or zero in the unmodified list, anyway.)

-----

1 point by akkartik 38 days ago | link

"What other list operations do you have in mind?"

Well, I've already changed reclist and some so things like this work:

  > (find 3 3)
  3
  > (find 3 '(2 . 3))
  3
  > (find 4 '(2 . 3))
  nil
More: https://github.com/nex3/arc/blob/76d078bcd0/arc.arc.t. I expect I've still missed some cases, and I'll keep tweaking them as I run into them until someone objects.

-----

3 points by rocketnia 38 days ago | link

Oh, whoops. I forgot 'pushnew actually does traverse the input list to find out whether the element is new. I was thinking of 'push. XD

I disagree with the notion of membership you're using for dotted lists. If I 'pushnew nil onto (1 2 3 . nil), I want to get (nil 1 2 3 . nil), and it won't work that way if nil is already considered to be a member.

-----

1 point by akkartik 37 days ago | link

Hmm, it seems to work:

  arc> (= x '(1 2 3))
  (1 2 3)
  arc> (pushnew nil x)
  (nil 1 2 3)

-----

2 points by rocketnia 36 days ago | link

I think the membership check you're using is like this:

  (The final cdr is an element iff it isn't nil.)
  If the list is...
    A cons cell:
      If the car is the value we're looking for, succeed. Otherwise,
      continue by searching the cdr.
    Nil:
      Fail.
    A non-cons, non-nil value:
      Continue by comparing the list to the value we're looking for.
I feel we could simplify this specification by rewriting the last two cases in one of these ways, ordered from my least to most favorite:

  (The final cdr is an element.)
  (This breaks (pushnew nil ...) in existing Arc code.)
  ...
    A non-cons value:
      Continue by comparing the list to the value we're looking for.
  
  (The final cdr is not an element.)
  ...
    A non-cons value:
      Fail.
  
  (The final cdr is always nil and is not an element.)
  (Arc 3.1 already uses this.)
  ...
    Nil:
      Fail.
    A non-cons, non-nil value:
      Raise an exception.
By using the "element iff it isn't nil" approach, you're able to use 'pushnew to traverse the simple argument lists you build as intermediate results of that 'make-br-fn implementation. But I don't know if it's worthwhile to complicate the notion of "list" just to accommodate a special case of argument lists.

-----

1 point by akkartik 36 days ago | link

Yeah, that's a valid summary of what I've done.

"I don't know if it's worthwhile to complicate the notion of "list" just to accommodate a special case of argument lists."

Yeah I see your point. My aim was to extend the correspondence between the syntax for vararg params, rest params and lists of regular params. But those features merely match a (proper) list of args to some template. I'm entangling the template with the notion of a list. Hmm..

-----

3 points by rocketnia 40 days ago | link | parent | on: What is the arity of [uniq]?

"Frankly, I think that bracket functions should always have at least one argument permitted."

Groovy does this. The Closure syntax { ... } is shorthand for { it = null -> ... }, which is a one-argument Closure whose argument "it" defaults to null. When I moved from Groovy to Arc, I missed this.

-----

3 points by rocketnia 40 days ago | link | parent | on: What is the arity of [uniq]?

"Well, how did the change not break setforms in arc.arc? Was the original definition for make-br-fn silently overwritten somewhere?"

Yes. The [...] syntax is redefined in make-br-fn.arc after arc.arc is loaded.

I described my opinion of this at https://github.com/nex3/arc/commit/957501d81bafecab070268418.... It did break my code, but I don't mind breaking changes these days (maybe because I'm hardly a user!), and I think a:b syntax is already more useful than either version of [...].

-----

More