JavaScript Compiler, js.arc (w/ arc.js)
10 points by evanrmurphy 5039 days ago | 10 comments
I've been working on a JavaScript compiler for Arc. It is far from finished, but I've already been sitting on it for a while and know some peer review would be a good idea, or as akkartik put it: "Seems stupid to be working alone like an alchemist when I have the perfect community to get feedback from." [http://arclanguage.org/item?id=11869]

To start with something familiar, here's js.arc (with html.arc) doing the Arc Challenge:

  (defop said ()
    (tag script
      (js `(def submit ()
             (= foo (document.getElementById "foo")
                document.body.innerHTML 
                ,(tostring
                   (jslink "click here"
                     `(= document.body.innerHTML
                       (+ |\'you said: \'| foo.value)))))
             nil)))
    (inputid "foo")
    (jsbut `(submit)))
This is kludgier than it could be,

  (defop said ()
    (inputid "foo")
    (jsbut `(= document.body.innerHTML
               ,(tostring
                 (jslink "click here"
                   `(= document.body.innerHTML
                       (+ "you said: "
                          (document.getElementById "foo").value)))))))
but issues with nested strings and dot ssyntax keep this from being possible at the moment.

Of course neither of the above competes with srv.arc for terseness, but then I guess it's misleading to cast the two as competitors anyway since they're meant to work together.

'jsbut and 'jslink are a couple additions to html.arc just to make the onclick attribute more convenient:

  (def jslink (text (o clk text) (o dest "#"))
    (tag (a onclick (tostring:js clk) href dest) (pr text)))

  (def jsbut (clk (o text "submit") (o name nil))
    (gentag input onclick (tostring:js clk)
            type 'submit name name value text))
To better illustrate how js.arc works, how about seeing some simple expressions compile at the REPL? js.arc currently uses stdout like html.arc rather than return values, so I'll omit those for readability:

  arc> (js '1)
  1
  arc> (js "foo")
  'foo'
  arc> (js '(+ 1 1))
  (1+1)
  arc> (js '(+ 1 (/ 2 3) (* 4 5) (mod 6 7)))
  (1+(2/3)+(4*5)+(6%7))
  arc> (js '(fn (x) x))
  function(x){return x;}
  arc> (js '(def foo (x) x))
  function foo(x){return x;}
  arc> (js '(foo 1))
  foo(1)
  arc> (js '(do (foo 1)
                ((fn (x) x))))
  (function(){foo(1);return (function(x){return x;})();})()
  arc> (js '(if a b c d e))
  if(a)b;else if(c)d;else e;
  arc> (js '(and x (or y z)))
  (x&&(y||z))
  arc> (js '(let x 1
              (alert x)))
  (function(x){return alert(x);})(1)
  arc> (js '(with (x 1 y 2)
                (document.write (+ x y))))
  (function(x,y){return document.write((x+y));})(1,2)
  arc> (js `(= x.innerHTML ,(tostring (tag html 
                                        (tag body 
                                          (tag p (pr "hello world")))))))
  (x.innerHTML='<html><body><p>hello world</p></body></html>')nil

And here's the complete source for js.arc:

  ; TODO
  ; fix nested strings/escaping, esp. to work with html.arc
  ; implement 'expand= etc. for more robust '=?
  ; for, while, switch/case, afn/rfn/accum, cons, quote/unquote?
  ; figure out js objects
  ;   make js objects into arc tables so we do (document!getElementById x) for document.getElementById(x)
  ;   or use fns so (document.getElementById x) is really ((document getElementById) x)
  ;   maybe allow dot after fn call like (document.getElementById x).value
  ; triple-check semicolons
  
  (mac w/braces body
    `(do (pr #\{) ,@body (pr #\})))
  
  (mac w/parens body
    `(do (pr #\() ,@body (pr #\))))
  
  (mac w/quotes body
    `(do (pr #\') ,@body (pr #\')))
  
  (mac w/semi body
    `(do ,@body (pr #\;)))
  
  (def js-infix (op args)
    (w/parens (between arg args (js op)
                (js arg))))
  
  (def js-fn (args body)
    (pr "function")
    (w/parens
      (between arg args (pr #\,)
        (js arg)))
    (w/braces
      (each s body
        (w/semi
          (if (is s (last body))
              (do (pr "return ")
                  (js s))
              (js s))))))
  
  (def js-def (name args body)
    (pr "function ")
    (js name)
    (w/parens
      (between arg args (pr #\,)
        (js arg)))
    (w/braces
      (each s body
        (w/semi
          (if (is s (last body))
              (pr "return "))
          (js s)))))
  
  ; old 'js-do used block, fn or block better? seems block more readable but fn matches arc's 'do
  
  (def js-do (args)
    (js `((fn () ,@args))))
  
  ; need to handle case of (len args) < 2?
  
  (def js-if (args)
    (do (pr "if")
        (w/parens (js (car args)))
        (w/semi (js (cadr args)))
        ((afn (xs)
           (if (no xs)
                nil
               (cadr xs)
                (do (pr "else if")
                    (w/parens (js (car xs)))
                    (w/semi (js (cadr xs)))
                    (self (cddr xs)))
               (do (pr "else ")
                   (w/semi (js (car xs)))
                   (self (cddr xs)))))
         (cddr args))))
  
  (def js-when (test body)
    (js `(if ,test (do ,@body))))
  
  (def js-unless (test body)
    (js `(if (no ,test) (do ,@body))))
  
  (def js-with (parms body)
    (js `((fn ,(map1 car (pair parms))
            ,@body)
          ,@(map1 cadr (pair parms)))))
  
  (def js-let (var val body)
    (js `(with (,var ,val) ,@body)))
  
  (def js-withs (parms body)
    (if (no parms) 
        (js `(do ,@body))
        (js `(let ,(car parms) ,(cadr parms) 
               (withs ,(cddr parms) ,@body)))))
  
  (def js-= (args)
    ((afn (args)
       (if (no args)
           nil
           (cadr args)
           (do
             (w/parens
               (js (car args))
               (pr "=")
               (js (cadr args)))
             (if (cddr args)
                 (pr #\;))
             (self (cddr args)))))
     args))
  
  (def js args
   (each s args
    (w/uniq (ga gs)
      (if (isa s 'string)      (w/quotes (pr s))
          (atom s)             (pr s)
          (atom (car s))
           (if (in (car s) '+
                 '- '* '/ '>=
                 '<= '> '<)    (js-infix (car s) (cdr s))
             (case (car s)
               mod             (js-infix '% (cdr s))
               is              (js-infix '=== (cdr s))
               and             (js-infix '&& (cdr s))
               or              (js-infix '\|\| (cdr s))
               fn              (js-fn (cadr s) (cddr s))
               def             (js-def (cadr s) (car:cddr s) (cdr:cddr s))
               do              (js-do (cdr s))
               if              (js-if (cdr s))
               when            (js-when (cadr s) (cddr s))
               unless          (js-unless (cadr s) (cddr s))
               with            (js-with (cadr s) (cddr s))
               let             (js-let (cadr s) (car:cddr s) (cdr:cddr s))
               withs           (js-withs (cadr s) (cddr s))
               =               (js-= (cdr s))
                               (do (js (car s))
                                   (w/parens (between arg (cdr s) (pr #\,)
                                               (js arg))))))
          (is (caar s) 'fn)   (do (w/parens (js-fn (cadr:car s) (cddr:car s)))
                                  (w/parens (between arg (cdr s) (pr #\,)
                                              (js arg))))))))
It was satisfying after I got some axioms laid down to be able to start copying macro definitions almost verbatim from arc.arc. For example, compare the definition of 'with from arc.arc with the above definition of 'js-with, which just passes the body of the former as an argument to 'js.
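As a rough illustration of that rewrite, here is a minimal JavaScript sketch of the same 'with-to-'fn transformation. It is not part of js.arc: `expandWith` is a hypothetical name, s-expressions are modeled as nested arrays with symbols as strings, and the parameter list is assumed to have an even number of entries.

```javascript
// Hypothetical sketch: rewrite (with (x 1 y 2) body...) as
// ((fn (x y) body...) 1 2), using nested arrays for s-expressions.
function expandWith(parms, body) {
  var names = [];
  var values = [];
  // Split the flat (name value name value ...) list into two arrays.
  for (var i = 0; i < parms.length; i += 2) {
    names.push(parms[i]);
    values.push(parms[i + 1]);
  }
  // Build the fn form, then apply it to the values.
  var fnForm = ['fn', names].concat(body);
  return [fnForm].concat(values);
}

var expanded = expandWith(['x', 1, 'y', 2], [['alert', 'x']]);
console.log(JSON.stringify(expanded));
```

The result has the same shape that 'js-with hands back to 'js: a function form in head position followed by the initial values.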

If you look closely at the above definitions, some are insufficient. The definition of 'js-unless depends on 'no, but 'no isn't defined anywhere else in the file. That's because until now I've neglected to mention a small JavaScript library, arc.js, that I've been using in conjunction with js.arc. It defines 't, 'nil, 'cons, 'car, 'cdr and some other arc.arc functions, and I'll post its source as well:

  // js arrays have so many of these functions, wonder if better to use them instead of cons object
  // same with null for nil, true for t, && for and, || for or
  // make null's toString "nil"
  
  var t = true;
  
  // [], false or undefined instead?
  
  var nil = null;
  
  // should be for any number of args
  // redundant because in js.arc now
  
  function is (x,y) {
      if (x == y) { return t; }
      else { return nil; }}
  
  function no (x) { return is (x, nil); }
  
  function isnt (x, y) { return no (is (x, y)); }
  
  function cons (car, cdr) {
      return { car:car,
               cdr:cdr,
               toString: function () {
                             return "(" + this.car + " . " + this.cdr + ")"; },
               type: 'cons'
             }; }
  
  function car  (xs)  { return xs.car; }
  function cdr  (xs)  { return xs.cdr; }
  function caar (xs)  { return car(car(xs)); }
  function cadr (xs)  { return car(cdr(xs)); }
  function cddr (xs)  { return cdr(cdr(xs)); }
  
  function type  (x) { return x.type; }
  function acons (x) { return is (type (x), 'cons'); }
  function atom  (x) { return no (acons (x)); }
  
  function copylist (xs) {
      if (no(xs)) {
          return nil; }
      else { return cons(car(xs), copylist(cdr(xs))); }}
  
  function list () {
      var acc = nil;
      for (var i = arguments.length; i > 0; i -= 1) {
          acc = cons (arguments[i-1], acc); }
      return acc; }
  
  function idfn (x) { return x; }
  
  function map1 (f, xs) {
      if (no (xs)) {
          return nil; }
      else { return cons (f (car (xs)), map1 (f,cdr (xs))); }}
  
  function pair (xs, f) {
      if (!f) { f = list; } // optional arg
      if (no (xs)) {
          return nil; }
      else if (no (cdr (xs))) {
          return list (list (car (xs))); }
      else { return cons (f (car (xs), cadr (xs)),
                          pair (cddr (xs), f)); }}
  
  // breaks on invalid keys?
  
  function assoc (key, al) {
      if (atom (al)) {
          return nil; }
      else if (acons (car (al)) && is (caar (al), key)) {
          return car (al); }
      else { return assoc (key, cdr (al)); }}
  
  function alref (al, key) {
      return cadr (assoc (key, al)); }
  
  // listtab for js arrays instead of hashes
  // shows how to do afn, rfn
  
  function listarray (xs) {
      return (function self (xs, acc) {
          if (no (xs)) {
              return acc; }
          else { return acc.concat (car (xs), self (cdr (xs), acc)); }
      }) (xs, []); }
  
  //function join () {
  //    var args = list.apply (this, arguments);
  //    if (no (args)) {
  //        return nil; }
  //    else {
  //        (function (a) {
  //            if (no (a)) {
  //                join.apply (this, listarray (cdr (args))); }
  //            else { return cons (car (a), join.apply (this, listarray (cdr (a)), listarray (cdr (args)))); }
  //        }) (car (args)); }}
  
  // workaround since above not working
  
  function join () {
      var acc = [];
      for (var i = 0; i < arguments.length; i += 1) {
          acc = acc.concat (listarray (arguments [i])); }
      return acc; }
  
  function rev (xs) {
      return (function self (xs, acc) {
          if (no (xs)) {
              return acc; }
          else { return self (cdr (xs), cons (car (xs), acc)); }
      }) (xs, nil); }
  
  function alist (x) {
      return no (x) || is (type (x), 'cons'); }
  
  // not tested
  
  function reclist (f, xs) {
      return xs && (f (xs) || reclist (f, cdr (xs))); }
  
  // seems to return false too often
  
  function recstring (test, s, start) {
      if (!start) { start = 0; } // optional arg
      return (function self (i) {
          return (i < s.length) && (test (i) || self (i+1));
      }) (start); }
I'll probably end up merging the two files or finding some better way to organize them, because right now the separation feels a bit arbitrary and awkward.


2 points by rocketnia 5038 days ago | link

Very nice! I have a few bits of feedback in no particular order. ^_^

   function join () {
       var args = list.apply (this, arguments);
       if (no (args)) {
           return nil; }
       else {
  -        (function (a) {
  +        return (function (a) {
               if (no (a)) {
  -                join.apply (this, listarray (cdr (args))); }
  +                return join.apply (this, listarray (cdr (args))); }
  -            else { return cons (car (a), join.apply (this, listarray (cdr (a)), listarray (cdr (args)))); }
  +            else { return cons (car (a), join.apply (this, [ cdr (a) ].concat (listarray (cdr (args))))); }
           }) (car (args)); }}
(I think I prefer the other version of this, though. For one thing, it uses constant stack space.)
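For comparison, a loop-based join that still returns a cons list (rather than a JavaScript array) might look roughly like the sketch below. This is an illustration only: `joinLists` and `toArray` are hypothetical names that do not appear in arc.js, and the cons helpers are simplified copies of the arc.js ones.

```javascript
var nil = null;
function cons(a, d) { return { car: a, cdr: d, type: 'cons' }; }
function car(xs) { return xs.car; }
function cdr(xs) { return xs.cdr; }

// Hypothetical iterative join: no recursion, so stack use stays constant.
function joinLists() {
  var items = [];
  // Walk every argument list left to right, collecting the elements.
  for (var i = 0; i < arguments.length; i += 1) {
    for (var xs = arguments[i]; xs !== nil; xs = cdr(xs)) {
      items.push(car(xs));
    }
  }
  // Rebuild a cons list from the back so the order is preserved.
  var acc = nil;
  for (var j = items.length - 1; j >= 0; j -= 1) {
    acc = cons(items[j], acc);
  }
  return acc;
}

// Hypothetical helper for inspecting a cons list as an array.
function toArray(xs) {
  var out = [];
  for (; xs !== nil; xs = cdr(xs)) { out.push(car(xs)); }
  return out;
}

console.log(toArray(joinLists(cons(1, cons(2, nil)), nil, cons(3, nil))));
```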

Also, note that if you're going to define your own version of falsity (no()), then the JavaScript && and || aren't necessarily going to play along with that. They'll treat "" as false, for instance.
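A small sketch of the mismatch, assuming the arc.js definitions of `t`, `nil`, `is` and `no`. The helpers `arcTrue` and `arcAnd` are hypothetical and only illustrate one way a nil-based 'and could be written. Unlike `&&`, `arcAnd` evaluates all of its arguments before it runs, so it does not short-circuit.

```javascript
var t = true;
var nil = null;
function is(x, y) { return x == y ? t : nil; }
function no(x) { return is(x, nil); }

// Hypothetical helper: under arc.js's no(), only nil (null/undefined) is false.
function arcTrue(x) { return no(x) === nil; }

// Hypothetical nil-based 'and: nil at the first nil argument, else the last argument.
function arcAnd() {
  var result = t;
  for (var i = 0; i < arguments.length; i += 1) {
    result = arguments[i];
    if (!arcTrue(result)) { return nil; }
  }
  return result;
}

// JavaScript's && stops at the empty string because "" is falsy in JS...
console.log(JSON.stringify("" && "x"));
// ...while the nil-based version treats "" as true and moves on.
console.log(JSON.stringify(arcAnd("", "x")));
```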

Speaking of which, it seems weird to me that you'd have 'if expand to a statement like this:

  arc> (js '(if a b c d e))
  if(a)b;else if(c)d;else e;
You can have it expand to an expression instead, bringing it closer to the Arc version's functionality:

  (function(){
    if(isnt(nil,a))return b;if(isnt(nil,c))return d;return e;})()
or simply

  (isnt(nil,a)?b:isnt(nil,c)?d:e)
On another note, it would be nifty if you could use Arc macros (or even functions, with an Arc hack to put enough metadata on them) in the s-expressions sent to the compiler. But since the namespace seen by the (js ...) Arc isn't nearly the same as the namespace seen by the raw Arc, it seems like the sort of undertaking that would completely change the structure of the code.

...but issues with nested strings and dot ssyntax keep this from being possible at the moment.

Well, the Scheme reader is going to parse (a (b c).d) as a three-element list the same way as it parses (a (b c) .d). I'm not even sure what kind of type we might expect '(b c).d to be. Nevertheless, I agree it ought to work. :-p

As far as nested strings go, could you elaborate on that?

-----

2 points by akkartik 5037 days ago | link

"it would be nifty if you could use Arc macros (or even functions, with an Arc hack to put enough metadata on them) in the s-expressions sent to the compiler."

I had the same reaction.

"But since the namespace seen by the (js ...) Arc isn't nearly the same as the namespace seen by the raw Arc, it seems like the sort of undertaking that would completely change the structure of the code."

Perhaps I'm missing something. I imagine you could implement a macro, say jsdef, to store translation functions in a table and then replace (def js-fn..) with (jsdef fn..). js would then replace keywords in car position with functions before eval-ing the whole shebang. This way you wouldn't need to update js every time you want to implement a new arc function in js. You'd also be able to avoid quoting when you don't need backquotes.
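To make the table idea concrete, here is a rough JavaScript sketch of a table-driven compiler. It is not how js.arc is written: `jsdef`, `translators` and `compile` are illustrative names, s-expressions are modeled as nested arrays with symbols as strings, string literals are left out, and every 'fn is assumed to have at least one body expression.

```javascript
// Hypothetical table mapping a head symbol to its translation function.
var translators = {};

function jsdef(name, translate) { translators[name] = translate; }

function compile(expr) {
  // Atoms (symbols and numbers) print as themselves.
  if (!Array.isArray(expr)) { return String(expr); }
  var head = expr[0];
  var args = expr.slice(1);
  // Known keyword in head position: look up its translator in the table.
  if (typeof head === 'string' &&
      Object.prototype.hasOwnProperty.call(translators, head)) {
    return translators[head](args);
  }
  // Otherwise compile an ordinary call; wrap a compound callee in parens.
  var callee = Array.isArray(head) ? '(' + compile(head) + ')' : compile(head);
  return callee + '(' + args.map(compile).join(',') + ')';
}

// New forms are registered without touching compile itself.
jsdef('+', function (args) {
  return '(' + args.map(compile).join('+') + ')';
});

jsdef('fn', function (args) {
  var params = args[0];
  var body = args.slice(1).map(compile);
  var last = body.pop();  // the final expression becomes the return value
  var leading = body.map(function (s) { return s + ';'; }).join('');
  return 'function(' + params.join(',') + '){' + leading + 'return ' + last + ';}';
});

console.log(compile(['+', 1, ['foo', 2]]));
console.log(compile([['fn', ['x'], 'x'], 1]));
```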

Hmm, perhaps this runs into a similar problem to my yrc (http://arclanguage.org/item?id=11880) -- since you're doing keyword replace, is there a scenario where the keyword won't show up until after the replace step? My mind is getting bent out of shape thinking about this.

"Very nice!"

Seconded!

-----

1 point by rocketnia 5037 days ago | link

Yeah, I think essentially you could compile Arc to JavaScript approximately the same way it's compiled to Scheme. The issue I was referring to when I was talking about "the namespace[s]" is that a function may be defined in Arc proper but not in js.arc, or conversely, a function may exist on the JavaScript side (like alert()) that has no analogue in Arc. Either the library has to manage that or the programmer does, and it seems like the kind of thing a library should do.

...

Whoa, I'm now realizing just how especially intriguing this is to me; I've been getting into the thick of working on namespace organization for Blade, and this is really relevant for that. How might one go about having a language where programs may have access to significantly different core functionality depending on how they're compiled... but where it could be swapped out for a substitute implementation on a mismatched platform...

Oooh, a whole JavaScript primitive operation suite (indexing, +, -, *, certain global variables, JSON syntax, new, and so forth) could come in a parameter. When you call a 'jsdef function in Arc, the JS-runtime parameter can be obtained from some dynamic binding or global variable, and when you compile a 'jsdef function to JavaScript, references to that parameter can be given special treatment by the compiler. As far as Arc primitive operations go, the Arc-to-JS compiler may be able to detect uses of those operations and translate them so they refer to some JavaScript-side global variable(s) defined in arc.js. I think this could work. ^_^

It isn't especially novel, I suppose, 'cause it still boils down to special-casing in the compiler, but at least it's limited to special-casing the JS-runtime parameter, which is part of the library itself.

To go on, I'm thinking that the primitive operations would be treated as faithfully as possible rather than approximated by similar operations in the other language. For instance, Arc numbers would be annotated strings or something in JavaScript, rather than being demoted to the width of a JavaScript number, and Arc + may throw an error in JavaScript if a bignum arithmetic library is unavailable. More language-lenient utilities can be built up on top of those. If it's annoying to come up with names for those because they clash with all the Arc-native names, well, that could indeed be a problem....

Thoughts?

-----

1 point by evanrmurphy 5038 days ago | link

Thank you for the feedback and for fixing 'join! ^_^

> Also, note that if you're going to define your own version of falsity (no()), then the JavaScript && and || aren't necessarily going to play along with that. They'll treat "" as false, for instance.

Glad you pointed this out, may save me some debugging time. :)

Speaking of creating new structures vs. using built-in ones, I have considered switching to fake conses. That is, instead of there being a separate cons object as there is now, cons, car, cdr etc. would be array manipulations that make JavaScript arrays act just (or almost) like Arc lists. Do you have an opinion on this?

Re: 'if, the current version might seem more intuitive to a JavaScripter, but your approach may be more consistent with the way I'm expanding 'do. And then there's your point, that the latter has a close 1-to-1 correspondence of expressions with Arc while the former breaks up an expression into several statements. (Is this what you meant, or have I misunderstood?)

> As far as nested strings go, could you elaborate on that?

In the first Arc Challenge attempt I posted is the line,

  (+ |\'you said: \'| foo.value)))))
The |\'you said: \'| from that is manual string escaping I'm doing because 'js naively compiles "you said: " to 'you said: ' without considering the need for backslashes. I think it's important to be able to have JavaScript within HTML within JavaScript within HTML within ... within JavaScript, and I'm not sure it's a particularly difficult string escaping problem - I just haven't gotten it working yet.
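As a sketch of the escaping step itself (written in JavaScript for illustration, whereas js.arc would need the equivalent in Arc), a string can be turned into a single-quoted literal by escaping backslashes first and quotes second. `jsQuote` is a hypothetical name, and this covers only the JavaScript layer: escaping for an enclosing HTML attribute would be a separate step.

```javascript
// Hypothetical helper: render a string as a single-quoted JavaScript literal.
function jsQuote(s) {
  var escaped = s
    .replace(/\\/g, '\\\\')  // backslashes first, so later escapes are not doubled
    .replace(/'/g, "\\'")    // then the quote character used as the delimiter
    .replace(/\n/g, '\\n')   // raw newlines are not allowed inside a literal
    .replace(/\r/g, '\\r');
  return "'" + escaped + "'";
}

// One level of quoting, then the same text nested one level deeper.
console.log(jsQuote('you said: '));
console.log(jsQuote(jsQuote('you said: ')));
```

Applying the function once per level of nesting is what replaces the manual |\'you said: \'| in the example above.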

-----

2 points by rocketnia 5037 days ago | link

...array manipulations that make JavaScript arrays act just (or almost) like Arc lists. Do you have an opinion on this?

Well, to have cons produce a JavaScript array would be a bit odd. What happens to improper lists? (Well, Arc chokes on improper lists all the time, so maybe it doesn't matter. :-p )

I think it could be a good long-term idea to support both JavaScript arrays and the Arc cons type you already have. Functions like 'all, 'map, and 'join could work for both kinds of lists, and this way people wouldn't necessarily have to change the way they're using conses, and they could work closer to the metal (or whatever JS is ^_^ ) when that was more useful.
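A rough sketch of what one such dual-purpose function could look like, assuming the cons representation from arc.js. `mapAny` is a hypothetical name, and returning the same kind of list that was passed in is just one possible design choice.

```javascript
var nil = null;
function cons(a, d) { return { car: a, cdr: d, type: 'cons' }; }

// Hypothetical generic map: accepts a JavaScript array or a cons list and
// returns the same kind of list it was given.
function mapAny(f, xs) {
  if (Array.isArray(xs)) {
    return xs.map(function (x) { return f(x); });
  }
  // Cons list (or nil): collect the results, then rebuild from the back.
  var out = [];
  for (var p = xs; p !== nil; p = p.cdr) { out.push(f(p.car)); }
  var acc = nil;
  for (var i = out.length - 1; i >= 0; i -= 1) { acc = cons(out[i], acc); }
  return acc;
}

function double(x) { return x * 2; }

console.log(mapAny(double, [1, 2, 3]));
console.log(mapAny(double, cons(1, cons(2, nil))).car);
```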

...the latter has a close 1-to-1 correspondence of expressions with Arc while the former breaks up an expression into several statements. (Is this what you meant, or have I misunderstood?)

I'm mostly just concerned that you should be able to say things like (let x (if (< foo 2) 0 foo) (* x x)). If (if ...) translates to a statement, that won't work.
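A minimal sketch of generating the expression form, written in JavaScript for illustration. `compileIf` is a hypothetical name, the clauses are assumed to be already-compiled JavaScript source strings, and the `isnt(nil, ...)` test assumes the arc.js helpers are loaded at run time.

```javascript
// Hypothetical sketch: turn the clauses of (if a b c d e) into a nested
// conditional expression instead of an if/else statement.
function compileIf(clauses) {
  if (clauses.length === 0) { return 'nil'; }         // ran out of clauses
  if (clauses.length === 1) { return clauses[0]; }    // trailing else branch
  var test = clauses[0];
  var consequent = clauses[1];
  return '(isnt(nil,' + test + ')?' + consequent + ':' +
         compileIf(clauses.slice(2)) + ')';
}

console.log(compileIf(['a', 'b', 'c', 'd', 'e']));
```

Because the output is an expression, it can sit in argument position, for example as the initial value in a compiled 'let, which the statement form cannot do.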

|\'you said: \'|

Oh, I see now.

Yeah, the string escaping shouldn't be that difficult. I suppose the trickiest part about string escaping is remembering to do it. ^_^

-----

1 point by evanrmurphy 5036 days ago | link

Thanks again for everyone's feedback so far. I have to be without internet for a month but will be back on here afterward.

-----

1 point by random7 5038 days ago | link

Nice idea. Have you read through parenscript, which does something similar based on Common Lisp?

Also, what's the definition of "between"? It doesn't seem to be defined anywhere in arc3 or in js.arc.

-----

1 point by evanrmurphy 5038 days ago | link

Thanks. I have looked at ParenScript, but not as extensively as I probably should. Have you used it, or is there any aspect of their approach you might particularly recommend?

I'm also interested in Scheme2Js and have read much of their paper. [http://www-sop.inria.fr/mimosa/scheme2js/]

'between is courtesy of Andrew Wilcox [http://awwx.ws/between]. Nice catch. :)

-----

2 points by random7 5037 days ago | link

Thanks, it works now. (And thanks for the link to Scheme2js!)

I installed ParenScript, but haven't used it for anything.

I recommend it for having a community that actually uses it for production, and for being around long enough to have tried a variety of different approaches to hosting a lisp on top of JavaScript. It's probably worth reading through their mailing list archives to see how they've evolved.

I also recommend coffee-script, which is sort of a Python/Ruby syntax for JavaScript. Probably the most inspiring idea from coffee-script is that the language is self-hosted (it was initially cross-compiled from Ruby, but is now written in itself.)

-----

1 point by garply 5037 days ago | link

I have looked at parenscript and even started an arcscript which is totally non-functional - I was just translating it to Arc as I went as a way to understand the original codebase. I never finished it, but I'll post what I have on github and maybe we can collaborate.

Edit: pushed to github.

-----