Arc Forum | Wart update

Arc Forum

new | comments | leaders | submit

Wart update

5 points by akkartik 4688 days ago | 34 comments

1. '=' is now equality.

2. Assignment is '<-'.

3. 'or=' is now 'default..:to'.

4. Inequality is '~='.

5. The comment character is now '#'. This is consistent with more languages, and feels more unix-y with the shebang and all.

Here's how pos currently looks in wart:

  # return the index of x in seq, or nil if not found
  def pos(x seq n)
    default n :to 0
    if seq
      if (car.seq = x)
        n
        (pos x cdr.seq n+1)

To try it out:

  $ git clone http://github.com/akkartik/wart.git
  $ cd wart   # edit; thanks fallintothis!
  $ git checkout f0e3d726eb
  $ ./wart
  wart>

2 points by fallintothis 4685 days ago | link

OK, tried writing a program in wart so I could give my proper impressions / bug reports. I ported https://bitbucket.org/fallintothis/qq to http://pastebin.com/zfZhQ0tS. I think I have it right, except that testing the optimization code-paths is impossible due to certain bugs (see point #9 below). Does it look idiomatic?

1) The slowest part of Wart I'm finding is just starting the REPL. Running programs hasn't been an issue yet. However, it takes a really long time to start up, which wouldn't be nearly as bad if it didn't die at innocuous errors, like unbound symbols. Could more errors be captured by the REPL so I don't have to keep restarting slow things and losing my place?

2) Bug report: loading non-existent filenames causes wart to hang. Luckily, I had an inkling that was the case when I tried loading a path that used a tilde instead of the proper path to my home directory. As it turns out, it will hang just in general.

  wart> (load "loldoesnotexist")<CR>
  <CR>
  <CR><CR><CR> # hanging

3) Your vimrc.vim really isn't the right way to distribute language-specific features. See

  :h syn-files
  :h 44.11
  :h ftplugin

4) Typically, generics dispatch on the type of an argument, not an arbitrary predicate. Your generics are interesting and certainly in the spirit of other generics, but I wonder if their implementation is too broad. Combined with the way that predicates are tested in the reverse order of their declarations, it can create some unintuitive behavior:

  wart> def g(x) (prn "default")
        def g(x) :case (x = 1) (prn "one")
  (object function {body, sig, })
        def g(x) :case (odd? x) (prn "odd")
  (object function {env, body, sig, })
        
  (object function {env, body, sig, })
  wart> g 3
        
  odd
  "odd"
  wart> g 2
        
  default
  "default"
  wart> g 1
        
  odd
  "odd"

5) Why the outlandish indentation here? :) https://github.com/akkartik/wart/blob/master/045generic.wart...

6) Indentation sensitivity makes sense in a "line based" language (e.g., Python), but I find it horribly confusing in an "expression based" language (e.g., Lisps). While your auto-parentheses rule is simple to state, it's an extra mental burden for me to separate where parentheses should appear, when they don't, how to group everything (when I have few/mixed visual cues), etc. That it's optional means I could get by in principle, but I'm compelled to follow conventions, which seem largely aligned on the "no parens" side. It's a muddled blend of both worlds, and I honestly don't like the way it turns out. But we already knew that I'm a stick in the mud about this: http://arclanguage.org/item?id=16769

7) Despite the lack of parentheses, I felt like I was hitting the shift key a lot to write wart code. Chiefly for underscores and capitalization. (Did I mention I like having hyphens for identifiers? :P) By the actual count of shift-characters in qq.wart vs qq.arc, it's actually about the same or a little more in wart. But there are extenuating circumstances: extra comments & functions, plus some longer identifier names I changed when it became a pain to type (e.g., I changed unquote_splicing to unqs). But it's probably more-or-less a wash.

8) The numbered filenames really threw a wrench into my workflow. Without fuzzy matching, it's obnoxious to try to open a file only to find I'm unable to tab-complete the name because I just see the mnemonic part, like "generic.wart" instead of "[arbitrary number]generic.wart". I can't really tell what they're good for other than having nostalgia for BASIC (surely a good model to emulate :P) or overlaying some sort of narrative on your source code. And if there's a narrative, I (evidently) am not going to follow it anyway. I'd rather (at this point) grep for specific things than sit down and read the source code straight through like a book. The narrative could live independently of the filenames, like a separate load-order list. If the need for a specific load-order is technical, why not enforce it on the compiler level with a def-before-use policy? I suppose that breaks the "top down" approach from point "e" in 000preface, though.

9) Long-form quote is broken:

  wart> 'x
        
  x
  wart> '1
        
  1
  wart> (quote x)

  013object.cc:53 can't coerce symbol ' to function
  021eval.cc:59 Not a call: (quote x)
  Perhaps you need to split the line in two.
  dying

  wart> (quote 1)

  013object.cc:53 can't coerce symbol ' to function
  021eval.cc:59 Not a call: (quote 1)
  Perhaps you need to split the line in two.
  dying

Hope these help. :)

-----

1 point by akkartik 4685 days ago | link

Thanks a lot for the comments! In particular you've got me thinking about the numeric filename prefixes. The fatal errors have been getting onerous for me as well, thanks for drawing my conscious attention to them.

Thanks for the crazy load bug. I'll fix it later today.

Yes, loading wart is painfully slow. It's gotten 2x slower since the infix changes. Startup isn't doing anything special, it's just that the wart 'prelude' is a lot more code (since it's most of the language) than we typically type into the repl. I'll work on this some and get back to you.

You're right that relying on load order for generic dispatch makes some cases weird. On the flip side, it's powerful enough to permit multiple dispatch. Like the regex idea in http://arclanguage.org/item?id=16833 that permits either arg to be a regex. When I experimented with generic functions in the past I was always unsure whether to dispatch on the first or last arg. Things like keep like the last arg to be generic, for example. I think the problem isn't predicate dispatch, but checking predicates in load order. Is there a way to make the clauses load-order independent? Or an elegant way to influence load order without moving code around? Hmm, I'm going to think about this.

Regarding 6: I really don't intend to make fully-parenthesized lisp some sort of second class citizen. Feel absolutely free to stick with it for code you write. And I'd love to hear if it makes reading existing code more painful.

Regarding 5: I intended to show that construct_macro_call is an internal detail. I would move it later if I could, but it's used when extending def later in the file. I could a) just unindent it where it is, b) move it after mac and unindent it, c) move it to the bottom of the file, and not use the new mac to extend def (because that wouldn't work yet). Somehow I don't like any of those choices. What do you think?

Regarding 7: It didn't occur to me to watch keypresses on shift. That wasn't a motivator at all for eliminating parens or anything else. (I've got parens swapped with square brackets in my local environment for lisp.) In general, wart is more concerned with reading than with writing. But I'm going to think about it.

Regarding 9: Yes, this is a little weird when you come from other lisps. Wart doesn't consider ' to be an abbreviation for quote. It's the other way around; ' is part of the core syntax (it's processed during tokenization) along with backquote and unquote. The name quote is just for manipulating code. Things like if (car.form = quote), etc. I used to use it in my inliner/partial evaluator, but the inliner is not even in there anymore, and there's really no reason for code-manipulation stuff to be available so early on. I'm just going to take the long forms out. [Edit: done]

Another difference with traditional lisp: 'a is read as (' . a) rather than (' a). Similarly for backquote and unquote. This allows me to sidestep complexities like what happens in (' a b) or ,@,@x that I haven't found a use for yet.

-----

2 points by fallintothis 4684 days ago | link

Is there a way to make the clauses load-order independent?

The Magpie stuff Pauan linked is really interesting. From my layman's intuition, though, it still only seems to be able to linearize clauses because they're limited. As long as we know what sort of clauses we're working with, we can define a partial order on them. The same also happens to hold true for "typical" generic methods that dispatch based on class, because classes readily form a poset (using the subclass relationship). I have a feeling that the answer would be somewhere around "intractable" for arbitrary code, especially if the code didn't have formal semantics we could reason about. I mean, arbitrary code might never halt, so it'd be impossible to linearize those cases to the "end of the line" just because (in general) the Halting Problem is undecidable. But hey, I'm not a doctor.

At least the order that definitions appear in the code easily linearizes the clauses, despite the elephant-in-the-room of loading multiple files.

Regarding 6: I really don't intend to make fully-parenthesized lisp some sort of second class citizen.

That's good. I didn't mean to mischaracterize. I think that might inadvertently be the impression (or even outcome?) from so much as having a no-parens rule: the thought goes from "let's do Lisp with fewer parentheses" to "let's do Lisp with no parentheses", slippery slope fallacy though that is. Plus, at a glance, their elision seems common in your source.

And I'd love to hear if it makes reading existing code more painful.

I think that's a big reason I don't like removing them. There are fewer visual cues for grouping expressions. Not as big a deal in something where indentation is all you have, like Python. A little weirder when my head's trying to fill in where parentheses should rightly be. Blah blah, curmudgeon, blah.

Regarding 5: What do you think?

It only has the one call site; could you just inline it using afn?

Regarding 7: It didn't occur to me to watch keypresses on shift.

It's more of a heuristic than a solid metric, but I definitely notice it. E.g., I get annoyed working in camelCase languages.

Regarding 9: I'm just going to take the long forms out.

So, how do I manipulate quotes in macroexpansions, as needed in qq.wart?

-----

2 points by akkartik 4673 days ago | link

"how do I manipulate quotes in macroexpansions, as needed in qq.wart?"

Yes, you're right that I need to bring back my definitions of quote, etc for qq.wart. But there was more to it than that. Here is -- I think -- a relatively faithful port of qq-expand including the optimizer: http://gist.github.com/3971932#file_080qq.wart. I say 'relatively' because wart doesn't support multiple args to quote/unquote/quasiquote, so a lot of your hardest tests become moot. Also, the backquote doesn't expand to a call to quasiquote in wart.

Sorry this took so long. Even when you provided such thorough tests it was hard to make myself focus on this exercise; there's always alternatives with more instant gratification. But in the end it was a fun exercise in recreating the 'theory' of a (tiny) codebase (http://alistair.cockburn.us/ASD+book+extract%3A+%22Naur,+Ehn...)

---

Rather than start with your port I went back to your arc sources and tried to track my process in an effort to generalize lessons. I was able to split your recursive functions into clauses without much effort[1]. For example:

  ; Arc
  (def qq-cons (expr1 expr2)
    ; assume expr2 is non-splicing
    (let operator (if (splicing expr1) 'dotted-list 'cons)
      (if (no optimize-cons*)
           (list operator expr1 expr2)
          (and (~splicing expr1) (literal expr1) (no expr2))
           (list 'quote (list (eval expr1)))
          (no expr2)
           (list 'list expr1)
          (atom expr2)
           (list operator expr1 expr2)
          (caris expr2 'list)
           (dotted-list 'list expr1 (cdr expr2))
          (and (quoted-non-splice expr1) (quoted-non-splice expr2))
           (list 'quote (cons (cadr expr1) (cadr expr2)))
          (list operator expr1 expr2))))

  # Wart; an early incorrect version
  def qq_cons(expr1 expr2)
    # assume expr2 is non-splicing
    let operator (if splicing?.expr1 'dotted_list 'cons)
      (list operator expr1 expr2)

  def qq_cons(expr1 expr2) :case (and Optimize_cons quoted_non_splice?.expr1
                                                    quoted_non_splice?.expr2)
    (list quote (list cdr.expr1 cdr.expr2))

  def qq_cons(expr1 expr2) :case (and Optimize_cons (carif.expr2 = 'list))
    (dotted_list 'list expr1 cdr.expr2)

  def qq_cons(expr1 expr2) :case (and Optimize_cons atom?.expr2)
    let operator (if splicing?.expr1 'dotted_list 'cons)
      (list operator expr1 expr2)

  def qq_cons(expr1 expr2) :case (and Optimize_cons no.expr2)
    (list 'list expr1)

  def qq_cons(expr1 expr2) :case (and Optimize_cons ~splicing?.expr1
                                                    literal?.expr1
                                                    no.expr2)
    (list quote (list eval.expr1))

The key was to peel the cases off in reverse order.

As in the past, wart is more verbose than arc, but in exchange you get to move the cases around. For example, all but the first clause can be in a separate file so you don't care about them when Optimize_cons is unset.

---

But there was more to this than a simple transliterated port, because quote/unquote/backquote were represented differently in wart than other lisps. I had to track down and update all the places in your version that made this foundational assumption.

Since the backquote is baked into wart's reader rather than expanding to quasiquote like in other lisps, I traced through all top-level calls to qq-expand as a baseline to compare against:

  ; change in original arc version
  (mac quasiquote (expr)
    (ret ans qq-expand.expr
      (ero expr " => " ans)))

Once I had this output in hand I could start porting your tests. I started with a pass at just visually catching as many cases of treating the cdr of quote, quasiquote, etc. as a list as I could[2]; you might enjoy seeing what I was missing at that point, the 5 tokens that took a week of debugging :)

Two forms turned out to be quite useful in my debugging:

  # http://arclanguage.org/item?id=11140
  after_fn qq_expand_list(expr)
    (prn "qq_expand_list " expr "
  => " result)
  # ..and so on for qq_transform, qq_cons, qq_append.

  # Easily turn debug prints on and off.
  def xprn args
    nil

---

References are to snapshots of the same gist:

[1] https://gist.github.com/3971932/f85f4da34d79d7d6cb1c9b01ce60.... Try diffing this version against your port.

[2] https://gist.github.com/3971932/72d159184afaa68e42920801b75e...

-----

2 points by Pauan 4684 days ago | link

"As long as we know what sort of clauses we're working with, we can define a partial order on them."

You could require that the user provide an order when defining a new pattern.

---

My idea with Nulan is that functions shouldn't change: they should be static algorithms. Thus, Nulan doesn't allow you to overload functions like you can in Arc or Wart. Instead, if you wish to overload a function, you define a new gensym which can be stuck into any object to overload it:

  $def %foo (uniq)

  $def foo
    { %foo F } -> ...
    ...        -> ...

Now, if you call the "foo" function with an object that has a %foo key, it'll call the first clause. Otherwise it'll call the second clause.

This gives complete control to the function to decide whether it can be overloaded or not, and exactly how the overloading happens. I think this is the most flexible approach I've ever seen.

As a more concrete example, objects can have a %len key to define a custom (len ...) behavior, an %eval key for custom evaluation behavior, %call for custom call behavior, %pattern-match for custom pattern match behavior, etc.

And because of the way pattern matching works, you can even match against multiple gensyms at once:

  $def %foo (uniq)
  $def %bar (uniq)

  $def foo
    { %foo F
      %bar G } -> ...
    { %foo F } -> ...
    { %bar G } -> ...
    ...        -> ...

The above defines different behavior when an object has both the %foo and %bar keys, and also when it has just %foo or just %bar.

---

"So, how do I manipulate quotes in macroexpansions, as needed in qq.wart?"

I think hardcoding symbols (like in quasiquote) is a terrible idea, so I would just not define quasiquote. Instead, I'd use Arc/Nu quasisyntax, which doesn't use quote at all:

  `(foo bar qux)   -> (list foo bar qux)
  `(foo (bar) qux) -> (list foo (list bar) qux)

If you want quote, just use unquote:

  `(foo ,'bar qux) -> (list foo (quote bar) qux)

Also, I wouldn't define quasiquote as a macro, I'd have it be a part of the reader.

-----

3 points by fallintothis 4683 days ago | link

You could require that the user provide an order when defining a new pattern.

Interesting idea, but the proper mechanism for it eludes me.

- You could have the user give a precedence level, since integers are totally ordered, but that's a terrible idea---magic constants in every generic declaration. Any more restricted domain (e.g., specifying high, medium, or low precedence) gets a little vague.

- You could have a simpler mechanism where each new rule is added to a known, fixed location in the linearization. So basically the order is a double-ended queue and generic declarations have keywords for "push me to the front" or "push me to the back". But that's probably a bit too basic, and still relies on declaration order (in fact, complicating it).

- You could have the user specify a previously-declared chunk of code to execute first, like

  def g(x)
    (prn "default")

  def g(x) :case (odd? x)
    (prn "odd")

  def g(x) :case (x = 1) :before (odd? x)
    (prn "one")

But that's too tightly-coupled and requires code duplication. Plus, if you're already duplicating the code that's close by a definition, why not instead reorder the definitions so a simpler order-they're-read mechanism works?

- I'm really scraping the bottom of the barrel now...Have the user supply their own predicate that determines which generic to apply first? Which means the user basically has to implement their own customized partial-order. I really don't see that happening.

When using generics myself, I don't want to think about these sorts of things too hard. That's what I like about class-based generic dispatch: I just have to think about the function one class at a time, and the right function will be applied to the right instances using their simple, implicit order. For general predicates, instead of hard-coding an order I'd rather use a big if/case/pattern-match that I at least know is ordered the way I wrote it.

-----

4 points by rocketnia 4683 days ago | link

"You could have the user specify a previously-declared chunk of code to execute first"

I think this is similar to Inform 7's approach, where rules can be referred to by name. Generally, every rule in a library has a name, but individual stories omit them unless necessary.

---

"- I'm really scraping the bottom of the barrel now...Have the user supply their own predicate that determines which generic to apply first? Which means the user basically has to implement their own customized partial-order. I really don't see that happening."

That's the approach I took in Lathe a long time ago.[1] Under my approach, the partial order is itself extensible, and it even determines the ordering of its own extensions. I quite like the expressiveness of this approach, but this expressiveness is barely meaningful for in-the-large programming: In order for the precedence rule programmer to have enough information to make judgments by, they need to require all other programmers to annotate their extensions with appropriate metadata. That or they need to maintain their own up-to-date metadata describing common libraries! Instead of programming language battles, we get framework battles full of boilerplate.

Since finishing that system, I've been pondering the "framework" conventions I'd like to use myself, and I've been trying to fold those decisions into the core of a language design.

Whereas Magpie and many other multimethod systems make a big deal about subclasses, I try to avoid any kind of programming where almost all X's do Xfoo, but some X's are also Z's so they do Zfoo instead. By the same principle, I'd avoid the [odd? _] vs. [= 1 _] case altogether. If fact, as long as I succeed in formulating in the non-overlapping designs I like, I avoid the need for precedence rules altogether... but it's still an option I keep in mind just in case.

Currently, I favor supporting extensibility by way of sealer/unsealer pairs and first-class (multi)sets.

Sealer/unsealer pairs work when each extension is an all new range of possibilities. In Arc, I'd just represent these as tagged values, and I wouldn't bother to pursue the secure encapsulation associated with sealer/unsealer pairs. In linear logic, the additive operators describe this kind of combination.

First-class (multi)sets work for when each extension is a new participant in a single computation. In Arc, a list is a good representation for this. In linear logic, the multiplicative operators describe this kind of combination.

When precedence is necessary, it can be modeled explicitly as a transformation of a (multi)set into a more structured model. I think any programmer who makes an extensible tool should carefully choose a transformation that makes sense for their own purposes--whether it's really "precedence" or something else.

---

"For general predicates, instead of hard-coding an order I'd rather use a big if/case/pattern-match that I at least know is ordered the way I wrote it."

That's my take on it, too. My examples of multi-rule functions have included factorial and exponentiation-by-squaring, but those are unrealistic. There's no reason for an extension to come interfere in that process, so it might as well be the body of a single declaration.

When I was using predicate dispatch more often, I often discovered that I could satisfy most of my use cases with just two extension annotations:

- This is the real meaning of the function, and all the other cases are just for auto-coercion of parameters into the expected format. Use this extension first. (This came up when implementing 'iso in terms of 'is, and it also makes sense for coercion functions themselves, such as 'testify.)

- This extension is the last resort. Use it last. (Maybe it throws an informative error, or maybe it returns false when everything else returns true. This also works for all-accepting coercion functions like 'testify, but I'm suspicious of this kind of design. A call to 'testify always does Xfoo, except when it does Zfoo.)

---

[1] Unfortunately, right now arc-Lathe's version of the precedence system isn't something I maintain, and Lathe.js's version has no easy-looking examples because I no longer store extensions in mutable global state. Lathe.js's "hackable" utilities are now higher-order utilities that take all their once-global dependencies as explicit parameters.

-----

1 point by Pauan 4683 days ago | link

"they need to require all other programmers to annotate their extensions with appropriate metadata."

If Nulan had multimethods, I'd probably just require programmers to add additional metadata to the object in addition to the %pattern-match gensym. But Nulan doesn't have linearization or multimethods, so I avoid that whole mess!

---

"Whereas Magpie and many other multimethod systems make a big deal about subclasses"

And I have to agree: objects are amazing, but classes and subclasses are terrible. In fact, I'm of the opinion that probably all hierarchial relationships are too restrictive. Hence Nulan's object system which is completely flat, but has pretty much all the benefits of classes/prototypes.

Basically, the behavior that is common to all objects (of a particular kind) is put into a function, and behavior that is specific to a particular object is put into the object itself. And immutability gives you wonderfully clean semantics for prototype/class behavior, without all the complication and baggage.

-----

2 points by Pauan 4683 days ago | link

Well, I was thinking in terms of Nulan, where you define new patterns with a custom "%pattern-match" property. It wouldn't be hard to add in a "%pattern-match-order" property or such, though I admit I haven't really thought through how such a property would work...

Obviously that kind of system wouldn't work in wart where predicate dispatch is based on arbitrary function calls. Hence my idea in Nulan of not allowing function mutation. I would write the above code like this:

  $def g
    1        -> (prn "one")
    (odd? X) -> (prn "odd")
    ~        -> (prn "default")

That is, it first tries 1, then (odd? X), then the ~ wildcard, which matches anything. This is equivalent to the following Arc code:

  (def g (x)
    (if (is x 1)
          (prn "one")
        (odd? x)
          (prn "odd")
        (prn "default")))

If you want to make a function extensible, you would use a property on an object, like I described here: http://arclanguage.org/item?id=16851

-----

1 point by akkartik 4683 days ago | link

Yeah, I went through a similar thought process.

One possibility is to pin a clause at the front:

  def g(x) :priority-case (x = 1)
    (prn "one")

It generalizes to your high/medium/low categories, but perhaps it's useful if you permit exactly one priority clause (newer ones overwrite it).

-----

1 point by akkartik 4684 days ago | link

I think I'm getting lost in the $def foos and the %foos. Can you show how you would make say len aware of queues?

-----

2 points by Pauan 4684 days ago | link

I could switch to Arc syntax if you like. Looking at the Arc implementation of queues...

  (def queue () (list nil nil 0))
  (def qlen (q) (q 2))

Well, Arc uses mutation, and I wouldn't, but here's how you would define it in Nulan:

  $def queue; ->
    [[] [] 0
      @{ %len: X -> (X 2) }]

And here's the same thing, but with Arc syntax[1]:

  (def queue ()
    (array nil nil 0
      @(dict %len (fn (x) (x 2)))))

The above is basically exactly the same as Arc, except it uses an array rather than a cons, and it has a custom %len property. If you don't want the different parts of the queue to be public, you could use gensyms like this:

  (w/uniq (len left right)
    (def queue ()
      (dict %len  (fn (x) (x len))
            len   0
            left  nil
            right nil)))

Rather than returning an array of 3 elements, it returns an object, which has a "len", "left", and "right" property.

In either case, the object that is returned has a %len property, which is a function that computes the length. The "len" function would then be defined like this:

  (def len (x)
    ((x %len) x))

That is, it first looks up the %len property in the object, and then calls it with itself as the first argument.

---

* [1]: You might be wondering what's going on here... well, in Nulan, a "list" is just like a JavaScript array: it's an ordinary object which uses numeric keys and has a custom %len property. In particular, that means that all the tools that work on objects work on arrays too.

The @ syntax is used for splicing. So, for instance, to merge 3 objects together, you'd say { @foo @bar @qux }. And because arrays are objects too, you can splice them.

So what we're doing is, we first create an array of 3 elements, and we then splice in a new object. This new object has a custom %len property, which overrides the %len property of the array.

Alternatively, we could have done it like this:

  (set (array nil nil 0) %len ...)

But I think it's better to use the splicing notation, especially when you use [] for arrays and {} for objects. By the way, these two are actually equivalent:

  { @[[] [] 0]
    %len ... }

  [[] [] 0
    @{ %len ... }]

In the first case we take an empty object, splice in an array, and then assign a %len property. In the second case, we take an array and splice in an object that has a %len property.

In either case, the object that is returned has the same keys and values.

-----

2 points by Pauan 4684 days ago | link

Sorry to hijack your wart thread, but... I just realized something. I wanted to allow for iterating over the keys of an object, but that caused issues, as I discussed with rocketnia earlier (http://arclanguage.org/item?id=16823)

Anyways, JavaScript gets around this problem by allowing you to define a property on an object that is "non-enumerable". But any kind of system that I add in that lets you define "non-enumerable" properties is going to be big and complicated.

Instead, I found a very simple way to create enumerable objects in a way that is completely customizable, and doesn't even need to be built-in:

  (= %keys (uniq))
  
  (def iterable-object ()
    (let keys []
      { %set  (fn (x k v)
                (pushnew k keys))
        %rem  (fn (x k)
                (pull k v))
        %keys (fn (x) keys) }))

Here's what's going on. Firstly, we got the %keys gensym, which is supposed to be a function that returns a list of keys in the object.

The function "iterable-object" returns a new object that has custom %set, %rem, and %keys properties:

%set is called when assigning a property to the object. It takes the key and pushes it into the "keys" array.

%rem is called when deleting a property from the object. It takes the key and removes it from the "keys" array.

%keys just returns the "keys" array.

Now, these objects are capable of being enumerated, which means they could be passed to "map", for instance. But here's the fun part: you can completely control which properties are enumerated and which aren't.

In this case, the object will only enumerate properties that are added or removed after the object is created. So any properties defined previously are still hidden. At least, depending on how I implement splicing and %set...

---

What the above basically means is... "every computer problem can be solved by the addition of more properties on an object" :P

-----

1 point by Pauan 4683 days ago | link

After fidgeting with the syntax, here's what I got:

  $def iterable-object ->
    [ %set  -> x k o n
              [ @x %keys -> ~ (pushnew (keys x) k) ]
      %rem  -> x k o
              [ @x %keys -> ~ (pull (keys x) k) ]
      %keys -> ~ {} ]

I actually realized that swapping [] and {} is way better, meaning that [ foo bar ] is (dict foo bar) and { foo bar } is (array foo bar). There's two reasons for this:

1) {} is closer to () than [] is, which is really nice in macros:

  $mac $accessor -> n v
    $uniq %a
      {$def n -> %a
        {{%a v} %a}}

2) I found that {} didn't stand out enough, but [] does.

---

By the way, in case you're curious about the Nulan syntax... $ is prefixed to vau/macros, which is why it's "$def" rather than "def"

-> is the function macro, which means (-> x y z ...) is equivalent to (fn (x y z) ...) in Arc

~ is the "wildcard syntax" which matches anything, just like _ in Haskell

[ foo bar ] translates to (dict foo bar), and { 1 2 3 } translates to (array 1 2 3)

@ is for splicing. Which means that [ @foo @bar @qux ] merges three objects into one. If you want to update an object with new properties, it's idiomatic to say [ @foo ... ]

Gensyms are prefixed with %

-----

1 point by Pauan 4683 days ago | link

Which, if translated into JavaScript, would look something like this...

  var iterableObject = function () {
    var a = {};
    a.set = function (x, k, o, n) {
      var a = Object.create(x);
      a.keys = function () {
        return pushnew(keys(x), k)
      }
    };
    a.rem = function (x, k, o) {
      var a = Object.create(x);
      a.keys = function () {
        return pull(keys(x), k)
      }
    };
    a.keys = function () {
      return []
    };
    return a
  }

-----

2 points by rocketnia 4683 days ago | link

"Instead, I found a very simple way to create enumerable objects in a way that is completely customizable, and doesn't even need to be built-in"

That's my approach too. Or it would be, if I actually got around to building a language out of Cairntaker. :)

-----

2 points by akkartik 4683 days ago | link

:) All very interesting, thanks. The len example was very clear. Hijack away!

-----

3 points by Pauan 4685 days ago | link

"Is there a way to make the clauses load-order independent?"

Magpie has a precedence system with its multimethods:

http://magpie-lang.org/multimethods.html#linearization

-----

2 points by akkartik 4685 days ago | link

I've fixed the load bug.

I also managed to return performance to pre-infix levels by the simple expedient of turning on compiler optimizations; on my machine it yields a 2x speedup. Startup time is now 2.6s, and tests take 35s to run.

Finally, the interpreter no longer dies when it encounters unbound vars, or when trying to invoke a non-function.

Update 9 hours later: After a little more optimization, wart will now start up in less than a second if the binary is up to date. The first run after downloading (build+run time) takes 20s. Tests run in 10s (excluding build time).

I tried to be clever and assume frequent building if running tests, but it turns out wart now does enough work that optimizations always improve build+tests time.

http://github.com/akkartik/wart/compare/5707dcc022...904be9f...

-----

1 point by akkartik 4685 days ago | link

The port of quasiquote looks great!

1) I would use alias to ensure that atom? automatically picks up extensions to cons?, etc.

2) I've been experimenting with keeping prefix <- at the top-level, just to make globals salient. But this 'idiom' is less than 48 hours old; what do you think of it?

  <- carif (cons? & car)

3) I tend to often wrap stuff in parens even when I don't have to. I only skip parens for def/mac/if/while, etc., and for imperatives like err and prn. So:

  ern "The syntax `,@" expr " is invalid"

But:

  (and list?.xs ~dotted?.xs)
  (append @cdr.args)
  (cons car.a (append cdr.a @cdr.args))

..etc.

4) This is totally fine:

  (or (atom? cdr.x) (dotted? cdr.x))

But wart's precedence rules support:

  (or atom?:cdr.x dotted?:cdr.x)

Does that seem too ugly?

5) I really like to use :else in multi-branch if.

6) There's a gotcha with multi-branch ifs: beware if you rely on paren insertion in test expressions:

  if nil
      car '(34 35)
     no nil
      cdr '(34 35)

You might expect this to return (35), but it doesn't because it's interpreted as:

  (if nil
      (car '(34 35))
      (no nil (cdr '(34 35))))

I've been guarding against this by always adding parens for things that should act as expressions, and by always explicitly adding parens in multi-branch ifs:

  (if
    no.args
      nil
    ~cdr.args
      car.args
    :else
      (let a car.args
        (if no.a
          (append @cdr.args)
          (cons car.a (append cdr.a @cdr.args)))))

I think this adds to your case against indent-sensitivity :) Multi-branch if is the one place I find myself wanting more syntax. Curlies for all their flaws really help to separate test expressions from actions in C or Java. I experimented with a 'comment token' in the past (http://arclanguage.org/item?id=16495) but that feature was lost in the move to infix.

In any case, I hope I've convinced you that I don't consider parens to be second-class. I still reach for them a lot: http://github.com/akkartik/wart/blob/c1ca2f9749/004optional_...

-----

2 points by fallintothis 4684 days ago | link

1) Interesting! I assumed alias was an alias for <-. Clever, making it define a macro.

2) I liked being able to line up the infix <- for different-length variable names assigned in succession:

  Optimize_cons   <- nil
  Optimize_append <- nil

For carif, I was using it more like an "anonymous" def where I didn't have to name the arguments. Perhaps this purpose could use its own top-level form?

By the way, I like the Global naming scheme, which I picked up from Coercions; it's a good, surprisingly readable alternative to mashing shift for GLOBAL names.

3) Ah, ern is what I was looking for! Should've looked instead of guessing names from Arc, haha.

4) A bit much for my tastes, but then so is infix. ;)

5) I noticed and tried to use it throughout. Still missed a few instances, I gather! :)

6) Yeah, I ran into that a bit. Might've missed some cases, though.

-----

1 point by akkartik 4672 days ago | link

Your vimrc.vim really isn't the right way to distribute language-specific features.

Yes. It was laziness more than ignorance. I don't fully grok the vimscript mental model, and there's always the fear that somebody else's settings might interact poorly with these. In my experience, inlining settings in the vimrc has been the most simple and reliable way to ensure they work out of the box and don't get accidentally overridden by something else. At the cost of being a bad citizen :)

But I've taken the trouble to think things through now, and this should follow best practices: http://github.com/akkartik/wart/commit/deabc891e8#diff-1. Now that it's done, it's nice to see all the rules laid out cohesively rather than as tweaks on the much more complex rules for lisp.

-----

1 point by akkartik 4684 days ago | link

I've been thinking harder about the numbered filenames. I had three goals in doing them.

a) To be able to add new files, split files, merge files, more easily. To not need to hack on the build system as much.

b) To make it dead obvious where a new user should start reading. But this only needs the early files to be numbered; past a point the precise number isn't important.

c) To not need to mess with include files as much. This overlaps with a) above.

Still thinking. I'm mulling some sort of symlink-based approach to make navigation more convenient.. Or maybe I can stop numbering past a point and ensure that the remaining files are loadable in any order. That might interact with your point 4 about generics, though.

-----

2 points by Pauan 4683 days ago | link

For what it's worth, Arc/Nu works like this. The "core" files are numbered:

  01 nu.rkt
  02 arc.arc
  03 utils.arc
  04 parameters.arc
  05 paths.arc
  06 import.arc
  07 repl.arc

Excluding parts of "03 utils.arc", the above is roughly the minimum needed to define Arc, the "import" macro, and the REPL.

They are loaded in numeric order, but that's controlled by the "arc" executable, so there's nothing stopping you from changing the order, or adding new non-numeric files[1], etc.

Everything else is put into either the "app" or "lib" folder, and none of the files in either folder are numbered.

This means the numbers are purely for documentation purposes, to help people see which parts go where. In particular, changing the numbers doesn't change the order in which things are loaded: you need to make the changes in the "arc" executable.

I think this is a good blend between not being too cumbersome, improving the visibility of order (which matters), and allowing fine-grained control of order.

---

* [1]: As an example of that, the "arc" executable also loads "lib/strings.arc" and "lib/re.arc" in addition to the above.

-----

1 point by akkartik 4683 days ago | link

Interesting. It doesn't address fallintothis's pain point though -- he still needs to remember the numeric prefix to navigate between the files.

As an experiment I've added a script that creates more mnemonic symlinks to all the files in the repo:

  $ git clone http://github.com/akkartik/wart.git
  $ cd wart
  $ git checkout 8332909aec  # Unnecessary except for visitors in the far future
  $ ./denum                  # New helper
  $ vim -O assign.wart mutate.wart  # etc.

I'm going to play with keeping these symlinks around and see how much I use them.

-----

2 points by fallintothis 4683 days ago | link

Addresses my chief complaint, at any rate. :) Couple of things:

- denum doesn't seem to work with plain /bin/sh,

  ./denum: 9: ./denum: [[: not found

though bash works.

- Slight nitpick over your perl regex:

  $ echo '(prn "hail satan")' > 666thedarklordcometh666.wart
  $ ls | grep thedarklordcometh
  666thedarklordcometh666.wart
  $ ./denum
  $ ls | grep thedarklordcometh
  666thedarklordcometh666.wart
  thedarklordcometh.wart

You probably want

  s/^[0-9]*//

- You could move the

  find . maxdepth 1 -type l -exec rm {} \;

to another script, like renum (not a 100% correct name, but close enough).

-----

2 points by fallintothis 4688 days ago | link

  $ git clone http://github.com/akkartik/wart.git
  $ cd wart # that you forgot :P
  $ git checkout f0e3d726eb
  $ ./wart
  wart>

Well, with instructions that simple, I suppose I really should actually try wart out before commenting. About time, right?

First run, but ./wart is shebanged (shebung? ha) with

  #!/usr/bin/env zsh

Mac user, I take it? :) I don't have zsh, but just sh works, for what it's worth.

With that fixed, I get to the REPL readily enough.

  7083 4912909 wart> (prn "hello")<CR>
  <CR>
  hello
  "hello"

Suggestion: wrap the memory allocation output of 002main.cc in some sort of debugging guard; maybe your makefile could have an ifdef DEBUG that would set some CFLAGS.

  7083 4913331 wart> (prn "world") # this is a comment!<CR>
  <CR>
  <CR>
  <CR>
  what<CR>
  7083 4913753 wart> (prn "what was that?")<CR>
  020lookup.cc:26 No binding for at
  Was it defined using indentation? Wart ignores indentation inside parens.

Well, then...the misadventures of rolling a parser by hand? Or maybe just a bug in the REPL, because loading doesn't seem to barf. Also,

  $ ./wart
  make: `wart_bin' is up to date.
  7083 4912909 wart> (prn "Why two Enters, by the way?")<CR>
  (prn "It's not like this sexp is evaluated.")<CR>
  Why two Enters, by the way?
  "Why two Enters, by the way?"
  7083 4913331 wart> (prn "Then there are errors down here.")<CR>
  004parenthesize.cc:36 Unbalanced )
  dying

I probably don't understand the significant-whitespace deal. On that note: if indentation's going to be important, I'd take a leaf from Python's REPL and have some leader characters on line continuations the same width as the initial prompt, so everything lines up. Something like

  wart> # return the index of x in seq, or nil if not found
  >>>>> def pos(x seq n)
  >>>>>   default n :to 0
  >>>>>   if seq
  >>>>>     if (car.seq = x)
  >>>>>       n
  >>>>>       (pos x cdr.seq n+1)

In all, feedback from my first foray: the REPL needs some work; silencing make after the first run, hiding debugging figures (unless I make DEBUG), adding a PS2, and figuring out why the parser breaks in odd cases would go a long way toward the user experience. :)

It's not much feedback, but it's all I've got at the moment. On topic: I have nothing against the syntax you've chosen for your update, in principle. Did the double-character ambiguity get resolved? If so, you could forgo <- in favor of the typical = & ==. Maybe Haskell's /= is more appealing than ~=, unless you use that for destructive division. # frees up ; for some potentially interesting infix uses (invokes imagery of unquoting, but other symbols kind of have that covered...).

-----

1 point by akkartik 4688 days ago | link

Thanks for trying it out! I'm embarrassed that your experience was so terrible. Here I am, using it everyday..

I'll get on fixing all your bug reports. In the meantime I've first added them to manual_tests: http://github.com/akkartik/wart/commit/d8336d2a22; http://github.com/akkartik/wart/commit/ac4462c94d

(Edit a day later: all these bugs should now be fixed. http://github.com/akkartik/wart/commit/e062e2a407)

---

I used to be on a mac, yes, but I don't think that's why I use zsh :) It's just an apt-get away, and it spares me having to worry about bash kludges like "$@" and whatnot. But you're right, I should just use sh to minimize friction. It's not like it's a complex shell script.

-----

2 points by fallintothis 4686 days ago | link

Random thought on my way home today: I get the name ~= for inequality, but it's awfully similar to Perl's =~ for regular expression matching. I don't know if that's a deal-breaker, though. My most-used languages don't actually have the =~ operator, but I do get a little "syntax envy" for it. So I think it's cool that there promises to be support for an infix =~, regardless.

(I started this post thinking that the RE-matching operator was ~=, but then I looked it up. Everything went better than expected.)

-----

1 point by akkartik 4686 days ago | link

Yeah, I considered the issue of perl[1]. I might use =~, but I'll probably not use =~ for regex match if I use ~= for inequality. Different things should look different, and all that. The simple option would be to just use match like in javascript. But maybe something else will present itself. Maybe just define equality between regex and string as match?

  let r (regex "a.*")
    if ("abc" = r)
      ..

[1] See the commit message at http://github.com/akkartik/wart/commit/f0e3d726eb

-----

2 points by Pauan 4686 days ago | link

I like that idea. You're testing whether the regexp is "equal to" the string. But I'd put the regexp first, like (r = "abc")

-----

2 points by Pauan 4688 days ago | link

Huh, I hadn't thought about using <- for assignment...

  foo <- X Y Z -> X + Y + Z

I'd still like or<- though, I think it stands out more than "default". But I don't think that would work with your syntax system.

-----

1 point by akkartik 4688 days ago | link

Yeah. I considered |<-, but it seemed too cryptic. It was only when I came up with default that I decided to take the leap on it all.

-----

2 points by Pauan 4688 days ago | link

  n |<- 0

I actually think that looks fine, to be honest.

-----