Arc Forumnew | comments | leaders | submitlogin
Building generics in arc
3 points by akkartik 5091 days ago | 6 comments
Here's how iso looked in arc3:

  (def iso (x y)
    (or (is x y)
        (and (acons x)
             (acons y)
             (iso (car x) (car y))
             (iso (cdr x) (cdr y)))))
A few months ago I extended it to support tables:

  (def iso (x y)
    (or (is x y)
        (and (acons x)
             (acons y)
             (iso (car x) (car y))
             (iso (cdr x) (cdr y)))
        (and (isa x 'table)
             (isa y 'table)
             (iso (len:keys x) (len:keys y))
             (all
               (fn(pair)
                 (let (k v) pair
                   (iso y.k v)))
               tablist.x))))
But that's hacky. The next time I wanted to add a new type I decided to face arc's hacky lack of primitive support for queues, etc. What I want is to break that body for iso up between a core in defgeneric, and a type-specific body in defmethod that is stored in a vtable. Something like this:

  (= vtables* (table))
  (def iso(x y)
    (or (is x y)
        (and (is type.x type.y)
             (if acons.x
               (and (iso car.x car.y)
                    (iso cdr.x cdr.y))
               (aif (vtables*!iso type.x)
                  (it x y)
                  (let p (pickles* type.x)
                    (iso p.x p.y))))))
Now we can extend a generic either with a method in vtables, or with a transformer function in pickles.

But how do we divide up that body between what's common across all generic functions, and what's unique to iso's body?

Ah, reordering helps:

  (def iso(x y)
    (aif (vtables*!iso type.x)
      (it x y)
      (aif (pickles* type.x)
        (iso it.x it.y)
        (or (is x y)
            (and acons.x
                 acons.y
                 (iso car.x car.y)
                 (iso cdr.x cdr.y))))))
Now I can write the iso-specific parts in the lower half as:

  (defgeneric iso(x y)
    (or (is x y)
        (and (acons x)
             (acons y)
             (iso car.x car.y)
             (iso cdr.x cdr.y))))
Which is identical to the original iso!

Now the common part is at the top. So far it's using iso's args, and different generics could have different args. Make it independent, dispatching on just the type of the first arg:

  (def iso args
    (aif (vtables*!iso (type car.args))
      (apply it args)
      (aif (pickles* (type car.args)
        (apply iso (map it args))
        (let (x y) args
          (or (is x y)
              (and acons.x
                   acons.y
                   (iso car.x car.y)
                   (iso cdr.x cdr.y))))))
Now we're ready to implement defgeneric using vtables.

  (= vtables* (table))
  (mac defgeneric(name args . body)
    `(def ,name allargs
      (aif (aand (vtables* ',name) (it (type car.allargs)))
        (apply it allargs)
        (aif (pickles* (type car.allargs))
          (apply ,name (map it allargs))
          (let ,args allargs
            ,@body)))))
Adding to the vtable for tables:

  (= vtables*!iso!table
     (fn(x y)
       (and (isa x 'table)
            (isa y 'table)
            (is (len keys.x) (len keys.y))
            (all
              (fn(pair)
                (let (k v) pair
                  (iso y.k v)))
              tablist.x))))
The key assumption is that each defmethod can handle args of any type without raising errors or crashing. Hence the redundant check that x is a table.

With defmethod sugar that becomes:

  (defmethod iso table (x y)
    (and (isa x 'table)
         (isa y 'table)
         (is (len keys.x) (len keys.y))
         (all
           (fn(pair)
             (let (k v) pair
               (iso y.k v)))
           tablist.x)))
So defmethod translates to:

  (mac defmethod(name type args . body)
    `(= ((vtables* ',name) ',type)
        (fn ,args
          ,@body)))
Let's now try a second example: iso for queues, this time using pickle. Let's assume queues are now tagged as 'queue.

  (= pickles*!queue
     qlist)
Translated:

  (pickle queue qlist)
So the transform is:

  (mac pickle(type expr)
    `(= (pickles* ',type) ,expr))
Cool!

(One bug is that vtables for iso is not initialized anyplace. Exercise for the reader.)

---

How does this defgeneric compare with aw's extend (http://awwx.ws/extend0)?

Pro: defgeneric scales up to large numbers of extensions because of the hash-table lookup of vtables. extend however must string conditionals linearly, which is less efficient.

Con: extend can distinguish arbitrary properties because test is passed in as a parameter rather than assumed to be type. We can fix this if we need to, with a programmable function that generates a (potentially non-type) symbol.



2 points by rocketnia 5090 days ago | link

For some friendly comparison, here's a Lathe approach, as well as a three-way pro/con list.

(As I was just about to post this, I ran into a bug in Lathe itself, and Jarc's debugger helped me track it down. Go Jarc!)

  ; We're doing this REPL-style. If this were a library, we'd wrap the
  ; whole file in (packed:using-rels-as ...) and qualify all the names
  ; we define. In an application file, either approach works, and the
  ; "packed:" and name-qualification can be dropped.
  (use-rels-as mr (+ lathe-dir* "multival/multirule.arc")
               oc (+ lathe-dir* "multival/order-contribs.arc"))
  
  ; These are utility functions that we'll use to make multival
  ; contribs with certain labels have high or low precedence.
  (let get-predicates (fn (name labels)
                        (map (fn (label)
                               [and (is !name._ name)
                                    (is !label._ label)])
                             labels))
    (def prec-labels-first (multival-name . label-names)
      (let predicates (do.get-predicates multival-name label-names)
        (apply oc.prec predicates)))
    (def prec-labels-last (multival-name . label-names)
      (let predicates (do.get-predicates multival-name label-names)
        (apply oc.prec [~some ._ predicates] predicates)))
    )
  
  ; Note that 'type is the parameter here, and 'default is a label
  ; uniquely determining this contrib among all pickle-for contribs.
  (mr:rule pickle-for type default
    nil)
  
  ; We want the default pickle-for to have the lowest precedence since
  ; it never fails.
  (prec-labels-last 'pickle-for 'default)
  
  ; We'll be labeling all our custom-iso2 contribs, even though we
  ; don't refer to all those labels. That makes it easier to customize
  ; these rules by pasting new versions into the REPL or into the
  ; application code, and it makes it possible for precedence rules to
  ; identify these rules so that, say, tables with a 'type field can
  ; have a behavior that overrides the usual table behavior.
  ;
  ; This is very similar to the Inform 7 coding guideline that all rules
  ; in libraries should be named so that they can be given exceptions.
  
  (mr:rule custom-iso2 (a b) table
    (unless (all [isa _ 'table] (list a b))
      (fail "The parameters weren't all tables."))
    (and (is len.a len.b)
         (all [custom-iso2 _.1 (do.b _.0)] tablist.a)))
  
  (mr:rule custom-iso2 (a b) cons
    (unless (and acons.a acons.b)
      (fail "The parameters weren't all cons cells."))
    (and (custom-iso2 car.a car.b) (custom-iso2 cdr.a cdr.b)))
  
  ; In order to ensure commutativity, we're going to calculate pickles
  ; for both values rather than just using the first value's pickles for
  ; both. To make an effort to ensure this doesn't make a difference,
  ; we're going to compare those pickles. Note that in the generalized
  ; 'custom-iso below, when there are ten arguments, we'll end up
  ; calculating the first argument's pickle ten times. Each of these is
  ; a potential blow to efficiency, so this implementation may miss the
  ; point of pickles altogether.
  (mr:rule custom-iso2 (a b) pickle
    (with (pa pickle-for.a pb pickle-for.b)
      (if (~custom-iso2 pa pb)
        (fail "The pickles didn't match.")
          pa
        (custom-iso2 pa.a pb.b)
        (fail "The parameters didn't have pickles."))))
  
  (mr:rule custom-iso2 (a b) is
    (unless (is a b)
      (fail "The parameters weren't reference-identical."))
    t)
  
  (mr:rule custom-iso2 (a b) default
    nil)
  
  ; We want the default custom-iso2 to have the lowest precedence.
  (prec-labels-last 'custom-iso2 'default)
  
  ; It's probably more efficient to test for (is a b) first, so we'll
  ; do that.
  (prec-labels-first 'custom-iso2 'is)
  
  (def custom-iso args
    (or no.args
        (let first-arg car.args
          (all [custom-iso2 first-arg _] cdr.args))))
  
  (def fn-pickle (type converter)
    (mr:rule pickle-for value      ; no label
      (unless (isa value type)
        (fail:+ "The parameter wasn't a " type "."))
      converter))
  
  (mac pickle (type converter)
    `(fn-pickle ',type ,converter))
Pros for your approach only:

- Yours is still the only one with hash-table lookup. Since that seems to be the reason behind your approach (versus 'extend) in the first place, 'extend and this approach just don't compete. Then again, the number of extensions is likely to be small and constant during the run of a single application, and when that's true, I think the complexity issue is moot--that it's more of a numeric issue.

Cons for your approach only:

- Among your approach, this approach, and 'extend, yours is the only one that can't "distinguish arbitrary properties," as you say. Even to fix this like you suggest, by making the hash-to-a-symbol function customizable, you only get to have one function, and any one function isn't likely to support every conceivable property at once, unless it's Turing-complete or something or it's extensible itself.

- If you try applying your approach to a multimethod with arguments of multiple types, you'll end up pickling all the arguments using the first argument's pickle, and that might not be what you want.

Pros for this approach only:

- Precedence rules can be defined among extensions. Then again, this doesn't really help unless you're already using arbitrary properties; if you only have one extension per Arc type, precedence doesn't matter much.

- This may be slightly more efficient, since the 'is test is done only once and with high precedence. It would be awkward to have both high-precedence and low-precedence default behavior under your approach or the 'extend approach, and while your approach supports the possibility that every computation-intensive extension will call 'is first itself, that makes those extensions a bit more long-winded.

- This approach defines 'custom-iso rather than replacing 'iso outright, which means that code relying on 'iso to fail for tables can still work. That said, it's really simple to apply your approach or the 'extend approach to a 'custom-iso function, and in case you think this is an anti-feature, it's just as simple to apply my approach to 'iso itself.

Cons for this approach only:

- This approach is much longer than yours or 'extend, and it relies on a fair amount of other library code to boot.

Pros for 'extend only:

- It's a natural progression of the (let old-foo foo (def foo ...)) pattern that's rather fundamental to Arc's hackability. IMO, this makes 'extend inherently easy to get the hang of as an abbreviation, even if it has additional features that make it a less lightweight concept overall, such as the ability to delete extensions.

Cons for 'extend only:

- Among these three approaches, the 'extend approach is the one for which it's most important to pay attention to the order in which extensions occur in the code (or on the REPL) for efficiency and precedence purposes.

-----

4 points by aw 5086 days ago | link

Among these three approaches, the 'extend approach is the one for which it's most important to pay attention to the order in which extensions occur in the code (or on the REPL) for efficiency and precedence purposes.

Here's my personal take on when to use 'extend vs. something else:

If there's a facility that does what you need, use it, but if there isn't, use 'extend ^_^

'extend is a hack, or rather, a tool for hacking. I mean that in a good way, I use it all the time. But pile too many hacks on top of each other and you get a mess.

So if you're extending a function in a way that can be done with generics, that's good, because now you can do your definitions in an independent order and you're not wondering what hacks are going to be messing up your hack.

-----

1 point by akkartik 5090 days ago | link

Nice exagesis, thanks.

"Even to fix this like you suggest, by making the hash-to-a-symbol function customizable, you only get to have one function"

I'm not sure I follow. I was thinking, for example, that defgeneric would take an optional arg that is type by default. So for example if you want an unserialize generic function, where all serialized types are lists, the function you pass in may be car. Or am I misunderstanding your objection?

-----

2 points by rocketnia 5087 days ago | link

Sure, that's what I thought you were saying. ^_^ What I'm trying to say is that, no matter whether the argument is 'type or 'car or something else, once you decide what it is, you've limited the cases that an extension can test for. Someone loading a library and wanting to write an extension for it is limited based on an assumption the library creator made, in a way that they aren't limited when using 'extend.

I'm finding your pickle idea sort of intriguing. What if there were only a single pickle function (or one per defgeneric), which was extensible, and then selecting a method worked by trying to convert the arguments over as little pickle-distance as possible before getting to something that could be matched to a parameter list? Hmm, what I mean is something like this:

  To see if a multimethod can be called without pickling:
    If the actual arguments match exactly one parameter list, yes.
    If they match none, no.
    If they match two or more parameter lists, there's an error.

  To call a multimethod:
    For N = 0 to infinity...
      For each choice of N argument indices, allowing duplicates...
        The pickled arguments are the arguments after pickling each
           chosen index, pickling multiply chosen indices multiple
           times.
        If any of the arguments couldn't be pickled that many times, try
           the next choice of N indices instead.
        If the multimethod can be called with the pickled arguments
           without pickling further, keep track of these pickled
           arguments.
      If we had no legal choices during our loop, there's an error.
      If we've kept track of more than one callable pickle combination,
        there's an error.
      If we've kept track of exactly one, make the call and stop.
This is rather horribly inefficient, but hopefully it gets my idea across.

Despite the fact that I'm bringing up this idea, I'm not sure I agree with it. Who's to say that two picklings of the first argument is a more extreme case than one pickling of the second argument? It's just a thought, I guess.

-----

1 point by rocketnia 5090 days ago | link

(One bug is that vtables for iso is not initialized anyplace. Exercise for the reader.)

If I may:

   (mac defmethod(name type args . body)
  -  `(= ((vtables* ',name) ',type)
  +  `(= ((or= (vtables* ',name) (table)) ',type)
         (fn ,args
           ,@body)))
Does that work for you? ^_^

-----

1 point by akkartik 5090 days ago | link

I hadn't thought of that :)

The drawback is that it's every defmethod's responsibility to make sure it sets the vtable entry if necessary.

I was thinking more of wrapping the implementation of either defgeneric or defmethod in a `(do (or= ..) ...)

-----