Arc Forumnew | comments | leaders | submitlogin
Replacing control abstractions with data abstractions
3 points by akkartik 3975 days ago | 20 comments
Fallintothis recently critiqued arc's tree utilities (http://arclanguage.org/item?id=18247) and brought to mind that I'd had similar niggles of my own.

The common lisp way to do this would be with control abstractions like subst-if, as malisper pointed out (http://arclanguage.org/item?id=18246). But like Richard Gabriel, I dislike this approach: http://www.arclanguage.org/item?id=16304.

So what to do? I want to entirely eliminate names like ontree and treewise. I can never remember the names, the order of args, and so on. Hmm, is there some way I can bring my favorite defgeneric idea to bear on the problem? Turns out there is. Let's start with the simplest way we do traversals of lists:

  arc> (each x '(1 2 3) (prn x))
  1
  2
  3
How can we extend each to support trees? A 'tree' is really just an s-expression, just like a list. So we'll just say that if the caller wraps an s-expression in type tree, we'll do an infix traversal like ontree.

  (def tree (x)
    (annotate 'tree x))

  (defmethod walk (seq f) tree   ; each is built using walk
    (let x rep.seq
      (f x)
      (unless (atom x)
        (walk (tree car.x) f)
        (walk (tree cdr.x) f))))
Now we can build ontree out of walk.

  (def ontree (f x)
    (walk (tree x) f))
Similarly, if we want to traverse just atoms like treewise (and like malisper wanted):

  (def (leaves x)
    (annotate 'leaves x))

  (defmethod walk (seq f) leaves
    (let x rep.seq
      (if (atom x)
        (f x)
        (do (walk (leaves car.x) f)
            (walk (leaves cdr.x) f)))))

  (def on-tree-leaves (f x)
    (walk (tree-leaves x) f))
I'm using arc's type system here, but I'm not creating new datatypes or 'objects'. Where variables of a user-defined datatype are created and used over potentially long lifetimes, the goal with these control-abstraction tags is to briefly wrap an underlying value in a navigation policy that the callee immediately takes off. This is an alternative to adding a new keyword arg, and I'm still mulling the pros and cons. The big advantage, I think, is that the control abstraction doesn't pollute the core definition of each primitive, but is only mentioned in methods that need it.

The definitions of ontree and its variant are just for illustration. My idea is to just use walk everywhere with appropriately specialized args. What do y'all think? https://github.com/arclanguage/anarki/commit/db172ec374. I'm considering similarly fixing find (for reclist), map (for tree-subst) and reduce (for treewise).



2 points by fallintothis 3974 days ago | link

This is an interesting approach.

I'm using arc's type system here, but I'm not creating new datatypes or 'objects'.

And here my first thought was "we've come full-circle to object-orientation". :) I mean, inasmuch as OO amounts to type dispatch (number 3 in http://paulgraham.com/reesoo.html I guess).

Where variables of a user-defined datatype are created and used over potentially long lifetimes, the goal with these control-abstraction tags is to briefly wrap an underlying value in a navigation policy that the callee immediately takes off.

I totally get this, though (despite my comment above). In effect, it's like what a more OO language might do with iteration types: Python's generators, Ruby's Enumerator, Factor's virtual sequence protocol (http://docs.factorcode.org/content/article-virtual-sequences... - honestly the most similar, because of the generic-based object system), Java's Iterable, etc.

The focus in those languages is on having a specific pattern in which iteration happens over an arbitrary sequence. Then you implement the "hooks" into this pattern in whatever specific way you need to. Which is really what's going on here, with the different walk implementations. It's just that walk is a bit coarser-grained than the others. In Python/Ruby/Java, you implement a specific method that just returns an object of a particular type---then that object implements an interface saying "hey, primitive iteration knows what to do with me". E.g.,

  Python 2.7.3 (default, Jan  2 2013, 16:53:07)
  [GCC 4.7.2] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> class c:
  ...     def __init__(self): pass
  ...     def __iter__(self): return (x for x in range(5))
  ...
  >>> for x in c(): print x
  ...
  0
  1
  2
  3
  4
Factor's a bit different, and I think we might learn something from it. Like other CLOS-alikes, its object system isn't based on classes that own methods. So rather than returning a specific "iterable" object, the sequence iteration functions are written to use a handful of very specific methods: http://docs.factorcode.org/content/article-sequence-protocol... Then, any object that implements those methods works with map, reduce, each, and the whole gang: http://docs.factorcode.org/content/article-sequences-combina...

There's a difference in granularity. In Factor, the required methods are more primitive than walk: just length and nth. Then map/each/etc. simply loop over integers, indexing into the sequences. So, where Arc would have some ad hoc thing like

  (def map (f seq)
    (if (isa seq 'string)
        (map-over-strings ...)
        (map-over-conses ...)))
Factor would be more like

  ; rely on (len seq) & (seq i) being well-defined

  (def map (f seq)
    (for i 0 (len seq)
      (let x (seq i)
        ...)))
With walk, you have to implement the whole iteration shebang---and even then it's only used in each right now, not the whole suite of sequence functions.

The obvious problem with Factor's exact approach as applied to Arc is Arc's reliance on linked lists. Each index access is O(n), which would turn any iteration (map, each, whatever) over a linked list into an O(n^2) affair. So just defining something like walk might ultimately be preferable; and you could still surely implement map/reduce/keep/etc. in terms of it. (Even the string versions could be reworked to walk over the string characters. And for the love of everything, do away with general functions that hard-code recursion on cdrs, like vanilla Arc's counts! I see Anarki's already done that, but still.)

That whole digression aside, I think the actual pattern you demonstrate here kind of gives me some warm fuzzies. With the realization that ontree & company are just higher-order functions like map at heart, it liberates us from writing a million names for the same actual operation. To keep harping on Factor (it's the most Lisp-like non-Lisp that I know...), the same thing happens, where instead of a reverse-each function, you just have

  IN: scratchpad { 1 2 3 } <reversed> [ . ] each
  3
  2
  1
and instead of using some each-integer looping function (though that does exist, it's an implementation detail), you just have

  IN: scratchpad 5 iota [ . ] each
  0
  1
  2
  3
  4
I'm liking this idea even more now, since I've realized that it has parallels with my feelings about the Scheme standard using <, char-ci<?, char<?, string-ci<?, and string<? instead of just <. It's just that lists vs trees was a more subtle distinction.

-----

1 point by akkartik 3974 days ago | link

Thanks for the parallels with Factor! I love Factor too, though I'm less knowledgeable than you.

I'm going to keep an eye out for lower-level primitives to genericize.

-----

2 points by akkartik 3975 days ago | link

I just did a bunch of cleanup on anarki:

1. defgeneric now dispatches on a condition rather than on a specific type. It's a little more verbose but more flexible. I'd already done this in wart almost 3 years ago (http://arclanguage.org/item?id=13790), just been too lazy to update anarki.

2. I've gotten rid of defgeneric; any def can now be generic. This is nice because I don't have to do the whole dance of bootstrap def then unset then defgeneric then defmethods. There's no more code that will get silently overridden.

3. I've merged defmethod and extend into a single name since they now do essentially the same thing. For the new name I'm trying out defextend, but I'm not attached to it.

4. I deleted a couple of libraries that had suffered bitrot or were obsoleted by newer/simpler ideas. They haven't been touched for several years, but feel free to bring them back.

5. As outlined above I've deleted ontree. Use each or walk instead.

6. I've deleted treewise and replaced its calls with reduce.

7. I've deleted tree-subst and replaced its calls with subst, which in turn relies on extensions to map.

All this feels a lot more coherent: https://github.com/arclanguage/anarki/commit/5f26254e7d#diff...

The changes: https://github.com/arclanguage/anarki/compare/1b335cd6ce...5...

-----

1 point by malisper 3975 days ago | link

Even though I think this is a good idea, I'm pretty sure there is better way to do it. I feel like your solution is only a solution to a very specific problem. Your solution is almost the exact same as keywords, only its not. You are just adding extra information into the function call (the type of argument). Instead we could use keywords for a solution to this, and as a solution for many other problems for which this kind of scheme wouldn't work.

I do like the idea of replacing a ton of functions with functions that do different things depending on their arguments.

-----

1 point by akkartik 3975 days ago | link

Are you thinking of some other way to define walk while postponing support for trees, leaves, etc.?

-----

1 point by malisper 3975 days ago | link

Could we do something analogous to defmethod but goes based off of a keyword?

  (def walk (seq) :tree
    ...)

  (def walk (seq) :leaves
    ...)
And then call it like:

  (walk x :tree)

-----

1 point by akkartik 3974 days ago | link

I thought about it some more. Here's why I think the wrapping approach is superior to a new arg. Let's say you define a function on lists:

  (def foo (x)
    (... x))
Later you want to use foo, but traversing nested lists like it's a tree. Ok, so we'll add a new version:

  (defextend foo (x y) (is y :tree)
    (... x y))
But now calls to the base version will stop working; defextend passes both args to the base version which is only expecting one arg. To make things work you have to modify the base case, and teach it about an arg it doesn't need to know about. That seems unnecessarily ugly.

Another problem is the need to sequence args. If you create a new version with yet another arg:

  (defextend foo (x z) (...)
    (... x z))
Now you have to worry about the second arg being used for wildly different uses, possibly interfering with each other and causing bugs, or having to provide unnecessary args:

  (foo a nil :widget)  ; don't traverse a like a tree
Edit: ah, I just thought of one counter-argument. If you have multiple tree arguments you have to wrap each of them in my approach. This is more apparent when trying to support dotted lists in map (fallintothis's original complaint: http://arclanguage.org/item?id=18247)

  (map + (dotted '(1 2 3)) (dotted '(1 2 . 3)))
---

Now, arc could be more flexible. My wart language (http://akkartik.name/post/wart) is more like javascript: excess args are silently ignored, and missing args are silently passed nil. Also, it has keyword args, so different versions might be called as:

  (foo a :traverse-as tree)
  (foo a :like widget)       ; don't traverse like a tree
In such a situation the difference seems more minor.

-----

2 points by malisper 3974 days ago | link

The thing is, I don't want a new arg. I think we would be better of using keyword arguments, similar to ones as you mentioned in wart[1]. The reason I'm making such a big deal out of this is why create a new solution (tagging the arguments as they are passed in) when we can already can do this with keywords which are much more flexible. With keywords we can call them with anything we want: a list, a symbol, or even some sort of complex data structure.

  (foo :a 'tree)
  (bar :b '(1 2 3))
  (baz :c (obj a 'b c 'd))
We can use keywords to pass in many different kinds of arguments. If we were to just tag the args, we could pretty much only pass in different types, not much else.

  (foo (tree x)) ; how would you change this from a type to a number?
We can also implement them in such a way that when a function is called with extra keyword args, it won't change the behavior of the program (we would probably want it to log some kind warning). This way when we extend foo, it doesn't matter if we pass the original the keyword arg. As far as the original foo knows, the keyword arg doesn't even exist.

Sorry if I confused you. I don't want extra args, I want keyword args.

[1] https://github.com/akkartik/wart/blob/5d0a90782f7b4b7b8c4589...

-----

1 point by akkartik 3974 days ago | link

That makes sense! I've long been a proponent of keyword args as you can see. My love for them goes back all the way to one of my first posts here: http://arclanguage.org/item?id=10692. I think I've implemented them three times so far. I even have a version of arc with them: http://github.com/akkartik/arc. But I haven't updated it lately. I'll spend a few hours and merge in all the new stuff in anarki over the past year.

However, I highly recommend Richard Gabriel's critique of what he calls control abstractions: http://dreamsongs.com/Files/PatternsOfSoftware.pdf; pg 28-30. Hmm, I should go reread it as well.

-----

2 points by malisper 3974 days ago | link

That's great! I just googled "Paul Graham arc no keywords" and found in one of his early writings on arc, he had a plan to replace keywords with something called "get parameters".

http://paulgraham.com/arcll1.html

-----

1 point by akkartik 3974 days ago | link

Ok, I've updated http://github.com/akkartik/arc. It was a useful exercise, because it turns out it has more tests than anarki, and I found several bugs in my work yesterday :) I've copied the tests to anarki as well.

-----

1 point by akkartik 3973 days ago | link

I tried to build a new optional keyword arg for each to mimic ontree, but ran into some problems. Consider the definition of each:

  (mac each (var expr . body)
    `(walk ,expr (fn (,var) ,@body)))
It doesn't really need to know about a new arg, but there's no way to pass it in automatically because there's already a rest arg: body. Ok, so let's try that:

  (mac each (var expr ? as nil . body)
    `(walk ,expr :as as (fn (,var) ,@body)))
The problem with this is that you need to explicitly provide the optional arg, or it gets taken out of body.

  arc> (prn:macex1:quote (each x '(1 2 3) x))
  (walk '(1 2 3) :as x (fn (x)))               ; wrong

  arc> (prn:macex1:quote (each x '(1 2 3) :body x))
  (walk '(1 2 3) :as () (fn (x) x))
I had this old idea that rest params should take precedence over optionals, but it turns out to have fallen by the wayside: http://www.arclanguage.org/item?id=13083. Time to bring it back here.

-----

2 points by malisper 3973 days ago | link

Well, the problem is you are trying to make keyword args the same as optional args. It might be better if you make make keyword args and optional args two completely separate kinds of arguments (similar to how common lisp has both &optional and &key). This way you would define each as:

  (def each (var expr : as nil . body) ; I used : to represent keyword args
    `(walk ,expr :as as (fn (,var) ,@body)))
This way calling each like

  (each x '(1 2 3) x)
will expand correctly because keywords only take a value when they are passed in explicitly. It would have to be called as

  (each x '(1 2 3) :as x)
to get the error you had. If you need to define a function that takes both optional args and keyword args:

  (def foo (a b c ? g h i : j k l . rest)
     ...)
This way a-c are required, g-i are optional, j-l are keyword, and rest is everything else.

As for the problem you had with needing to redefine each (it is also mentioned by rocketnia in the post you linked[1]), each is just an interface for walk, so if a new feature is added to walk, each need to be redesigned. There might be some way that we can define some sort of macro, interface, which can be used to create a function/macro which is a very thin layer over another function/macro. As for how interface is to be designed, I'm not sure.

[1] http://www.arclanguage.org/item?id=13090

-----

1 point by akkartik 3973 days ago | link

Well, it seems more general to let any arg be acceptable as a keyword.

rocketnia's concern I addressed by passing through unknown keyword args unchanged:

  (def foo (a b)
    (list a b))

  (def bar args
    (apply foo args))

  arc> (bar 3 :a 4)
  (4 3)
But that doesn't work in this case because each needs to treat as specially. I don't think that's avoidable.

We _could_ include a new category of 'required' keyword args like you suggest, yes. Let me know if you think of a situation where just giving rest args precedence isn't enough. Things are already pretty complicated with rest, destructured, optional, and keyword args, so I'm reluctant to add another delimiter and category.

-----

2 points by rocketnia 3970 days ago | link

"rocketnia's concern I addressed by passing through unknown keyword args unchanged"

The reason I'm okay with this approach is because it localizes the complexity to each function. This way I don't dislike the language; I just dislike the idioms currently being used in it.

Consider the use of flat text formats for shell scripts, HTTP headers, CSS properties, XML attributes, etc. They're usually convenient enough to type out manually, and that also makes them convenient to generate in simple cases, but writing a parser for them can be tough due to the same syntactic sugars that made them convenient to generate in the first place. Writing a full-featured generator or parser can also be tough, because sometimes future versions of the text format may allow it to encode more information.

With keyword arguments, the syntax for a function call is more complicated and sugary, putting programmers like me in the position of wondering what kind of headaches they should go through to parse or generate this format. If the headaches are unclear, combining programs becomes hard, because each developer might opt for different headaches, at which point I think they overlap in the form of complex glue code. :)

-----

2 points by akkartik 3970 days ago | link

I have a greatly-enhanced respect for this perspective over the past month. Look at my travails over the past day trying to get optional, keyword and destructured args to play well together: https://github.com/akkartik/arc/compare/342ebf57a8...e347973.... Or my travails over the past month trying to build macex in wart in the presence of optional, keyword and destructured args AND quoted AND aliased params AND already-eval'd args[1] AND incompletely-eval'd args[2]: https://github.com/akkartik/wart/compare/30025ee25c...cabaa0.... Clearly there's some point at which I jump the shark :)

[1] related to apply for macros: http://arclanguage.org/item?id=16378

[2] related to partial eval: https://github.com/akkartik/wart/commit/8239ed9d21. I've been struggling intermittently with partial eval for two years now. This now year-long attempt might well be utterly useless.

---

Perhaps the easiest to understand and most serious extant example of the knots I end up tying myself into:

  arc> (iso :x :x)
  nil
This is because x is one of the params of iso. Even more serious, this happens even if you try to compare a variable that evaluates to :x. Or a variable that evaluates to a large structure containing :x somewhere in it. Ugh! The reason for this is that I'm extracting keyword args at function-call time inside each function. This:

  (def foo (x) x)
compiles down to something like:

  (define (foo allargs)
    (let ((keyword-args  (extract-keywords allargs '(x)))
      ..))
It took me three years to notice yesterday that I'm extracting keywords from post-evaluated arguments. Ugh! Urrrgggh!

Weird thing is, even wart has this problem. Even though I would superficially seem to be extracting keyword args pre-evaluation. I still need to debug this. Or perhaps I shouldn't debug this. Perhaps I should just throw up my hands and delete all my repos and become a hermit.

-----

1 point by akkartik 3970 days ago | link

Ok, my repo now supports both formats:

  (each x '(1 2 (3 4)) :like 'code prn.x)

  (each x (code '(1 2 (3 4))) prn.x)
We'll see over time if one of the options is superior.

-----

1 point by malisper 3975 days ago | link

We should be able to specify the type of the function as part of the function. We shouldn't need to specify it in the arguments. I just think of it as we are walking over a tree as opposed to calling walk on an object that is a tree.

What I'm trying to say is the keyword arg shouldn't apply to a specific argument. It should apply to the function call as a whole.

-----

1 point by akkartik 3975 days ago | link

Hmm, maybe. But then it's not clear what arg the keyword args apply to. Compare

  (walk x f :tree)
and

  (walk (tree x) f)

-----

1 point by akkartik 3974 days ago | link

As an example, consider all the ways you can traverse an s-expr:

  arc> (each x '(1 2 (3 4)) prn.x)
  1
  2
  (3 4)

  arc> (each x (tree '(1 2 (3 4))) prn.x)
  (1 2 (3 4))
  1
  (2 (3 4))
  2
  ((3 4))
  (3 4)
  3
  (4)
  4
  nil
  nil

  arc> (each x (leaves '(1 2 (3 4))) prn.x)
  1
  2
  3
  4
  nil
  nil

  arc> (each x (code '(1 2 (3 4))) prn.x)
  (1 2 (3 4))
  1
  2
  (3 4)
  3
  4
The last addresses fallintothis's bug with ifs-with-do: http://arclanguage.org/item?id=18247. See https://github.com/arclanguage/anarki/commit/74ba10b700.

-----