Arc Forumnew | comments | leaders | submitlogin
Suggestion: constructors
7 points by rkts 4374 days ago | 14 comments
Often we expect an expression to evaluate to one of a finite set of possibilities. E.g. a list is either a cons or empty. The result of a search is either an object or a notice of failure. A tree is a red node, a black node, or empty. Etc.

Since in many (perhaps most) cases the set is "either a value or another thing," we can represent that "other thing" with nil. But this is not a general solution, and it has some problems. For instance, a search function that returns nil on failure is ambiguous if the collection being searched might contain nil. And similarly a function parameter that defaults to nil is ambiguous if the caller might specify nil.

What we need is a way to take a value and attach to it some kind of tag to distinguish it from other values. Arc offers many ways to do this; what's missing is (a) a conventional way of representing it, and (b) a concise syntax that abstracts the details of the representation.

I propose the following macro:

  (mac defcon (name . args)
    (let gargs (map [uniq] args)
      `(def ,name ,gargs
         (annotate ',name (list ,@gargs)))))
defcon generates a function that collects its arguments (the number of which is fixed by the number of forms after name) into a list and tags them with annotate. I'm calling such functions 'constructors' because they are analogous to data constructors in ML; suggestions for a better name are welcome.

We can now define a find function that will always distinguish between success and failure.

  (defcon have x)
  (defcon none)

  (def myfind (f xs)
    (if (no xs)
      (if (f (car xs))
        (have (car xs))
        (myfind f (cdr xs)))))
To use myfind, we need a match macro.

  (def foldr (f init xs)
    (if (no xs)
      (f (car xs) (foldr f init (cdr xs)))))

  (mac match (expr . cases)
    (w/uniq (gexpr)
      `(let ,gexpr ,expr
         ,(foldr (fn ((p e) rest)
                   (if (alist p)
                     `(if (isa ,gexpr ',(car p))
                        (let ,(cdr p) (rep ,gexpr) ,e)
                     (isa p 'sym)
                       `(let ,p ,gexpr ,e)
                       (err "match: malformed pattern:" p)))
                 `(err "match failure on" ,gexpr)
                 (pair cases)))))
We can now safely search for a list of length < n, something we couldn't do with the old find.

  (= lists '((1 2 3 4) (1 2 3) (1 2) (1) ()))

  (= shortlist [< (len _) 3])

  arc> (match (myfind shortlist lists)
         (have lst) `(the-first-short-list-is ,lst)
         (none)     '(there-is-no-short-list))
  (the-first-short-list-is (1 2))
This can be further simplified with the equivalent of aif:

  (mac ifhave (expr thenexpr . elseexpr)
    `(match ,expr
       (have it) ,thenexpr
       (none)    ,(if (cdr elseexpr)
                    `(ifhave ,@elseexpr)
                    (car elseexpr))
       x         (err "ifhave: input is not a have or a none:" x)))

  arc> (ifhave (myfind shortlist lists)
         `(the-first-short-list-is ,it)
  (the-first-short-list-is (1 2))
Notice that this is just as concise as the existing Arc idiom, but it is more general and robust.

We can use this for optional parameters too. If a parameter lacks a default value, it will be (have x) when a value x is passed, or (none) otherwise. Supplying a default value y for a parameter x will then be equivalent to omitting the default value and wrapping the function body in (let x (ifhave x it y) ...).

What do you guys think?

3 points by kennytilton 4374 days ago | link

I did not understand why there was a problem with find when nil appears in the searched list, especially in your example where it appeared last. Sorry if it is obvious, have not had my coffee yet. In your example, even if we move nil to the first position, isn't nil the first short list? It's length is zero. If you meant "nils don't count!", my question is still why not? because in my experience (13 years) they always do. But if you think this is a problem, then the only problem is that you did not say what you wanted in your test:

  (find [< 0 (len _) 3)] lists)
If you need something more general to exclude nils:

  (find [when _ (even (len _))] lists)
I will pause at this point to observe that you said your sample ifhave usage was "as concise as the existing Arc" but I missed the existing Arc, which might have helped me understand where you see a problem.

As for optional parameters, there you have a case! I almost never need it in CL, but one can do what in arc would be:

  (def my-fun ((o my-arg nil my-arg-supplied))
      (if my-arg (yadda 1)
          my-arg-supplied (yadda 'the-client-said-no)
          (yadda 'default-yadda-please)))


4 points by cchooper 4374 days ago | link

I'm think the problem he's referring to is:

  (find [< (len _) 3] '((1 2 3 4) (5 6 7 8)))
  => nil

  (find [< (len _) 3] '(nil (1 2 3 4) (5 6 7 8))
  => nil
Does Arc have a simple solution to this that I'm missing (other than reimplementing find that is)?


3 points by kennytilton 4374 days ago | link

Oh. We had the same question in re hash tables where one might want to distinguish between a key stored with the value nil and a missing key, not possible with Arc since it does not store the key when the value is nil, and pg replied that not being able to distinguish the two was not an issue in practice. Perhaps the same "show us the use case" rule applies here?

Otherwise, just use mem?


1 point by rkts 4374 days ago | link

As I said, there are plenty of half-baked solutions to this problem, and mem is an example; it distinguishes two possibilities by returning either a cons or nil. What I'm looking for is a general abstraction that obviates the need for such kludges.


5 points by kennytilton 4374 days ago | link

You say "kludge", I (and apparently pg) say "you do not have a use case, this is an imagined problem" and one should not build languages around imagined problems, you end up with Java. I asked for a use case and you have not supplied one. I am not saying one does not exist, I am saying any discussion of a "fix" is meaningless until we have a target.

Consider again the very similar case of Arc tables being unable to differentiate "key not found" from "key associated with nil". I was halfway thru a brilliant retort to pg on this with an example from my own real live working code (where I used CL's ability to differentiate the cases) when I realized Doh! I totally did not need the capability (and man was I happy I realized that before trying to school pg <g>).

The danger in posting a use case is that all the ones you offer will turn out to be better handled some other way. Then you will know why your bloated fix is not in Arc.


4 points by rkts 4374 days ago | link

I knew someone was going to ask for a use case, which is why I gave an example that (I thought) was so basic as not to require one: find a list of a given length. If you can't solve such a simple problem simply, that indicates to me a fundamental weakness.

Besides the kludginess of mem, consider how easy it is to screw up with find. How do you verify that a list doesn't contain any nils? Without a type system to enforce such invariants you get the worst kind of bugs: those that are manifest only once in a blue moon and present no obvious diagnosis.

That similar problems arise with other uses of nil suggests to me that we're missing an abstraction.

Let me turn this around. Is there any code that would become more verbose with my suggestion? You may think it only helps in weird edge cases (which I dispute), but what do we lose by incorporating it?


4 points by kennytilton 4374 days ago | link

<cough> No, the /code/ you offered was looking for a list with less than a given length with the understanding that nils might be in the list. The test for a list without nils is:

   (all idfn lst)
Is it a kluge to state the test positively? Actually, negative tests are understood to be harder on the cognitive system to get right.

Meanwhile, are you sure (member nil lst) is a kluge? It sure looks readable, and it sure does work. What exactly bothers you about that?

Meanwhile, a use case means your are not just running code samples you typed in at the repl, it means you are trying to solve a problem using a computer and something came up. So here we are after iterating over the members of the library and we built up a list of overdue books per member -- no, that would not have nils, that would have an assoc entry (bob . nil). Hmmm... can you help me here, I cannot think of one.

You have not responded to the very similar example of tables. Correct, one cannot distinguish the absence of a key from a key added with the value nil. A use case is not a couple of lines of code proving that, a use case is a lookup on drivers license number during a traffic stop and...?

My point you may have missed is that without a use case not perfectly well expressed with either of the two solutions I offered above, the fundamental weakness is in offering a theoretical functional gap in a language as if it were a gap in a language.

Recall that pg is already on record as wanting things that are problems in practice, that this is a hacker's language, not a strait jacket language like C++ or Java with all sorts of theoretical angst over untyped variables.

The problem, btw, of putting an air bag in the trunk of a car is the added weight. :)


1 point by byronsalty 4373 days ago | link

In practice I imagine one would just use a known symbol like 'removed mid list/tree instead of blindly using a nil. But this is much less fun.


2 points by kennytilton 4372 days ago | link

Puzzled. Do you mean the nils would arise because non-nil entries would be changed to nil as they were (somehow) processed? Just delete the cons:

   (= (cdr prior-kill-me) (cdr kill-me))
instead of:

   (= (car kill-me) nil)
Otherwise you are forever iterating over the now irrelevant cons,


3 points by cchooper 4374 days ago | link

The downside of doing that is that you can't do nice simple things like

  (when (find 1 '(2 3 4)) (prn "Found 1"))
You would have to use match instead, but I agree there will always be cases where you do need that option.

I think defcon and match are a great idea, although I don't think you need all that power for what you're doing. Returning a uniq that indicates failure would be just as effective. defcon would be good as a standard way to create new types, and because they'd all work with match you'd gain a lot of power by following that convention. This kind of destructuring is one of ML's best features and I was temped to implement it for Arc myself.


1 point by absz 4374 days ago | link

I'm fairly inexperienced with pure functional programming à la Haskell and ML, but isn't this a weaker variant of tagged unions? Tagged unions are (in my mind) fantastic, but what you have allows only one tagged union in the entire program. It would be nice to have Haskell's "maybe" union, which could be (just x) or (nothing). Then you could also have, say, a binary tree type* which could be (node x) or (branch btree-left btree-right), and they would be different things.

* Yes, I know that lists can represent binary trees. Yes, that's often better. However, I needed an example.


1 point by offby1 4361 days ago | link

For what it's worth, mzscheme's "hash-table-get" solves this by giving you an optional "default" value -- if you omit the value, then you get an exception; if you specify the value, then you just get that value -- unless the value is a procedure, in which case it calls that procedure. The procedure can raise your favorite exception, or perhaps _insert_ something into the hash table, or ... anything you want. This seems like a nice balance between simplicity and flexibility.


1 point by vrk 4374 days ago | link

This reminds me of Perl 6, where you can return "undef but true" (to indicate a nil value that still evaluates to true in boolean context, roughly speaking), or Haskell Maybe monad. Don't ask me more about those; I don't really understand monads.


1 point by dark 4374 days ago | link

Yeah! I was going to suggest exactly this. I program daily in OCaml and I find extremely useful this way of pattern matching.

I would also suggest a way of doing static typing in parts of the code. I will explain: ML pattern matching usually leads to situations when all cases are covered, and there is no margin for error. that is, if I have a way to specify that ifhave receives only (have x) or (none) for some x, there is no way to reach the error condition