Arc Forumnew | comments | leaders | submitlogin
A curated arc repo
3 points by akkartik 4895 days ago | 7 comments
http://github.com/akkartik/arc

Could y'all run your client code on it and tell me what stuff from your repos or anarki you find yourself missing? I want to leverage everything people have submitted to this forum and to anarki over 5 years, but only tricks and helpers you actually use.

== What it has right now

0. rntz's extensible coerce.

1. aw's $.

2. Generics. I've tried to use generics everywhere in arc.arc that I can[1]. An interesting consequence of this: defgeneric now dispatches on its _final_ arg. This allows me to simplify some, keep, rem, any.[2]

3. Lists no longer end with nil. You can still say nil but it basically means (). I've been using this for months without issues - you just see () at the repl instead of nil. The major incompatibility this introduces is that () is no longer a symbol, which in turn gets rid of the bug around downcase (http://arclanguage.org/item?id=10793) and fixes upcase differently than pg did.

I wasn't sure what type to make (), and left it at (). In both lisp and scheme the type of nil is a specialcase. Since () isn't a cons, (coerce () 'cons) now actually returns a non-cons. I'm more uncertain about this, so let me know if it feels like a gotcha. While I was building defgeneric I ran into an infinite loop repeatedly trying to coerce () to cons.

[1] Basically I went through looking for "(if (alist" as a marker for ugly function implementations that try to specialcase string processing.

[2] I'd like to make map generic but it's a weird case. It takes a variable list of args, but not all of them are sequences. Looking at the code it seems there's only one case where we actually use this 'feature' - in the definition of rotate. I'm mulling turning map1 into map. Thoughts? Reactions? Howls of outrage?



3 points by waterhouse 4895 days ago | link

'sort dies:

  arc> (sort < '(2 3 1))
  Error: "reference to undefined identifier: _copylist"
I think (coerce () 'cons) should throw an error. If you were to make () into a cons cell, what would you put in it? Arc and Common Lisp agree with me:

  * (coerce () 'cons) ;SBCL

  debugger invoked on a SIMPLE-TYPE-ERROR:
    The length requested (0) does not match the type restriction in CONS.
Regarding nil vs ()... I'm guessing you switched to () for ease-of-implementation reasons. For example, currently some Scheme primitives like hash tables that contain Arc lists get printed with a bunch of (blah . nil), and the lists created in rest parameters are ()-terminated and considered different from nil-terminated lists by hash tables.

First, here's a list of my recommendations: (is nil 'nil) -> t [that might be hard; have to compile (quote nil) into ()], (type nil) -> 'sym [easy], make nil get printed as "nil" [that might be hard, though I wouldn't demand perfection immediately], and [as mentioned above] (coerce nil 'cons) -> error [easy]. In other words, make 'nil operate exactly as it does in official Arc.

I will now talk in great detail about the semantics, implementation, and usage of nil.

nil is defined to be the empty list. In Scheme, that's all nil is good for (they call it null, actually, and write it as ()). I'm going to reject this system with some brief comments about why it sucks: it's conceptually wasteful (you have to invent a whole new category of object, instead of using a symbol), and it makes things horribly and unnecessarily verbose. I imagine most users of Arc feel the same way. Related humor: http://pastebin.com/raw.php?i=TQNh668k

In Common Lisp and Arc, nil is represented as the symbol 'nil. But here's an idea: why not make nil be a cons cell whose car and cdr point to itself? That seems like a very elegant way to terminate lists; then you don't need to invent anything new or give special status to a certain symbol. Also, it makes (cdr nil) = nil, which I find nice and useful. I think there is historical precedent for this--if nothing else, it's how I implemented linked lists in C recently.

There's a problem, though. Cons cells are used for things other than flat lists, like trees or argument lists. Given the other uses of nil, as boolean false and as the symbol 'nil (which I like very much and take as a given), you are likely to have nil as an element of one of these structures. But if you're treating nil as an element of a structure built of cons cells, then you really don't want (acons nil) to be true.

In Common Lisp and Arc, (cdr nil) = nil, and (acons nil) [or (consp nil) in CL] is false. This suggests that nil is some sort of trick object that acts like a narcissistic cons cell when you ask for its cdr, but an atom (specifically, a symbol) when you ask for its type. This brings to mind two questions: 1) What is nil, and 2) Do we want a trick object like that in our language?

To answer (1), it really doesn't matter what nil is, but rather what interface the programmer sees. The programmer sees that nil prints to "nil", it is treated as false in conditionals, it terminates lists, (car nil) = (cdr nil) = nil, (acons nil) is false, it is equal (by the "is" comparison) to the symbol 'nil, and (type nil) -> 'sym. (I think that's everything.) If all these things are true, then nil could be internally represented as the string "ACHTUNG"--or, in your case, as the Scheme empty list ()--and the programmer wouldn't know it and wouldn't care.

I answer (2) with a definite yes. Here's a train of thought: When you deal with trees, nil is the symbol 'nil and isn't a cons. When you deal with lists, nil really should be a self-directed cons cell. The cdr operation is implemented so that it special-cases, saying "If this is nil, then return nil, otherwise extract the cdr of this cons cell and follow the pointer".

But when your program is compiled into perfect machine code (by type inference or help from the programmer), in the list case, it will represent nil as the self-referencing cons cell, and it will generate a raw unchecked cdr that just says "Extract the cdr of this cons cell and follow the pointer". In the tree case, it will represent nil as a symbol, and it will still generate a raw cdr, because the cdr will only be used on cons cells; and the function 'acons won't need to special-case "Check if this 'cons cell' is nil, and if it is, return false anyway".

nil is an abstraction over a couple of different internal representations of things that English speakers could call "nil", in the same way that Lisp integers are abstractions over the distinct "fixnum" and "bignum" representations of mathematical integers. It's little more of a trick for cdr to dispatch on nil when it might be represented as a symbol, than for + and * to dispatch on numbers which might or might not be (and whose sum and product might or might not be) fixnums.

So nil isn't a trick object: when you're working with lists, it is the solid, reliable terminator of lists, and in other contexts, it is the solid, reliable symbol 'nil. I recently wrote some code to handle polynomials, and nil was the same as the polynomial 0. I also played with streams a long while back, and I used nil as the empty stream.

This model of nil breaks down only when you try to deal simultaneously with two or more categories of object, which would each have their own kind of nil. More specifically, when you perform an operation that could return either of two or more kinds of nil. For example, (read) may return nil because it read the symbol 'nil or because it encountered an EOF character. These cases are relatively rare, and they're not hard to deal with when they do come up. And when you do deal with them, you probably write the same code you would have written had you been forced to do it from the start.

The read functions generally carry an optional "eof" argument, which you can bind to whatever sentinel value you like (such as a gensym, or the fail* global variable). Hash table lookups carry an optional "if not found" argument. I think these things model a good way to deal with nil: provide it as the default, with an easy way to make it more specific.

(At the moment, hash tables can't hold the value nil anyway, as (= ht.x nil) wipes the entry for the key x in the table ht. The optional thing is probably more for convenience, as in (++ (count-tb x 0)).)

See also: http://unintelligible.org/onlisp/onlisp.html#SEC101

-----

1 point by akkartik 4895 days ago | link

Thanks waterhouse. I'll fix the sort bug and change coerce and think more about nil. If anybody has code examples where the current nil would be non-optimal, let me know.

Update: I've fixed (is nil 'nil) locally and will push it out. But ''nil still isn't nil. It seems this is the same on sbcl. Is this desirable or a wart?

-----

2 points by waterhouse 4894 days ago | link

That is desirable. ''nil is the same as '(quote nil), and it really should evaluate to the list (quote nil), which will be distinct from the atom nil. 'nil should be the same as nil for approximately the same reason that t is the same as 't--because t is globally bound to the symbol 't. (It doesn't actually work the same way with nil, as the compiler treats it specially instead of having a global binding, but it looks the same to the programmer.)

Also, the definitions of x-set-c[ad]r in ac.scm check for set-c[ad]r! every time they are called. I talk about this and how to fix it here: http://arclanguage.org/item?id=12292

-----

2 points by akkartik 4894 days ago | link

Thanks, I'll add this. All your previous suggestions are in except changing type.nil, but I've been holding back on pushes because my local repo's been unstable for the past 36 hours.

-----

1 point by akkartik 4893 days ago | link

Update: pushed.

-----

1 point by rocketnia 4893 days ago | link

IMO, the cleanest way to handle nil is for it to be a separate type and for 'car and 'cdr to be generic (or at least ad-hoc generic), defined for the nil type as well as the cons type. This way, when extending a generic function, you don't have to put the empty list behavior and the symbol behavior in the same branch (which would make it harder for a programmer to extend a list function for symbols or vice versa).

Hmm, it sounds like that's what you had before waterhouse's input. :-p

In Lathe's generic/inheritance utilities (orc/orc.arc and orc/oiter.arc), I base the inheritance graph on a new type function, 'otype, mainly so that I can have a different type for nil. Because of the inheritance system, I also get to have a list type that both the nil type and the cons type inherit from, so that I can extend every list case in a single branch.

As for (coerce nil 'cons), I think that's the wrong API. It should shave some tokens to become listify.nil, and it shouldn't raise an error. Who really needs to coerce things into nonempty lists? ^_^

-----

3 points by akkartik 4893 days ago | link

Hmm, it sounds like that's what you had before waterhouse's input. :-p

:) I think I still have that behavior. All I did was to fix coerce and 'nil. You can defgeneric without the acons check and defmethod on nil (or vice versa). It feels more 'pure', but I don't have any examples where it's a big win. It's _truly_ a win that you can simply replace def with defgeneric and get a superset of behavior without losing compatibility.

-----