Arc Forum | I'm finding Arc's type system to be kind of random. I'm not sure has-a vs is-a ...

Arc Forum

7 points by kens1 6658 days ago | link | parent

I'm finding Arc's type system to be kind of random. I'm not sure has-a vs is-a is the solution, but here are my issues:

a) Lots of operations operate on "sequences" (tables, lists, and string). Lots of other operations operate only on lists, or only on string and tables. There doesn't seem to be any real pattern. There also doesn't seem to be any fundamental "sequence" abstraction that supports implementation of more complex operations. It's more like "if list, do this. If string, do that. If table, do that."

b) Lots of string operations treat a list of characters and a string interchangeably. Lots of operations don't. Again, it seems pretty random.

c) Non-fundamental types, such as queues (http://www.arcfn.com/doc/queue.html) don't interoperate at all. That is, a sequence operation such as map will do entirely the wrong thing to a queue.

I don't know what the solution is, but it seems like there are some type abstractions struggling to escape from arc.arc..

1 point by sacado 6658 days ago | link

I guess these issues will eventually be fixed. These are only bugs, not design issues.

As for is-a vs. has-a, I was wondering, isn't the biggest mistake considering that a piece of data only has one type ? That is, isa & type are broken ?

Considering values have a list of types (instead of a single type) would fix a few problems. For example, what is the type of nil ? Is it a symbol or a list ? Yes, it is both, but to do so, pg had to define the alist function ! (type nil) returns sym and cannot let you know nil is a list. What is '(1 2 3) ? Is it a cons or a list ? It is a list, but its type is only cons. type should return a list (it is already possible through annotate) and the definition of isa should be :

  (def isa (x typ)
    (mem typ (type x))

That way, we would have

  (type nil) -> '(list sym)
  (type '(1 2 3)) -> '(list cons)
  (type '(1 . 2)) -> '(cons)

And the system would remain very generic (even more than now). Thus, almkglor's scanners would work with each (provided the definitions of car and cdr were adapted) :

  (type s) -> '(cons scanner)
  (isa cons s) -> t

Other representations would be possible, including the ones we already discussed here. The name isa does not necessarily mean we have an is-a semantic (and not has-a). Once again, each item in type is an annotation and can mean many things that are left to the programmer (it can be a class, a class type, a bunch of functions, whatever).

But the "objects-have-one-type" model seems broken. At least for lists, which is quite annoying for a Lisp !

-----

2 points by almkglor 6657 days ago | link

"objects-have-one-type" model helps with nex3's defm:

  (defm something ((t f scanner))
    (do-something-on-scanner f))
  (defm something ((t f cons))
    (do-something-on-real-cons-cell f))

In this case, if we pass an object that is-a scanner and is-a cons, which method gets called?

This is the main reason I'm advocating is-a and has-a separation. We can say that an object is-a 'cons cell and has-a scanner. If something requires that an object is-a real, element-pointer-and-next-pointer 'cons cell, as opposed to somethingthing that requires that an object has-a 'car and 'cdr, we can make the distinction.

So we can say that an object is-a 'cons cell - it's what it really is, what it's implemented with. However, a 'cons cell has-a scanner interface, and if it's proper it has-a list interface, etc.

Separating is-a and has-a could also be useful for optimization of basic parts.

For example, a basic non-optimized diff algorithm might operate on has-a 'scanner, and use 'car and 'cdr operations. However a string-scanner is really just a wrapper around a string and an index into the string, as well as the string's length. Each 'string-scanner object contains three slots: one for the string, one for the index, and one for the length. This applies to each 'cdr on a string-scanner.

Now suppose we have a version of the diff algo which specifically detects if an object is-a string-scanner. It destructures the string-scanner into the string, index, and end, and instead of carrying around a triple of (string, index, length) it only carries the index, leaving string and length into local variables. This reduces memory consumption to only one-third.

(Note that the diff algo I posted a while back actually keeps entire sections of the list, in order to properly scan through their differences; that is, it keeps several scanners)

-----

5 points by sacado 6657 days ago | link

Hmm, I think there are 2 really different concepts here :

- type declaration of real-implementation (what you call "is-a"),

- type declaration in the sense of "capabilities" an object has (what you call has-a).

I think they should really be distinguished. The former is about optimized compilation, the latter about which functions can be applied to a given object.

But optimization is linked to variables (e.g. "in this block n always holds an integer, s always holds a string and l is always a cons) and does not need to be declared until you want to compile something.

On the opposite, capabilities are linked to values (e.g., "n, s and l are all scanners, they all have scanner capabilities, you can apply car and cdr to all of them. This is currently true, but could change if values referenced by n, s or l change). These are mandatory, and have to be known dynamically (this is not a declaration in the static meaning, they can even change later). When you apply car to a variable, you must know if its attached value can answer it (and eventually how).

  (= str (string-scanner "foo bar baz"))
  (type str)
  -> (scanner string)

  (def scan (s)
    (if (no s)
      ""
      (cons (foo (car s)) (cdr s))))

There, the values held by s are considered as a scanner and a string, that is, car and cdr can be applied to them. A dispatch algorithm is applied to them on the moment we need it. If, at any moment, an object held by s cannot be applied the method car or cdr, we have an error. Until we want more speed, that's enough.

Now suppose we want more. All we have to do is :

  (def scan (s)
    (istype string-scanner s
      (if (no s)
        (cons (foo (car s) (cdr s)))))

That way, for optimization purpose, we state that s only holds string-scanner objects. It does not even have to care with the annotations you added to the value (or values) held by s. If an object held by s is not really a string-scanner, well, anything could happen.

I might be wrong, but I think super-optimizing CL compilers work that way. You say them "optimize that function, and btw, this var always holds strings, don't even bother checking and dispatching the right function".

-----

4 points by almkglor 6657 days ago | link

Quite accurate. The main thing is that I think people should use has-a for everyday programming, and only use is-a if absolutely necessary, e.g. optimization.

My second proposal, probably lost somewhere in the confusion, is that has-a information would be connected to an object's is-a type.

-----

3 points by Jesin 6657 days ago | link

This could be part of a temporary solution:

  (def my-isa (obj typ)
    (let a type.obj
      (if atom.a
          (is a typ)
          (some [is _ typ] a)))

-----

3 points by sacado 6656 days ago | link

Yes, but then it does not work with the core functions (notably each & friends), as they are relying on isa. Renaming your function isa does not work either (that would be too easy) : atom and some call isa themselves, so you get in an infinite loop. Btw, obj is a macro (at least in Anarki, I don't know if it's present in the official Arc2), so your code has a red flag on it (although it does seem to work).

I tried this, it does work :

  (redef isa (x typ)
    (isa-rec type.x typ))

  (def isa-rec (types typ)
    (if
      (no types) nil
      (is types typ) t
      (isnt type.types 'cons) nil
      (is (car types) typ) t
      (isa-rec (cdr types) typ)))

  (isa nil 'sym) -> t
  (isa (annotate '(sym list) nil) 'sym) -> t
  (isa (annotate '(sym list) nil) 'list) -> t
  (isa (annotate '(sym list) nil) 'cons) -> nil

And now the funny part :

  (redef car (x)
    (if (isa x 'int)
      (if (> rep.x 0)
        'a
        nil)
      (old x)))

  (redef cdr (x)
    (if (isa x 'int)
      (if (> rep.x 0)
        (annotate type.x (- rep.x 1))
        nil)
      (old x)))

Both car and cdr now work on objects of type 'int (and objects of type 'cons, as before). If an object is both an int and a cons, its int being is taken into consideration. Every operation requiring 'cons cells or calling car and cdr can now be overridden to use ints instead (an int being a list whose car is the symbol 'a and whose cdr is that num - 1).

  (car 1)
  -> a
  (cdr 1)
  -> 0
  (cddr 2)
  -> 0
  (car (annotate '(int cons)) 3)
  -> a
  (len (annotate '(int cons)) 3)
  -> 3

Very funny things to do from there, but watch your fingers.

-----

2 points by sacado 6656 days ago | link

Hmm, Playing with this stuff, I ran into that :

  (each c (annotate '(int cons) 3) (prn c))
  -> Error: "Function call on inappropriate object #3(tagged (int cons . nil) 3) (0)"

The problem is that each calls acons and alist, but they are defined in terms of (is (type x) 'cons) instead of (isa x 'cons). Once you redefine them, it works fine.

pg, do you still accept suggestions about the core functions ? Shouldn't acons and alist be defined with isa instead of is ? That would let us redefine them more easily. Btw, what do you think of all these discussions about types ?

-----